
Journal of Multivariate Analysis 100 (2009) 2254–2269


Estimation of a change-point in the mean function of functional data


Alexander Aue a,∗, Robertas Gabrys b, Lajos Horváth c, Piotr Kokoszka b

a Department of Statistics, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA
b Department of Mathematics and Statistics, Utah State University, 3900 Old Main Hill, Logan, UT 84322-3900, USA
c Department of Mathematics, University of Utah, 155 South 1440 East, Salt Lake City, UT 84112-0090, USA

∗ Corresponding author. E-mail addresses: [email protected] (A. Aue), [email protected] (R. Gabrys), [email protected] (L. Horváth), [email protected] (P. Kokoszka).

Research partially supported by NSF grants DMS-0413653, DMS-0604670, DMS-0652420 and DMS-0804165.

doi:10.1016/j.jmva.2009.04.001

Article history:
Received 8 April 2008
Available online 14 April 2009

AMS 2000 subject classifications:
primary 62H25, 62G05
secondary 62G20

Keywords:
Change-point estimation
Mean function
Functional data analysis

Abstract: The paper develops a comprehensive asymptotic theory for the estimation of a change-point in the mean function of functional observations. We consider both the case of a constant change size and the case of a change whose size approaches zero as the sample size tends to infinity. We show how the limit distribution of a suitably defined change-point estimator depends on the size and location of the change. The theoretical insights are confirmed by a simulation study which illustrates the behavior of the estimator in finite samples.

© 2009 Elsevier Inc. All rights reserved.

1. Introduction

Functional data analysis (FDA) has been enjoying increased attention over the last decade due to its applicability to
problems which are difficult to cast into a framework of scalar or vector observations. Even if such standard approaches
are available, the functional approach often leads to a more natural and parsimonious description of the data, and to more
accurate inference and prediction results, see, for example, [1–10]. Both inferential and exploratory tools of FDA can however
be severely biased if the stochastic structure of the data changes at some unknown point within the sample. In the scalar
context, this issue has received considerable attention, see [11–17], among many others.
The most important change that can occur in the functional context is the change of the mean function. This paper
investigates large sample properties of an estimator of such a change-point. We consider both the case of a fixed size change
and a contiguous change whose size approaches zero as the sample size increases. Specifically, we assume that the functional
observations X1 , . . . , Xn are defined on a compact set T and follow the model

$$X_i = \mu + \Delta\, I\{i > k^*\} + Y_i, \qquad i = 1, \dots, n, \tag{1.1}$$

where $\mu$ and $\Delta \ne 0$ are unknown, square integrable and deterministic functions over $T$, and $Y_1, \dots, Y_n$ are independent, identically distributed zero mean random elements of $L^2(T)$ with covariance function

$$K(s,t) = E[Y_1(s)Y_1(t)], \qquad s, t \in T,$$


satisfying $E\|Y_1\|^2 = \int_T E[Y_1^2(t)]\,dt < \infty$. The unknown integer $k^* \in \{1, \dots, n\}$ is called the change-point. We assume that

$$k^* = \lfloor \theta n \rfloor \quad \text{with some fixed } \theta \in (0, 1]. \tag{1.2}$$
Model (1.1) describes a sequence of functional observations which suffer from a mean change if $k^* < n$ or, equivalently, if $\theta < 1$. The corresponding hypothesis testing problem

$$H_0\colon k^* = n \qquad \text{vs.} \qquad H_A\colon k^* < n$$
has been addressed in [18]. To explain their results and present our contribution, we must state several consequences of the
assumptions made so far. First, Mercer’s theorem (see Chapter 4 of [19]) implies that, under the null hypothesis, there is a
spectral decomposition for the covariance operator K (s, t ), namely

$$K(s,t) = \sum_{\ell=1}^{\infty} \lambda_\ell\, \varphi_\ell(s)\varphi_\ell(t), \qquad s, t \in T,$$
where $\lambda_\ell$ and $\varphi_\ell$ denote the eigenvalues and eigenfunctions of $K(s,t)$, respectively. These can be obtained as the solutions of the equation system $\int_T K(s,t)\varphi_\ell(t)\,dt = \lambda_\ell \varphi_\ell(s)$ with $s, t \in T$. Since the eigenfunctions form a complete orthonormal basis in $L^2(T)$ and all eigenvalues of $K(s,t)$ are non-negative, they lead to the Karhunen–Loève representation (in $L^2(T)$, not pointwise in $t \in T$)
$$Y_i(t) = \sum_{\ell=1}^{\infty} \sqrt{\lambda_\ell}\, \rho_{i,\ell}\, \varphi_\ell(t), \qquad t \in T,\ i = 1, \dots, n,$$

where $\sqrt{\lambda_\ell}\,\rho_{i,\ell} = \int_T Y_i(t)\varphi_\ell(t)\,dt$ is called the $\ell$th functional principal component score. It is also implied that the sequences $(\rho_{i,\ell})_{\ell \ge 1}$ consist of uncorrelated random variables with zero mean and unit variance and that, for $i \ne j$, $(\rho_{i,\ell})_{\ell \ge 1}$ and $(\rho_{j,\ell})_{\ell \ge 1}$ are independent.
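To make the setup concrete, the following sketch simulates a sample from model (1.1) by truncating the Karhunen–Loève expansion. It is an illustration only: the grid, the truncation level L, and the use of the Brownian-motion eigenpairs $\lambda_\ell = [(\ell - 1/2)\pi]^{-2}$, $\varphi_\ell(t) = \sqrt{2}\sin((\ell - 1/2)\pi t)$ are our own choices, not prescribed by the paper.

```python
import numpy as np

def simulate_sample(n, kstar, delta, t_grid, L=50, rng=None):
    """Simulate X_1, ..., X_n from model (1.1) on a grid over [0, 1].

    The noise Y_i is a truncated Karhunen-Loeve expansion with standard
    normal scores rho_{i,l}; Brownian-motion eigenpairs are an
    illustrative choice of (lambda_l, phi_l).
    """
    rng = np.random.default_rng(rng)
    ls = np.arange(1, L + 1)
    lam = 1.0 / (((ls - 0.5) * np.pi) ** 2)                  # eigenvalues
    phi = np.sqrt(2.0) * np.sin(np.outer(t_grid, (ls - 0.5) * np.pi))
    rho = rng.standard_normal((n, L))                        # iid scores
    Y = rho * np.sqrt(lam) @ phi.T                           # (n, grid)
    X = Y.copy()
    X[kstar:] += delta                                       # mean change for i > k*
    return X

t = np.linspace(0.0, 1.0, 200)
X = simulate_sample(n=100, kstar=50, delta=0.5 * t, t_grid=t, rng=1)
```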
For the statistical analysis, the population eigenvalues and eigenfunctions have to be replaced by their estimated versions.
These are based on the estimated covariance operator
$$\hat K(s,t) = \frac{1}{n}\sum_{i=1}^{n} [X_i(s) - \bar X_n(s)][X_i(t) - \bar X_n(t)], \tag{1.3}$$

where $\bar X_n = n^{-1}(X_1 + \cdots + X_n)$. From this, estimated eigenvalues $\hat\lambda_\ell$ and eigenfunctions $\hat\varphi_\ell$ can then be derived as the solutions of the equations

$$\int_T \hat K(s,t)\hat\varphi_\ell(t)\,dt = \hat\lambda_\ell \hat\varphi_\ell(s).$$
We make the assumption that, for some fixed d > 0,
λ1 > λ2 > · · · > λd > λd+1 ≥ 0, (1.4)
which together with the assumption of finite fourth moment of the Yi guarantees that the estimated and population
eigenvalues and eigenfunctions are sufficiently close under H0 , see Chapter 4 of [20] and [21].
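On a grid, the empirical covariance operator (1.3) becomes a matrix and the integral eigen-equation a matrix eigenproblem. A minimal sketch, assuming a uniform grid and Riemann-sum quadrature (both our own simplifications):

```python
import numpy as np

def estimate_eigensystem(X, t_grid, d):
    """Empirical covariance (1.3) and its leading d eigenpairs on a grid.

    With grid spacing h, the equation K-hat(phi) = lambda * phi turns into
    the symmetric eigenproblem (K_hat * h) v = lambda v; eigenvectors are
    rescaled so that the eigenfunctions have unit L2 norm.
    """
    h = t_grid[1] - t_grid[0]
    Xc = X - X.mean(axis=0)                    # center at the sample mean
    K_hat = Xc.T @ Xc / X.shape[0]             # (grid, grid)
    w, v = np.linalg.eigh(K_hat * h)           # ascending eigenvalues
    lam_hat = w[::-1][:d]                      # hat-lambda_1 >= ... >= hat-lambda_d
    phi_hat = v[:, ::-1][:, :d] / np.sqrt(h)   # columns: hat-phi_1, ..., hat-phi_d
    return lam_hat, phi_hat
```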
The hypothesis test for $H_0$ versus $H_A$ in [18] is based on the projection of the functions $\bar X_{\lfloor nx \rfloor} - \bar X_n$, $x \in (0,1)$, on the space spanned by the first $d$ estimated eigenfunctions $\hat\varphi_1, \dots, \hat\varphi_d$. The corresponding estimated scores are

$$\hat\eta_{i,\ell} = \int_T [X_i(t) - \bar X_n(t)]\hat\varphi_\ell(t)\,dt.$$
Berkes et al. [18] introduced and motivated the test statistic
$$S_{n,d} = \frac{1}{n^2} \sum_{\ell=1}^{d} \frac{1}{\hat\lambda_\ell} \sum_{k=1}^{n} \left( \sum_{i=1}^{k} \hat\eta_{i,\ell} - \frac{k}{n}\sum_{i=1}^{n} \hat\eta_{i,\ell} \right)^2$$

and established its limit distribution under the null hypothesis, as well as its consistency under the alternative. For the
convenience of the reader, these results are stated as a theorem.

Theorem 1.1. Let $E\|Y_1\|^4 < \infty$. Then, it holds under $H_0$ that

$$S_{n,d} \xrightarrow{\mathcal D} \sum_{\ell=1}^{d} \int_0^1 B_\ell^2(x)\,dx \qquad (n \to \infty),$$

where $\xrightarrow{\mathcal D}$ indicates convergence in distribution and $(B_\ell(x)\colon x \in [0,1])$, $1 \le \ell \le d$, denote independent standard Brownian bridges. If $\Delta$ is not orthogonal to the subspace spanned by the eigenfunctions $\varphi_1, \dots, \varphi_d$, then it holds under $H_A$ that $S_{n,d} \xrightarrow{P} \infty$ as $n \to \infty$.
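For gridded data the statistic is a weighted sum of squared CUSUMs of the estimated scores. A sketch, reusing the hypothetical estimate_eigensystem helper from above; the integral over $x \in (0,1)$ implicit in $S_{n,d}$ is replaced by the average over $k = 1, \dots, n$:

```python
import numpy as np

def change_point_statistic(X, t_grid, d):
    """Approximate S_{n,d} of Berkes et al. [18] for data on a grid."""
    n = X.shape[0]
    h = t_grid[1] - t_grid[0]
    lam_hat, phi_hat = estimate_eigensystem(X, t_grid, d)
    eta = (X - X.mean(axis=0)) @ phi_hat * h       # scores eta-hat_{i,l}
    cusum = np.cumsum(eta, axis=0)                 # partial sums over i <= k
    frac = np.arange(1, n + 1)[:, None] / n
    bridge = cusum - frac * cusum[-1]              # CUSUM minus (k/n) * total
    return (bridge ** 2 / lam_hat).sum() / n ** 2
```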

While the theorem guarantees in its second part that $S_{n,d}$ will eventually detect a change given sufficiently many observations, it contains no information on how to locate the change-point or on the distributional properties of an appropriate estimator. The main aim of the present paper is therefore to introduce an estimator $\hat k_n^*$ for $k^*$ and to
derive its limit distribution under different assumptions on the function ∆ which determines the type of change. This will be
done in Section 2. In Section 3, we evaluate the finite sample behavior via a small simulation study. All proofs are relegated
to Section 4.

2. Change-point estimator and its limit distribution

It is assumed throughout this section that the alternative hypothesis $H_A$ holds true. Letting $x^T$ denote the transpose of a vector $x$, define $\hat{\boldsymbol\eta}_i = (\hat\eta_{i,1}, \dots, \hat\eta_{i,d})^T$ and the diagonal matrix $\hat\Sigma = \mathrm{diag}(\hat\lambda_\ell\colon \ell = 1, \dots, d)$. Introducing the quantities

$$\hat\kappa_n(k) = \sum_{i=1}^{k} \hat{\boldsymbol\eta}_i - \frac{k}{n}\sum_{i=1}^{n} \hat{\boldsymbol\eta}_i$$

and the quadratic forms

$$\hat Q_n(k) = \frac{1}{n}\, \hat\kappa_n(k)^T \hat\Sigma^{-1} \hat\kappa_n(k),$$

a suitable estimator for $k^*$ is given by

$$\hat k_n^* = \min\Big\{ k\colon \hat Q_n(k) = \max_{1 \le j \le n} \hat Q_n(j) \Big\}. \tag{2.1}$$
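In code, (2.1) amounts to maximizing a weighted squared CUSUM of the score vectors. A sketch under the same gridding assumptions as before; note that np.argmax returns the first maximizer, which matches the minimum in (2.1):

```python
import numpy as np

def estimate_change_point(X, t_grid, d):
    """Change-point estimator (2.1) for gridded functional data."""
    n = X.shape[0]
    h = t_grid[1] - t_grid[0]
    lam_hat, phi_hat = estimate_eigensystem(X, t_grid, d)
    eta = (X - X.mean(axis=0)) @ phi_hat * h       # eta-hat_i, one row per observation
    cusum = np.cumsum(eta, axis=0)
    frac = np.arange(1, n + 1)[:, None] / n
    kappa = cusum - frac * cusum[-1]               # kappa-hat_n(k), k = 1, ..., n
    Q = (kappa ** 2 / lam_hat).sum(axis=1) / n     # Q-hat_n(k)
    return int(np.argmax(Q)) + 1                   # first maximizer, 1-based

k_hat = estimate_change_point(X, t, d=3)           # X, t from the earlier sketch
```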

With this procedure, we select as change-point the time $k$ that maximizes the random quadratic form $\hat Q_n(k)$, which is directly linked to the test statistic $S_{n,d}$ from the previous section via the equality $S_{n,d} = \int_0^1 \hat Q_n(\lfloor nx \rfloor)\,dx$. Because $\hat Q_n(k)$ lives on the subspace spanned by the first $d$ estimated eigenfunctions $\hat\varphi_1, \dots, \hat\varphi_d$ of the covariance operator $\hat K(s,t)$, we need to determine the behavior of $\hat K(s,t)$ under $H_A$. Due to the additional $\Delta$ appearing after the change-point $k^*$, it cannot be expected that $\hat K(s,t)$ provides an estimator for $K(s,t)$ anymore. Indeed, the following holds true instead. If we let
$$K_A(s,t) = K(s,t) + \theta(1-\theta)\Delta(t)\Delta(s), \qquad s, t \in T,$$

then $K_A(s,t)$ is symmetric, square integrable and positive-definite, so it admits a representation

$$K_A(s,t) = \sum_{\ell=1}^{\infty} \gamma_\ell\, \psi_\ell(s)\psi_\ell(t)$$

with eigenfunctions $\psi_\ell$ and eigenvalues $\gamma_\ell$ obtained from solving the system $\int_T K_A(s,t)\psi_\ell(t)\,dt = \gamma_\ell \psi_\ell(s)$. The relation between the pairs $(\gamma_\ell, \psi_\ell)$ and $(\hat\lambda_\ell, \hat\varphi_\ell)$ is established in Proposition 2.1, whose proof is given in [18].

Proposition 2.1. Under $H_A$ it holds that, for all $1 \le \ell \le d$,

(i) $|\hat\lambda_\ell - \gamma_\ell| = o_P(1)$ as $n \to \infty$ and
(ii) $\|\hat\varphi_\ell - \hat c_\ell \psi_\ell\| = o_P(1)$ as $n \to \infty$,

where $\hat c_\ell = \mathrm{sign}\big( \int_T \psi_\ell(t)\hat\varphi_\ell(t)\,dt \big)$.

The proposition identifies $\gamma_\ell$ and $\psi_\ell$ (up to a sign) as the stochastic limits of their estimated versions $\hat\lambda_\ell$ and $\hat\varphi_\ell$. As a consequence, it implies that the limit distribution of $\hat k_n^*$ depends on the behavior of the projection of $\Delta$ on the subspace spanned by the eigenfunctions $\psi_1, \dots, \psi_d$. For $1 \le \ell \le d$, denote by

$$\zeta_{i,\ell} = \sqrt{\gamma_\ell}\,\xi_{i,\ell} = \int_T Y_i(t)\psi_\ell(t)\,dt \qquad \text{and} \qquad \beta_\ell = \sqrt{\gamma_\ell}\,\delta_\ell = \int_T \Delta(t)\psi_\ell(t)\,dt$$

the principal component scores and set

$$\boldsymbol\zeta_i = (\zeta_{i,1}, \dots, \zeta_{i,d})^T, \qquad \boldsymbol\xi_i = (\xi_{i,1}, \dots, \xi_{i,d})^T, \qquad \boldsymbol\delta = (\delta_1, \dots, \delta_d)^T.$$
We distinguish two cases,

$$\boldsymbol\delta \ne 0 \text{ is constant} \tag{2.2}$$

and

$$\boldsymbol\delta = \boldsymbol\delta_n \ne 0 \text{ such that } \|\boldsymbol\delta_n\|_2 \to 0 \quad (n \to \infty), \tag{2.3}$$

where $\|\cdot\|_2$ denotes the Euclidean norm on $\mathbb{R}^d$. Assumptions (2.2) and (2.3) reflect two common approaches to deriving an asymptotic distribution of change-point estimators, see for example [22] and the references therein.
We first state the result for the case (2.2).

Theorem 2.1. Let $E\|Y_1\|^4 < \infty$. If $\boldsymbol\delta \ne 0$ is constant, then it holds under $H_A$ that

$$\hat k_n^* - k^* \xrightarrow{\mathcal D} \min\Big\{ k\colon P(k) = \sup_j P(j) \Big\} \qquad (n \to \infty),$$

where

$$P(k) = \begin{cases} (1-\theta)\|\boldsymbol\delta\|_2^2\, k + \boldsymbol\delta^T S_k & \text{if } k < 0, \\ 0 & \text{if } k = 0, \\ -\theta\|\boldsymbol\delta\|_2^2\, k + \boldsymbol\delta^T S_k & \text{if } k > 0, \end{cases}$$

with $S_k$ defined by

$$S_k = \sum_{i=1}^{k} \boldsymbol\xi_i + \sum_{i=k}^{-1} \boldsymbol\xi_i, \qquad -\infty < k < \infty.$$

Here $(\boldsymbol\xi_{-i})$ denotes an independent copy of $(\boldsymbol\xi_i)$ and, as usual, an empty sum is set to equal zero.

Since δ does not vary with the number of observations, it appears naturally also in the limit variable, which is given as
the argument of the maximum of a two-sided sequence of random variables with drift.
A corresponding result holds true for the case (2.3). It is stated next.

Theorem 2.2. Let $E\|Y_1\|^4 < \infty$. If $\boldsymbol\delta = \boldsymbol\delta_n \ne 0$ is such that

$$\|\boldsymbol\delta_n\|_2 \to 0, \quad \text{but} \quad n\|\boldsymbol\delta_n\|_2^2 \to \infty \qquad (n \to \infty),$$

then it holds under $H_A$ that

$$\|\boldsymbol\delta_n\|_2^2 \big( \hat k_n^* - k^* \big) \xrightarrow{\mathcal D} \min\Big\{ t\colon V(t) = \sup_s V(s) \Big\} \qquad (n \to \infty),$$

where

$$V(t) = \begin{cases} (1-\theta)t + W(t) & \text{if } t \le 0, \\ -\theta t + W(t) & \text{if } t > 0, \end{cases}$$

with $(W(t)\colon -\infty < t < \infty)$ denoting a two-sided standard Brownian motion.

Note that the limit processes P (k) and V (t ) contain drift terms which attain their maximum at 0, and whose slope on
the negative and positive half line is determined by the location θ of the change-point. If θ = 1/2, then the drift parts are
symmetric, while the change-point detection becomes significantly harder if θ is close to 0 (or 1). In these cases, the slope of
the drift for positive (or negative) arguments is close to zero. In the case of Theorem 2.1, the constant order of magnitude of
kδk2 also plays a role, with larger changes naturally being more easily identifiable. Theorems 2.1 and 2.2 thus provide clear
theoretical justification of the empirical properties discussed in Section 3.
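The limit variable of Theorem 2.2 is easy to approximate by Monte Carlo: simulate $V(t)$ on a fine grid over a truncated interval $[-M, M]$ and record the location of its maximum. The truncation $M$ and the step size below are our own tuning choices; the sketch merely visualizes how the argmax spreads out as $\theta$ moves away from $1/2$.

```python
import numpy as np

def sample_argmax_V(theta, M=50.0, step=0.01, rng=None):
    """One draw of argmax of V(t) on [-M, M] (limit law of Theorem 2.2)."""
    rng = np.random.default_rng(rng)
    m = int(M / step)
    inc = np.sqrt(step) * rng.standard_normal(2 * m)   # BM increments
    t = np.concatenate([-step * np.arange(m, 0, -1), [0.0],
                        step * np.arange(1, m + 1)])
    W_neg = np.cumsum(inc[:m])[::-1]                   # independent BM for t < 0
    W_pos = np.cumsum(inc[m:])                         # BM for t > 0
    W = np.concatenate([W_neg, [0.0], W_pos])          # two-sided BM, W(0) = 0
    drift = np.where(t <= 0, (1.0 - theta) * t, -theta * t)
    return t[np.argmax(drift + W)]

draws = np.array([sample_argmax_V(theta=0.25, rng=i) for i in range(1000)])
```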
It is possible to develop a feel for the size of the function $\Delta = \Delta_n$ which implies the assumptions of Theorem 2.2. If $\|\Delta_n\| \to 0$, then $\|\hat K - K\| \to 0$, so by inequalities (4.38) and (4.44) of [20], $\|\hat\varphi_\ell - \hat c_\ell \varphi_\ell\| \to 0$ and $\hat\lambda_\ell \to \lambda_\ell$ in probability. In view of Proposition 2.1, we have that eigenvalues and eigenfunctions under $H_0$ and $H_A$ coincide in the limit. This means that $\delta_{n,\ell} \approx c_{n,\ell}\, \lambda_\ell^{-1/2} \int_T \Delta_n(t)\varphi_\ell(t)\,dt$ and so $\|\boldsymbol\delta_n\|_2^2 \approx \sum_{\ell=1}^{d} \lambda_\ell^{-1} \big( \int_T \Delta_n(t)\varphi_\ell(t)\,dt \big)^2$. Thus, by the Cauchy–Schwarz inequality, $\|\Delta_n\| \to 0$ implies $\|\boldsymbol\delta_n\|_2 \to 0$. A sufficient condition for $n\|\boldsymbol\delta_n\|_2^2 \to \infty$ cannot be stated as easily, but it is roughly $n\|\Delta_n\|^2 \to \infty$ because, by Parseval's inequality, $\int_T \Delta_n^2(t)\,dt \approx \sum_{\ell=1}^{d} \big( \int_T \Delta_n(t)\varphi_\ell(t)\,dt \big)^2$. These approximate calculations could be formalized, but our goal is merely to indicate that Theorem 2.2 holds if $\|\Delta_n\|$ tends to zero at a rate slower than $n^{-1/2}$. For instance, the functions $\Delta_n = f\, n^{\alpha}/\sqrt n$ with fixed $f$ and $\alpha \in (0, 1/2)$ used in Section 3 satisfy $\|\Delta_n\| \to 0$ and $n\|\Delta_n\|^2 = n^{2\alpha}\|f\|^2 \to \infty$.
Finally, we discuss the consistency of the estimator. Observe that we have assumed in (2.2) and (2.3) that $\boldsymbol\delta \ne 0$. This means that there exists $1 \le \ell \le d$ such that $\int_T \Delta(t)\psi_\ell(t)\,dt \ne 0$. If instead the change function $\Delta$ is orthogonal to $\psi_1, \dots, \psi_d$, that is if

$$\int_0^1 \Delta(t)\psi_\ell(t)\,dt = 0 \qquad \text{for all } \ell = 1, \dots, d,$$

then $\hat k_n^*$ cannot be a consistent estimator of $k^*$, since the principal component analysis has been performed in an eigenspace of dimension too small to capture the change. On the other hand, see e.g. Chapter 8 of [23], using a large $d$ is not practical because it bears the difficulty of interpreting a multitude of principal components. Moreover, since for large $\ell$ the eigenvalues $\lambda_\ell$ are generally very small (the $\lambda_\ell$ are arranged in decreasing order), the corresponding $\psi_\ell$ explains only a very small part of the variability of the data. Therefore the impact of a change occurring in a subspace spanned by the $\psi_\ell$ with large $\ell$ is small, and its detection less crucial.

Table 1
Summary statistics for the change-point estimator. The change-point processes were generated by combining BB and t + BB for three different locations of the change-point τ∗. We used d = 2 and d = 3 (in parentheses).
τ∗ Average (τ̂ ) Bias (τ̂ ) Median (τ̂ ) RMSE (τ̂ ) MAE (τ̂ )

n = 60
0.25 0.27 (0.26) 0.0152 (0.0107) 0.25 (0.25) 0.0336 (0.0252) 0.0158 (0.0108)
0.50 0.50 (0.50) 0.0002 (−0.0003) 0.50 (0.50) 0.0108 (0.0058) 0.0038 (0.0018)
0.75 0.73 (0.74) −0.0152 (−0.0087) 0.75 (0.75) 0.0356 (0.0205) 0.0157 (0.0088)
n = 100
0.25 0.26 (0.26) 0.0096 (0.0052) 0.25 (0.25) 0.0220 (0.0122) 0.0101 (0.0053)
0.50 0.50 (0.50) 0.0002 (0.0000) 0.50 (0.50) 0.0063 (0.0039) 0.0024 (0.0011)
0.75 0.74 (0.74) −0.0096 (−0.0052) 0.75 (0.75) 0.0215 (0.0155) 0.0100 (0.0063)
n = 140
0.25 0.26 (0.25) 0.0062 (0.0039) 0.25 (0.25) 0.0141 (0.0096) 0.0064 (0.0040)
0.50 0.50 (0.50) −0.0001 (−0.0001) 0.50 (0.50) 0.0043 (0.0027) 0.0017 (0.0007)
0.75 0.74 (0.75) −0.0071 (−0.0039) 0.75 (0.75) 0.0147 (0.0093) 0.0068 (0.0040)
n = 200
0.25 0.25 (0.25) 0.0046 (0.0030) 0.25 (0.25) 0.0107 (0.0070) 0.0050 (0.0031)
0.50 0.50 (0.50) 0.0001 (0.0000) 0.50 (0.50) 0.0033 (0.0016) 0.0013 (0.0005)
0.75 0.75 (0.75) −0.0050 (−0.0023) 0.75 (0.75) 0.0110 (0.0062) 0.0052 (0.0024)
n = 300
0.25 0.25 (0.25) 0.0030 (0.0018) 0.25 (0.25) 0.0066 (0.0047) 0.0032 (0.0019)
0.50 0.50 (0.50) 0.0000 (0.0001) 0.50 (0.50) 0.0021 (0.0012) 0.0008 (0.0004)
0.75 0.75 (0.75) −0.0032 (−0.0018) 0.75 (0.75) 0.0079 (0.0048) 0.0034 (0.0019)
n = 600
0.25 0.25 (0.25) 0.0015 (0.0007) 0.25 (0.25) 0.0036 (0.0019) 0.0016 (0.0008)
0.50 0.50 (0.50) 0.0000 (0.0000) 0.50 (0.50) 0.0010 (0.0006) 0.0004 (0.0002)
0.75 0.75 (0.75) −0.0015 (−0.0009) 0.75 (0.75) 0.0037 (0.0022) 0.0016 (0.0009)

3. Finite sample behavior

We carried out simulations to illustrate our theoretical results in finite samples. We simulated change-point processes
under the conditions of Theorems 2.1 and 2.2 for different sample sizes, and always used 1000 replications. For each
replication we estimated the location of a change-point k∗ . We generated functional observations according to (1.1). Without
loss of generality, µ was chosen to be equal to zero. Two different cases of Yi were considered, namely the trajectories of
the standard Brownian motion (BM), and the Brownian bridge (BB). The number d of the principal components was chosen
to be equal to 2 and 3 in order to explain at least 75% of variability. The properties of the sampling distributions of the
change-point estimator k̂∗n are now briefly discussed.
To illustrate the simulation results based on Theorem 2.1, we introduced the quantity $\tau_n^* = k_n^*/n$ and the corresponding estimator $\hat\tau_n^* = \hat k_n^*/n$. We concentrated on $\hat\tau_n^* - \tau^*$ rather than on $\hat k_n^* - k^*$ to show the effect of the increase in sample size more clearly. Various functions $\Delta$ were analyzed: $\Delta = t$, $t^2$, $\sqrt t$, $\exp(t)$, $\sin(t)$, and $\cos(t)$. To assess the accuracy of the estimator, the bias, root mean square error (RMSE), and mean absolute error (MAE) of $\hat\tau_n^*$ were computed. To conserve space, we do not display the whole set of tables we obtained, but rather display representative results in Table 1 and discuss general findings. From Table 1 we see that increasing the sample size yields a smaller bias, RMSE, and MAE. A similar pattern is observed when the number of principal components is increased. In all cases we considered, the summary statistics indicate that estimation is more accurate if the BB was used, even though the same number of principal components explains more variability for the BM. This is easy to understand because the BB is a ``smaller'' process in the sense that $E\|BB\|^2 = 1/6$ and $E\|BM\|^2 = 1/2$, so the same change function $\Delta$ is more pronounced if the $Y_i$ are the BB. As expected from the discussion following Theorem 2.2, the closer the change-point is to the middle of the sample, the better the estimator performs. For $\tau^*$ equal to 0.25 and 0.75, an increased bias is observed.
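One cell of this experiment can be reproduced along the following lines; this is a sketch that reuses the hypothetical estimate_change_point helper from Section 2, takes the closed-form BB eigenpairs given further below, and uses a reduced replication count:

```python
import numpy as np

# Sketch of one Table 1 cell: BB noise, Delta(t) = t, change at tau* = 0.5.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
n, kstar, d, reps = 100, 50, 2, 200
ls = np.arange(1, 51)
lam = 1.0 / (np.pi * ls) ** 2                          # BB eigenvalues
psi = np.sqrt(2.0) * np.sin(np.outer(t, np.pi * ls))   # BB eigenfunctions
tau_hat = np.empty(reps)
for r in range(reps):
    Y = rng.standard_normal((n, ls.size)) * np.sqrt(lam) @ psi.T
    X = Y + np.where(np.arange(n)[:, None] >= kstar, t, 0.0)
    tau_hat[r] = estimate_change_point(X, t, d) / n
print("bias:", tau_hat.mean() - 0.5,
      "RMSE:", np.sqrt(((tau_hat - 0.5) ** 2).mean()))
```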
Next we illustrate Theorem 2.2, which deals with nonconstant $\Delta$. We chose $\Delta = \Delta_n$ satisfying the conditions of Theorem 2.2 and carried out the change-point estimation. Several different forms of $\Delta_n$ were considered, namely $\sin(t)\,n^{\alpha}/\sqrt n$, $t\,n^{\alpha}/\sqrt n$, $\sqrt t\,n^{\alpha}/\sqrt n$, $\cos(t)\,n^{\alpha}/\sqrt n$, and $e^t\,n^{\alpha}/\sqrt n$, where $\alpha \in (0, 0.5)$. To illustrate Theorem 2.2, we concentrated on the distribution of $\|\boldsymbol\delta_n\|_2^2\big(\hat k_n^* - k^*\big)$. We
computed $\delta_\ell$ from $\sqrt{\gamma_\ell}\,\delta_\ell = \int_T \Delta(t)\psi_\ell(t)\,dt$, where, for $\ell = 1, \dots, d$,

$$\psi_\ell(t) = \sqrt{2}\,\sin\left(\frac{2\ell+1}{2}\,\pi t\right), \quad t \in [0,1], \qquad \text{and} \qquad \gamma_\ell = \frac{4}{[\pi(2\ell+1)]^2}$$

are the eigenfunctions and eigenvalues of the BM and

$$\psi_\ell(t) = \sqrt{2}\,\sin(\ell\pi t), \quad t \in [0,1], \qquad \text{and} \qquad \gamma_\ell = \frac{1}{[\pi\ell]^2}$$

are the corresponding eigenfunctions and eigenvalues of the BB.
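Numerically, the coordinates $\delta_\ell$ and the scaling $\|\boldsymbol\delta_n\|_2^2$ used in Figs. 1 and 2 follow from these closed forms by quadrature. A sketch for the BB case (the grid resolution is our choice; $\Delta_n = \sin(t)\,n^{\alpha}/\sqrt n$ is one of the change functions listed above):

```python
import numpy as np

def delta_coords_bb(delta_vals, t_grid, d):
    """delta_l = gamma_l^{-1/2} * integral of Delta * psi_l, BB eigenpairs."""
    h = t_grid[1] - t_grid[0]
    ls = np.arange(1, d + 1)
    gam = 1.0 / (np.pi * ls) ** 2
    psi = np.sqrt(2.0) * np.sin(np.outer(t_grid, np.pi * ls))
    beta = delta_vals @ psi * h            # beta_l = integral Delta psi_l dt
    return beta / np.sqrt(gam)

t = np.linspace(0.0, 1.0, 1000)
n, alpha = 600, 0.45
Delta_n = np.sin(t) * n ** alpha / np.sqrt(n)
scale = (delta_coords_bb(Delta_n, t, d=3) ** 2).sum()   # ||delta_n||_2^2
```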


As before, we chose $k_n^*$ to be the lower, middle and upper quartile of the sample size. The graphs of the estimated density of $\|\boldsymbol\delta_n\|_2^2\big(\hat k_n^* - k^*\big)$ are shown in Figs. 1 and 2. The densities are close to each other, as Theorem 2.2 implies that they must be close to the limit distribution. In most cases, convergence with increasing $n$ is also clearly visible; for example, in the top and middle panels of Fig. 2, the densities for $n = 600$ and $n = 900$ almost coincide. These properties hold for all choices of $\alpha \in (0, 0.5)$; Figs. 1 and 2 show the extreme cases $\alpha = 0.05$ and $\alpha = 0.45$.

[Fig. 1. Estimated density of $\|\boldsymbol\delta_n\|_2^2(\hat k_n^* - k^*)$ for the process obtained combining BM and $t\,n^{0.05}/\sqrt n$ + BM. Panels: change-point at $n/4$, $2n/4$ and $3n/4$; curves for $n = 100, 300, 600, 900$.]

[Fig. 2. Estimated density of $\|\boldsymbol\delta_n\|_2^2(\hat k_n^* - k^*)$ for the process obtained combining BB and $\sin(t)\,n^{0.45}/\sqrt n$ + BB. Same layout as Fig. 1.]

4. Proofs

The proof section is divided into three parts. In the first subsection, we derive a decomposition that will be used to prove Theorems 2.1 and 2.2; their proofs are then pursued in Sections 4.2 and 4.3, respectively.

4.1. Preliminary calculations

Let $\hat R_n(k) = \hat Q_n(k) - \hat Q_n(k^*)$. Since $\hat R_n(k)$ and the original $\hat Q_n(k)$ differ only by the value $\hat Q_n(k^*)$, which is independent of $k$, they attain their maximum at the same value of $k$. Consequently, we have

$$\hat k_n^* = \min\Big\{ k\colon \hat R_n(k) = \max_{1 \le j \le n} \hat R_n(j) \Big\}.$$

Denote by $\hat\zeta_{i,\ell} = \sqrt{\hat\lambda_\ell}\,\hat\xi_{i,\ell} = \int_T Y_i(t)\hat\varphi_\ell(t)\,dt$ and $\hat\beta_\ell = \sqrt{\hat\lambda_\ell}\,\hat\delta_\ell = \int_T \Delta(t)\hat\varphi_\ell(t)\,dt$ the counterparts of $\zeta_{i,\ell}$ and $\beta_\ell$ which are obtained by replacing the true eigenvalues and eigenfunctions with the estimated versions. Note that the quantities $\hat\zeta_{i,\ell}$, $\hat\xi_{i,\ell}$,
$\hat\beta_\ell$ and $\hat\delta_\ell$ are unobservable. The proofs to come will fall back on the following decomposition of $\hat R_n(k)$. First, we have for $1 \le k < k^*$ that

\begin{align*}
\hat R_n(k) &= \frac{1}{n}\sum_{\ell=1}^{d}\Bigg[\Big(\sum_{i=1}^{k}\hat\xi_{i,\ell} - \frac{k}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} - k\,\frac{n-k^*}{n}\,\hat\delta_\ell\Big)^2 - \Big(\sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} - k^*\,\frac{n-k^*}{n}\,\hat\delta_\ell\Big)^2\Bigg] \\
&= \frac{1}{n}\sum_{\ell=1}^{d}\Big(-\sum_{i=k+1}^{k^*}\hat\xi_{i,\ell} - \frac{k-k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} - (k-k^*)\frac{n-k^*}{n}\,\hat\delta_\ell\Big) \\
&\qquad\times\Big(\sum_{i=1}^{k}\hat\xi_{i,\ell} + \sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k+k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} - (k+k^*)\frac{n-k^*}{n}\,\hat\delta_\ell\Big) \\
&= \frac{1}{n}\sum_{\ell=1}^{d}\Big(\hat E^{(1)}_{k,\ell} + \hat D^{(1)}_{k,\ell}\Big)\Big(\hat E^{(2)}_{k,\ell} + \hat D^{(2)}_{k,\ell}\Big), \tag{4.1}
\end{align*}

where

$$\hat E^{(1)}_{k,\ell} = -\sum_{i=k+1}^{k^*}\hat\xi_{i,\ell} - \frac{k-k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell}, \qquad \hat D^{(1)}_{k,\ell} = -(k-k^*)\frac{n-k^*}{n}\,\hat\delta_\ell,$$

$$\hat E^{(2)}_{k,\ell} = \sum_{i=1}^{k}\hat\xi_{i,\ell} + \sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k+k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell}, \qquad \hat D^{(2)}_{k,\ell} = -(k+k^*)\frac{n-k^*}{n}\,\hat\delta_\ell.$$

A similar expression can be obtained if $k^* < k \le n$. Here it holds that

\begin{align*}
\hat R_n(k) &= \frac{1}{n}\sum_{\ell=1}^{d}\Bigg[\Big(\sum_{i=1}^{k}\hat\xi_{i,\ell} - \frac{k}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} - (n-k)\frac{k^*}{n}\,\hat\delta_\ell\Big)^2 - \Big(\sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} - (n-k^*)\frac{k^*}{n}\,\hat\delta_\ell\Big)^2\Bigg] \\
&= \frac{1}{n}\sum_{\ell=1}^{d}\Big(\sum_{i=k^*+1}^{k}\hat\xi_{i,\ell} - \frac{k-k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} + (k-k^*)\frac{k^*}{n}\,\hat\delta_\ell\Big) \\
&\qquad\times\Big(\sum_{i=1}^{k}\hat\xi_{i,\ell} + \sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k+k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} - (2n-k-k^*)\frac{k^*}{n}\,\hat\delta_\ell\Big) \\
&= \frac{1}{n}\sum_{\ell=1}^{d}\Big(\hat E^{(3)}_{k,\ell} + \hat D^{(3)}_{k,\ell}\Big)\Big(\hat E^{(4)}_{k,\ell} + \hat D^{(4)}_{k,\ell}\Big), \tag{4.2}
\end{align*}

where $\hat E^{(3)}_{k,\ell}$, $\hat E^{(4)}_{k,\ell}$ and $\hat D^{(3)}_{k,\ell}$, $\hat D^{(4)}_{k,\ell}$ are defined correspondingly. Using (4.1) and (4.2), we proceed with the proof of Theorem 2.1 in the next subsection. Since the arguments to be employed are symmetric for time lags before and after the change-point, detailed expositions will only be given for $1 \le k < k^*$.

4.2. Proof of Theorem 2.1

The proof is divided into two parts. At first, we show that the estimator k̂∗n will be close to k∗ by showing that R̂n (k) will
attain its maximum not too far from the change-point. In the second step, we will derive the limit distribution.

Lemma 4.1. Under the assumptions of Theorem 2.1, it holds that

$$\hat k_n^* - k^* = O_P(1) \qquad (n \to \infty).$$

Proof. To show the assertion of the lemma, we determine the behavior of those $k$ satisfying $1 \le k \le k^* - N$ or $k^* + N \le k \le n$ for some $N \ge 1$. Let $1 \le \ell \le d$. At first, we derive the order of magnitude of the estimated deterministic term in (4.1), that is, of $\frac1n \hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}$. To this end, note that

$$\max_{1\le k\le k^*-N} \frac{k+k^*}{n}\left(\frac{n-k^*}{n}\right)^2 \to 2\theta(1-\theta)^2 \qquad (n \to \infty).$$
In the next step, we shall replace $\hat\delta_\ell$ by $\delta_\ell$. To do so, observe that $\hat\beta_\ell = \sqrt{\hat\lambda_\ell}\,\hat\delta_\ell$ and $\beta_\ell = \sqrt{\gamma_\ell}\,\delta_\ell$ by definition. Moreover, part (ii) of Proposition 2.1 states that $\|\hat\varphi_\ell - \hat c_\ell\psi_\ell\| \to 0$ in probability. Therefore,

$$\hat\beta_\ell = \int_T \Delta(t)\hat\varphi_\ell(t)\,dt = \hat c_\ell \int_T \Delta(t)\psi_\ell(t)\,dt + o_P(1) = \hat c_\ell\,\beta_\ell + o_P(1) \qquad (n \to \infty),$$

using that $\Delta \in L^2(T)$. Consequently, $\hat\beta_\ell^2 = \beta_\ell^2 + o_P(1)$. Since the estimated eigenvalues $\hat\lambda_\ell$ converge in probability to $\gamma_\ell$ (see part (i) of Proposition 2.1), we arrive at

$$\hat\delta_\ell^2 = \delta_\ell^2 + o_P(1) \qquad (n \to \infty). \tag{4.3}$$


Combining the above arguments yields

$$\max_{1\le k\le k^*-N} \frac1n \hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell} = \max_{1\le k\le k^*-N} \frac{k+k^*}{n}\left(\frac{n-k^*}{n}\right)^2 (k-k^*)\,\delta_\ell^2 + o_P(1) = -2\,\delta_\ell^2\,\theta(1-\theta)^2 N + o_P(1).$$
It is shown in Appendix A that this deterministic part is the dominating term in (4.1). It thus follows that, for all $K > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\Big( \max_{1\le k\le k^*-N} \hat R_n(k) > -K \Big) = 0. \tag{4.4}$$

On the other hand, using (4.2), it can be proved in a similar fashion that

$$\max_{k^*+N\le k\le n} \frac1n \hat D^{(3)}_{k,\ell}\hat D^{(4)}_{k,\ell} = -2\,\delta_\ell^2\,\theta^2(1-\theta)N + o_P(1),$$

which implies that, for all $K > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\Big( \max_{k^*+N\le k\le n} \hat R_n(k) > -K \Big) = 0. \tag{4.5}$$

Eqs. (4.4) and (4.5) now yield that

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\Big( \big\{\hat k_n^* < k^* - N\big\} \cup \big\{\hat k_n^* > k^* + N\big\} \Big) = 0,$$

which consequently finishes the proof of the lemma. □



To derive the limit distribution, it suffices to investigate the asymptotic behavior of R̂n (k) for the range k∗ −N ≤ k ≤ k∗ +N
of those time lags close to the change-point. The result is presented as a lemma.

Lemma 4.2. Under the assumptions of Theorem 2.1, it holds that, for any $N \ge 1$,

$$\big( \hat R_n(k + k^*)\colon -N \le k \le N \big) \xrightarrow{\mathcal D} \big( 2\theta(1-\theta)P(k)\colon -N \le k \le N \big) \qquad (n \to \infty).$$

Proof. Let $1 \le \ell \le d$. Using (4.3), it is easy to see that, for any fixed $N \ge 1$ and as $n \to \infty$,

$$\max_{k^*-N\le k\le k^*} \left| \frac1n \hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell} - 2\theta(1-\theta)^2\delta_\ell^2\,(k-k^*) \right| \le \delta_\ell^2\, N \max_{k^*-N\le k\le k^*} \left| \frac{k+k^*}{n}\left(\frac{n-k^*}{n}\right)^2 - 2\theta(1-\theta)^2 \right| + o_P(1) = o_P(1).$$

By a similar argument,

$$\max_{k^*\le k\le k^*+N} \left| \frac1n \hat D^{(3)}_{k,\ell}\hat D^{(4)}_{k,\ell} + 2\theta^2(1-\theta)\delta_\ell^2\,(k-k^*) \right| = o_P(1) \qquad (n \to \infty).$$
In the following, we are dealing with the estimated random parts. The functional central limit theorem implies that, for all $x \in [0,1]$,

$$\frac{1}{\sqrt n} \sum_{i=1}^{\lfloor nx\rfloor} \boldsymbol\zeta_i \xrightarrow{d} \Gamma(x) \qquad (n \to \infty),$$

where $\xrightarrow{d}$ indicates weak convergence in the Skorohod space $D^d[0,1]$ and $(\Gamma(x)\colon x \in [0,1])$ is an $\mathbb{R}^d$-valued, zero mean stochastic process with covariance matrix $\Sigma$. Then,
\begin{align*}
\sup_{x\in(0,1)} \frac{1}{\sqrt n}\left| \sum_{i=1}^{\lfloor nx\rfloor} \zeta_{i,\ell} - \sum_{i=1}^{\lfloor nx\rfloor} \hat\zeta_{i,\ell} \right| &= \sup_{x\in(0,1)} \left| \int_T \frac{1}{\sqrt n} \sum_{i=1}^{\lfloor nx\rfloor} Y_i(t)\,\big(\hat c_\ell\psi_\ell(t) - \hat\varphi_\ell(t)\big)\,dt \right| \\
&\le \sup_{x\in(0,1)} \left[ \int_T \Big( \frac{1}{\sqrt n}\sum_{i=1}^{\lfloor nx\rfloor} Y_i(t) \Big)^2 dt \right]^{1/2} \left[ \int_T \big(\hat c_\ell\psi_\ell(t) - \hat\varphi_\ell(t)\big)^2\, dt \right]^{1/2} \\
&= o_P(1) \tag{4.6}
\end{align*}

by an application of Proposition 2.1. The same statement holds true also if $\xi_{i,\ell}$ and $\hat\xi_{i,\ell}$ are used in place of $\zeta_{i,\ell}$ and $\hat\zeta_{i,\ell}$.
Eqs. (4.3) and (4.6) imply now that

$$\max_{k^*-N\le k\le k^*} \left| \frac{k-k^*}{n}\Big(\sum_{i=1}^{n}\hat\xi_{i,\ell}\Big)\, \frac{k+k^*}{n}\,\frac{n-k^*}{n}\,\hat\delta_\ell \right| = \max_{k^*-N\le k\le k^*} \left| \frac{k-k^*}{n}\Big(\sum_{i=1}^{n}\xi_{i,\ell}\Big)\, \frac{k+k^*}{n}\,\frac{n-k^*}{n}\,\delta_\ell \right| + o_P(1) = O(1)\,\frac{N}{n} \left| \sum_{i=1}^{n} \int_T Y_i(t)\psi_\ell(t)\,dt \right| + o_P(1) = o_P(1).$$

Hence,

$$\max_{k^*-N\le k\le k^*} \left| \frac1n \hat E^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell} - 2\theta(1-\theta)\,\delta_\ell \sum_{i=k+1}^{k^*} \xi_{i,\ell} \right| = o_P(1)$$

as $n \to \infty$ for any $N \ge 1$, which follows from (4.3) and (4.6) as well. Similar arguments apply also to $\frac1n \hat E^{(3)}_{k,\ell}\hat D^{(4)}_{k,\ell}$, for which $k^* \le k \le k^* + N$ holds. In view of the definition of the limit process $P(k)$ in Theorem 2.1, it suffices to verify that the remaining terms in (4.1) and (4.2) do not contribute asymptotically. To this end, write
$$\max_{k^*-N\le k\le k^*} \left| \frac1n \hat E^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell} \right| = \max_{k^*-N\le k\le k^*} \frac1n \left| \left( \sum_{i=k+1}^{k^*}\hat\xi_{i,\ell} + \frac{k-k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} \right) \left( \sum_{i=1}^{k}\hat\xi_{i,\ell} + \sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k+k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} \right) \right|$$

$$\le \max_{k^*-N\le k\le k^*} \left| \sum_{i=k+1}^{k^*}\hat\xi_{i,\ell} + \frac{k-k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} \right| \; \max_{k^*-N\le k\le k^*} \frac1n \left| \sum_{i=1}^{k}\hat\xi_{i,\ell} + \sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k+k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} \right| = o_P(1).$$

Here, the first maximum is $O_P(1)$, since the first sum $\sum_{i=k+1}^{k^*}\hat\xi_{i,\ell}$ contains at most $N$ terms, while the second sum is $o_P(1)$ because of (4.6). Another application of (4.6) gives that the second maximum is $o_P(1)$. Moreover,
$$\max_{k^*-N\le k\le k^*} \left| \frac1n \hat D^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell} \right| = \max_{k^*-N\le k\le k^*} \frac1n \left| (k-k^*)\frac{n-k^*}{n}\,\hat\delta_\ell \left( \sum_{i=1}^{k}\hat\xi_{i,\ell} + \sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k+k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} \right) \right|$$

$$\le \max_{k^*-N\le k\le k^*} \left| (k-k^*)\frac{n-k^*}{n}\,\hat\delta_\ell \right| \; \max_{k^*-N\le k\le k^*} \frac1n \left| \sum_{i=1}^{k}\hat\xi_{i,\ell} + \sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k+k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} \right| = o_P(1).$$
The same arguments apply also to the remaining terms in (4.2), and the proof of the lemma is therefore complete. □

Proof of Theorem 2.1. The assertion follows immediately from Lemmas 4.1 and 4.2. □

4.3. Proof of Theorem 2.2

We follow the proof steps developed in the previous subsection.

Lemma 4.3. Under the assumptions of Theorem 2.2, it holds that

$$\|\boldsymbol\delta_n\|_2^2\big( \hat k_n^* - k^* \big) = O_P(1) \qquad (n \to \infty).$$

Proof. At first, we derive the order of magnitude of $\frac1n \hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}$ in (4.1). Let $N \ge 1$ and define $N_\delta = N\|\boldsymbol\delta_n\|_2^{-2}$. Recognizing that $n^{-1}N_\delta \to 0$, since by assumption $n\|\boldsymbol\delta_n\|_2^2 \to \infty$, it follows that

$$\max_{1\le k\le k^*-N_\delta} \frac{k+k^*}{n}\left(\frac{n-k^*}{n}\right)^2 = \frac{2k^*-N_\delta}{n}\left(\frac{n-k^*}{n}\right)^2 \to 2\theta(1-\theta)^2 \qquad (n \to \infty).$$

Consequently, (4.3) yields

\begin{align*}
\max_{1\le k\le k^*-N_\delta} \frac1n \sum_{\ell=1}^{d} \hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell} &= \max_{1\le k\le k^*-N_\delta} \frac{k+k^*}{n}\left(\frac{n-k^*}{n}\right)^2 (k-k^*) \sum_{\ell=1}^{d} \hat\delta_\ell^2 \\
&= \max_{1\le k\le k^*-N_\delta} \frac{k+k^*}{n}\left(\frac{n-k^*}{n}\right)^2 (k-k^*) \sum_{\ell=1}^{d} \delta_\ell^2 + o_P(1) \\
&= -2\theta(1-\theta)^2 N + o_P(1).
\end{align*}
It is shown in Appendix B that, under the assumptions of Theorem 2.2, this deterministic part is the dominating contributor in (4.1). It thus follows that, for all $K > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\Big( \max_{1\le k\le k^*-N_\delta} \hat R_n(k) > -K \Big) = 0. \tag{4.7}$$

Moreover, utilizing the decomposition in display (4.2), it can be proved similarly that

$$\max_{k^*+N_\delta\le k\le n} \frac1n \sum_{\ell=1}^{d} \hat D^{(3)}_{k,\ell}\hat D^{(4)}_{k,\ell} = -2\theta^2(1-\theta)N + o_P(1),$$

which implies that, for all $K > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\Big( \max_{k^*+N_\delta\le k\le n} \hat R_n(k) > -K \Big) = 0. \tag{4.8}$$

Eqs. (4.7) and (4.8) now yield that

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\Big( \big\{\hat k_n^* < k^* - N_\delta\big\} \cup \big\{\hat k_n^* > k^* + N_\delta\big\} \Big) = 0,$$

which, noticing the definition of $N_\delta$, completes the proof of the lemma. □



Lemma 4.4. Under the assumptions of Theorem 2.2, it holds that, for any $N \ge 1$,

$$\big( \hat R_n\big(k^* + \lfloor t\|\boldsymbol\delta_n\|_2^{-2}\rfloor\big)\colon t \in [-N,N] \big) \xrightarrow{d} \big( 2\theta(1-\theta)V(t)\colon t \in [-N,N] \big) \qquad (n \to \infty),$$

where $\xrightarrow{d}$ indicates weak convergence in the Skorohod space $D[-N,N]$.

Proof. Denote by $k$ the integer part of $t\|\boldsymbol\delta_n\|_2^{-2}$. Then, as $n \to \infty$,

$$\sup_{t\in[-N,0]} \left| \frac1n \sum_{\ell=1}^{d} \hat D^{(1)}_{k^*+k,\ell}\hat D^{(2)}_{k^*+k,\ell} - 2\theta(1-\theta)^2\, t \right| = O_P(1) \sup_{t\in[-N,0]} \Big| t - \|\boldsymbol\delta_n\|_2^2 \big\lfloor t\|\boldsymbol\delta_n\|_2^{-2}\big\rfloor \Big| + o_P(1) = o_P(1).$$

Similarly,

$$\sup_{t\in[0,N]} \left| \frac1n \sum_{\ell=1}^{d} \hat D^{(3)}_{k^*+k,\ell}\hat D^{(4)}_{k^*+k,\ell} + 2\theta^2(1-\theta)\, t \right| = o_P(1) \qquad (n \to \infty).$$

Note next that, after an application of (4.6) and the central limit theorem for partial sums,

$$\sup_{t\in[-N,0]} \frac{|t|}{n\|\boldsymbol\delta_n\|_2^2} \left| \sum_{\ell=1}^{d} \delta_\ell \sum_{i=1}^{n} \xi_{i,\ell} \right| = O_P(1)\, \frac{1}{n\|\boldsymbol\delta_n\|_2} \sum_{\ell=1}^{d} \left| \sum_{i=1}^{n} \xi_{i,\ell} \right| = O_P(1)\, \sqrt{\frac{1}{n\|\boldsymbol\delta_n\|_2^2}} = o_P(1)$$

by assumption on $\boldsymbol\delta_n$. It follows from the weak convergence of partial sums of random vectors that

$$\left( \|\boldsymbol\delta_n\|_2 \sum_{i=k^*+\lfloor t/\|\boldsymbol\delta_n\|_2^2\rfloor+1}^{k^*} \xi_{i,\ell}\colon t \in [-N,0],\ \ell = 1, \dots, d \right) \stackrel{\mathcal D}{=} \left( \|\boldsymbol\delta_n\|_2 \sum_{i=1}^{\lfloor -t/\|\boldsymbol\delta_n\|_2^2\rfloor} \xi_{i,\ell}\colon t \in [-N,0],\ \ell = 1, \dots, d \right) \xrightarrow{D[-N,0]} \big( W_\ell(-t)\colon t \in [-N,0],\ \ell = 1, \dots, d \big),$$

where $(W_\ell(t)\colon t \ge 0)$, $\ell = 1, \dots, d$, are independent standard Brownian motions. Through a check of the finite-dimensional distributions one can easily verify that

$$\left( \frac{1}{\|\boldsymbol\delta_n\|_2} \sum_{\ell=1}^{d} \delta_\ell\, W_\ell(t)\colon t \ge 0 \right)$$

is a standard Brownian motion. Similar arguments can be used also for $\|\boldsymbol\delta_n\|_2 \sum_{i=k^*+1}^{k^*+k+1} \xi_{i,\ell}$, $\ell = 1, \dots, d$. We note furthermore that

$$\left( \sum_{i=k^*+k+1}^{k^*} \xi_{i,\ell}\colon k < 0,\ \ell = 1, \dots, d \right) \qquad \text{and} \qquad \left( \sum_{i=k^*+1}^{k^*+k+1} \xi_{i,\ell}\colon k \ge 0,\ \ell = 1, \dots, d \right)$$
are independent. Thus, by the Skorohod–Dudley–Wichura representation theorem (see [24]), for each $n$ there are two independent Brownian motions $(W_n^{(1)}(t)\colon t \ge 0)$ and $(W_n^{(2)}(t)\colon t \ge 0)$ such that

\begin{align*}
&\sup_{t\in[-N,0]} \left| \frac1n \sum_{\ell=1}^{d} \hat E^{(1)}_{k+k^*,\ell}\hat D^{(2)}_{k+k^*,\ell} - 2\theta(1-\theta)W_n^{(1)}(-t) \right| \\
&\quad= \sup_{t\in[-N,0]} \left| \frac{2k^*+k}{n}\,\frac{n-k^*}{n} \sum_{\ell=1}^{d} \delta_\ell \left( \sum_{i=k^*+k+1}^{k^*} \xi_{i,\ell} + \frac{t}{n\|\boldsymbol\delta_n\|_2^2} \sum_{i=1}^{n} \xi_{i,\ell} \right) - 2\theta(1-\theta)W_n^{(1)}(-t) \right| + o_P(1) \\
&\quad= O(1) \sup_{t\in[-N,0]} \left| \sum_{\ell=1}^{d} \delta_\ell \sum_{i=k^*+k+1}^{k^*} \xi_{i,\ell} - W_n^{(1)}(-t) \right| + o_P(1) \\
&\quad= o_P(1),
\end{align*}

where $k$ denotes the integer part of $t/\|\boldsymbol\delta_n\|_2^2$. Similarly,

$$\sup_{t\in[0,N]} \left| \frac1n \sum_{\ell=1}^{d} \hat E^{(3)}_{k+k^*,\ell}\hat D^{(4)}_{k+k^*,\ell} - 2\theta(1-\theta)W_n^{(2)}(t) \right| = o_P(1).$$

It remains to verify that the remaining parts in displays (4.1) and (4.2) do not contribute to the limit distribution. We observe that

\begin{align*}
\max_{k^*-N_\delta\le k\le k^*} \left| \frac1n \hat D^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell} \right| &= \max_{k^*-N_\delta\le k\le k^*} \left| \frac{k^*-k}{n}\,\frac{n-k^*}{n}\,\delta_\ell \left( \sum_{i=1}^{k}\hat\xi_{i,\ell} + \sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k+k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} \right) \right| + o_P(1) \\
&= O(1) \max_{k^*-N_\delta\le k\le k^*} \frac{1}{n\|\boldsymbol\delta_n\|_2} \left| \sum_{i=1}^{k}\hat\xi_{i,\ell} + \sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k+k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} \right| + o_P(1) \\
&= o_P(1),
\end{align*}
since, by (4.6) and the weak convergence of partial sums,

\begin{align*}
\max_{k^*-N_\delta\le k\le k^*} \frac{1}{n\|\boldsymbol\delta_n\|_2} \left| \sum_{i=1}^{k}\hat\xi_{i,\ell} \right| &\le \max_{1\le k\le k^*} \frac{1}{n\|\boldsymbol\delta_n\|_2} \left| \sum_{i=1}^{k}\hat\xi_{i,\ell} \right| = \sqrt{\frac{1}{n\|\boldsymbol\delta_n\|_2^2}} \max_{1\le k\le k^*} \frac{1}{\sqrt n} \left| \sum_{i=1}^{k}\hat\xi_{i,\ell} \right| \\
&= \sqrt{\frac{1}{n\|\boldsymbol\delta_n\|_2^2}} \left( \max_{1\le k\le k^*} \frac{1}{\sqrt n} \left| \sum_{i=1}^{k}\xi_{i,\ell} \right| + o_P(1) \right) = o_P(1). \tag{4.9}
\end{align*}

Next, we note that (4.6) and the weak convergence of partial sums imply that

\begin{align*}
\max_{k^*-N_\delta\le k\le k^*} \left| \sum_{i=k+1}^{k^*}\hat\xi_{i,\ell} \right| &\stackrel{\mathcal D}{=} \max_{1\le k\le N_\delta} \left| \sum_{i=1}^{k}\hat\xi_{i,\ell} \right| = \sqrt{N_\delta} \max_{1\le k\le N_\delta} \frac{1}{\sqrt{N_\delta}} \left| \sum_{i=1}^{k}\hat\xi_{i,\ell} \right| \\
&= \sqrt{N_\delta} \left( \max_{1\le k\le N_\delta} \frac{1}{\sqrt{N_\delta}} \left| \sum_{i=1}^{k}\xi_{i,\ell} \right| + o_P(1) \right) = O_P\big(\sqrt{N_\delta}\big). \tag{4.10}
\end{align*}

Similarly,

$$\max_{k^*-N_\delta\le k\le k^*} \left| \frac{k^*-k}{n} \sum_{i=1}^{n}\hat\xi_{i,\ell} \right| = o_P\big(\sqrt{N_\delta}\big). \tag{4.11}$$

Hence, we have from (4.9)–(4.11) that

\begin{align*}
\max_{k^*-N_\delta\le k\le k^*} \left| \frac1n \hat E^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell} \right| &= \max_{k^*-N_\delta\le k\le k^*} \frac1n \left| \left( \sum_{i=k+1}^{k^*}\hat\xi_{i,\ell} + \frac{k-k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} \right) \left( \sum_{i=1}^{k}\hat\xi_{i,\ell} + \sum_{i=1}^{k^*}\hat\xi_{i,\ell} - \frac{k+k^*}{n}\sum_{i=1}^{n}\hat\xi_{i,\ell} \right) \right| \\
&= \frac1n\, O_P\big(\sqrt{N_\delta}\big)\, O_P\big(\sqrt n\big) = o_P(1).
\end{align*}

Similar arguments hold for the other terms coming from (4.2). The proof is complete. □

Appendix A. Verification of Eq. (4.4)

Lemma A.1. Under the assumptions of Theorem 2.1 it holds that, for all $1 \le \ell \le d$ and $\varepsilon > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\left( \max_{1\le k\le k^*-N} \frac{|\hat E^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell}|}{|\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} \ge \varepsilon \right) = 0.$$

Proof. Let $1 \le \ell \le d$ and $1 \le k \le k^* - N$ for some $N \ge 1$. From the definition in (4.1) and the argument leading to display (4.3) it follows that the absolute value of the estimated deterministic term $|\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|$ has precise stochastic order $n(k^*-k)$. Hence,

$$\max_{1\le k\le k^*-N} \frac{|\hat E^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell}|}{|\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} = O(1) \max_{1\le k\le k^*-N} \frac{|\hat E^{(1)}_{k,\ell}|}{k^*-k} \max_{1\le k\le k^*-N} \frac{|\hat E^{(2)}_{k,\ell}|}{n} = O(1)\,M_1(N,n)\,M_2(N,n),$$

where $O(1)$ does not depend on $N$. We start by examining $M_1(N,n)$. For any $\alpha \in (1/2, 1)$ we have that

$$\max_{1\le k\le k^*-N} \frac{1}{k^*-k} \left| \sum_{i=k+1}^{k^*}\hat\xi_{i,\ell} \right| \stackrel{\mathcal D}{=} \max_{N<k\le k^*} \frac1k \left| \sum_{i=1}^{k}\hat\xi_{i,\ell} \right| \le \frac{1}{N^{1-\alpha}} \sup_{k\ge1} \frac{1}{k^\alpha} \left| \sum_{i=1}^{k}\hat\xi_{i,\ell} \right|.$$

Replacing the functional limit theorem with the law of the iterated logarithm in (4.6), we conclude that

$$\sup_{k\ge1} \frac{1}{k^\alpha} \left| \sum_{i=1}^{k}\hat\xi_{i,\ell} - \sum_{i=1}^{k}\xi_{i,\ell} \right| = o_P(1).$$

The law of the iterated logarithm for partial sums yields

$$\sup_{k\ge1} \frac{1}{k^\alpha} \left| \sum_{i=1}^{k}\xi_{i,\ell} \right| = O_P(1).$$

Similarly, by (4.6) and the central limit theorem,

$$\frac1n \left| \sum_{i=1}^{n}\hat\xi_{i,\ell} \right| = o_P(1).$$

Thus we have, for all $\varepsilon > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\big( M_1(N,n) \ge \varepsilon \big) = 0.$$

Similar arguments can be applied to $M_2(N,n)$ and we get

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\big( M_2(N,n) \ge \varepsilon \big) = 0,$$

resulting in

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\big( M_1(N,n)\,M_2(N,n) \ge \varepsilon \big) = 0,$$

and the lemma is proved. □

Lemma A.2. Under the assumptions of Theorem 2.1 it holds that, for all $1 \le \ell \le d$ and $\varepsilon > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\left( \max_{1\le k\le k^*-N} \frac{|\hat E^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|}{|\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} \ge \varepsilon \right) = 0.$$

Proof. Write

$$\max_{1\le k\le k^*-N} \frac{|\hat E^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|}{|\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} = O_P(1) \max_{1\le k\le k^*-N} \frac{|\hat E^{(1)}_{k,\ell}|}{k^*-k} \max_{1\le k\le k^*-N} \frac{|\hat D^{(2)}_{k,\ell}|}{n} = O_P(1)\,M_1(N,n)\,M_3(N,n),$$

where $M_1(N,n)$ has already been dealt with in Lemma A.1. Noticing that

$$M_3(N,n) = \max_{1\le k\le k^*-N} \left| \frac{k+k^*}{n}\,\frac{n-k^*}{n}\,\delta_\ell \right| + o_P(1) = O_P(1)$$

then yields the assertion. □

Lemma A.3. Under the assumptions of Theorem 2.1 it holds that, for all $1 \le \ell \le d$ and $\varepsilon > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\left( \max_{1\le k\le k^*-N} \frac{|\hat D^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell}|}{|\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} \ge \varepsilon \right) = 0.$$

Proof. In an analogous fashion, we obtain

$$\max_{1\le k\le k^*-N} \frac{|\hat D^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell}|}{|\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} = O(1) \max_{1\le k\le k^*-N} \frac{|\hat D^{(1)}_{k,\ell}|}{k^*-k} \max_{1\le k\le k^*-N} \frac{|\hat E^{(2)}_{k,\ell}|}{n} = O_P(1)\,M_4(N,n)\,M_2(N,n)$$

with $M_2(N,n)$ from Lemma A.1. Therefore,

$$M_4(N,n) = \max_{1\le k\le k^*-N} \frac{n-k^*}{n}\,|\delta_\ell| + o_P(1) = O_P(1)$$

gives the result. □

Similar calculations can be performed for the terms appearing in display (4.2). Details are omitted.

Appendix B. Verification of Eq. (4.7)

Lemma B.1. Under the assumptions of Theorem 2.2 it holds that, for all $\varepsilon > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\left( \max_{1\le k\le k^*-N_\delta} \frac{\sum_{\ell=1}^{d} |\hat E^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell}|}{\sum_{\ell=1}^{d} |\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} \ge \varepsilon \right) = 0.$$

Proof. Observe that, uniformly in $k$,

$$\sum_{\ell=1}^{d} |\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}| \sim_P n(k^*-k)\|\boldsymbol\delta_n\|_2^2.$$

Therefore, for any $1 \le \ell \le d$,

$$\max_{1\le k\le k^*-N_\delta} \frac{|\hat E^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell}|}{\sum_{\ell=1}^{d} |\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} = O_P(1) \max_{1\le k\le k^*-N_\delta} \frac{|\hat E^{(1)}_{k,\ell}|}{(k^*-k)\|\boldsymbol\delta_n\|_2} \max_{1\le k\le k^*-N_\delta} \frac{|\hat E^{(2)}_{k,\ell}|}{n\|\boldsymbol\delta_n\|_2} = O_P(1)\,M_1^\delta(N,n)\,M_2^\delta(N,n),$$


where OP (1) does not depend on N. We first study the asymptotics of M1δ (N , n). To this end note first that

k∗ k
ξ̂i,`
P
ξ̂i,`
P
i =1 D i=1
max = max .
1≤k≤k∗ −Nδ (k∗ − k)kδn k2 Nδ ≤k≤k∗ +1 kkδn k2
Following the proof of (4.6) we get

$$\max_{N_\delta\le k\le k^*+1} \frac{1}{k\|\boldsymbol\delta_n\|_2} \left| \sum_{i=1}^{k} (\hat\xi_{i,\ell} - \xi_{i,\ell}) \right| \le o_P(1)\, \frac{1}{\|\boldsymbol\delta_n\|_2} \max_{N_\delta\le k\le k^*+1} \frac1k \left[ \int_0^1 \Big( \sum_{i=1}^{k} Y_i(t) \Big)^2 dt \right]^{1/2} = o_P(1)\, \frac{1}{\|\boldsymbol\delta_n\|_2\sqrt{N_\delta}} = o_P(1)$$
by the Hájek–Rényi inequality in Hilbert spaces (see [6]). Using the same inequality a second time gives

$$P\left( \max_{N_\delta\le k\le k^*+1} \frac{1}{k\|\boldsymbol\delta_n\|_2} \left| \sum_{i=1}^{k}\xi_{i,\ell} \right| \ge x \right) \le \frac{C}{x^2}\, \frac{1}{\|\boldsymbol\delta_n\|_2^2\, N_\delta}$$

with some positive constant $C$. Hence, for all $x > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\left( \max_{N_\delta\le k\le k^*+1} \frac{1}{k\|\boldsymbol\delta_n\|_2} \left| \sum_{i=1}^{k}\xi_{i,\ell} \right| \ge x \right) = 0.$$

Furthermore, from (4.6) and the central limit theorem we deduce

$$\max_{1\le k\le k^*-N_\delta} \frac{1}{n\|\boldsymbol\delta_n\|_2} \left| \sum_{i=1}^{n}\hat\xi_{i,\ell} \right| = o_P(1) \qquad (n \to \infty).$$

Since the same arguments apply also to the term $M_2^\delta(N,n)$, we conclude that

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\big( M_1^\delta(N,n)\,M_2^\delta(N,n) \ge \varepsilon \big) = 0.$$

This proves the assertion. □

Lemma B.2. Under the assumptions of Theorem 2.2 it holds that, for all $\varepsilon > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\left( \max_{1\le k\le k^*-N_\delta} \frac{\sum_{\ell=1}^{d} |\hat E^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|}{\sum_{\ell=1}^{d} |\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} \ge \varepsilon \right) = 0.$$

Proof. Along the lines of the previous proof, we may write

$$\max_{1\le k\le k^*-N_\delta} \frac{|\hat E^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|}{\sum_{\ell=1}^{d} |\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} = O_P(1) \max_{1\le k\le k^*-N_\delta} \frac{|\hat E^{(1)}_{k,\ell}|}{(k^*-k)\|\boldsymbol\delta_n\|_2} \max_{1\le k\le k^*-N_\delta} \frac{|\hat D^{(2)}_{k,\ell}|}{n\|\boldsymbol\delta_n\|_2} = O_P(1)\,M_1^\delta(N,n)\,M_3^\delta(N,n),$$

where

$$M_3^\delta(N,n) = \max_{1\le k\le k^*-N_\delta} \left| \frac{k+k^*}{n}\,\frac{n-k^*}{n}\,\frac{\delta_\ell}{\|\boldsymbol\delta_n\|_2} \right| + o_P(1) = O_P(1).$$

Since $M_1^\delta(N,n)$ has already been estimated in Lemma B.1, the proof is complete. □

Lemma B.3. Under the assumptions of Theorem 2.2 it holds that, for all $\varepsilon > 0$,

$$\lim_{N\to\infty}\limsup_{n\to\infty} P\left( \max_{1\le k\le k^*-N_\delta} \frac{\sum_{\ell=1}^{d} |\hat D^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell}|}{\sum_{\ell=1}^{d} |\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} \ge \varepsilon \right) = 0.$$

Proof. Write

$$\max_{1\le k\le k^*-N_\delta} \frac{|\hat D^{(1)}_{k,\ell}\hat E^{(2)}_{k,\ell}|}{\sum_{\ell=1}^{d} |\hat D^{(1)}_{k,\ell}\hat D^{(2)}_{k,\ell}|} = O_P(1)\,M_4^\delta(N,n)\,M_2^\delta(N,n)$$

with

$$M_4^\delta(N,n) = \max_{1\le k\le k^*-N_\delta} \left| \frac{n-k^*}{n}\,\frac{\delta_\ell}{\|\boldsymbol\delta_n\|_2} \right| + o_P(1) = O_P(1)$$

and the lemma is proved. □

Again, the same arguments give the corresponding results for the terms in (4.2).

References

[1] A. Antoniadis, T. Sapatinas, Wavelet methods for continuous time prediction using Hilbert-valued autoregressive processes, Journal of Multivariate
Analysis 87 (2003) 133–158.
[2] A. Antoniadis, T. Sapatinas, Estimation and inference in functional mixed-effects models, Computational Statistics and Data Analysis 51 (2007) 4793–4813.
[3] J.-M. Chiou, H.-G. Müller, J.-L. Wang, Functional response models, Statistica Sinica 14 (2004) 675–693.
[4] B. Fernández de Castro, S. Guillas, W. González Manteiga, Functional samples and bootstrap for predicting sulfur dioxide levels, Technometrics 47 (2005) 212–222.
[5] A. Laukaitis, A. Račkauskas, Functional data analysis for clients segmentation tasks, European Journal of Operational Research 163 (2005) 210–216.
[6] H.-G. Müller, U. Stadtmüller, Generalized functional linear models, The Annals of Statistics 33 (2005) 774–805;
V.V. Petrov, Limit Theorems of Probability Theory, Clarendon, Oxford, 1995.
[7] F. Yao, H.-G. Müller, J.-L. Wang, Functional linear regression analysis for longitudinal data, The Annals of Statistics 33 (2005) 2873–2903.
[8] R.H. Glendinning, S.L. Fleet, Classifying functional time series, Signal Processing 87 (2007) 79–100.
[9] P. Kokoszka, I. Maslova, J. Sojka, L. Zhu, Testing for lack of dependence in functional linear model, Canadian Journal of Statistics 36 (2008) 207–222.

[10] P. Kokoszka, I. Maslova, J. Sojka, L. Zhu, Effect of substorms on the magnetic field variability at mid- and low-latitudes, Technical Report, Utah State
University, 2008.
[11] G.W. Cobb, The problem of the Nile: Conditional solution to a change-point problem, Biometrika 65 (1978) 243–251.
[12] C. Inclán, G.C. Tiao, Use of cumulative sums of squares for retrospective detection of changes of variance, Journal of the American Statistical Association 89 (1994) 913–923.
[13] R.A. Davis, D. Huang, Y.-C. Yao, Testing for a change in the parameter values and order of an autoregressive model, The Annals of Statistics 23 (1995) 282–304.
[14] J. Antoch, M. Hušková, Z. Prášková, Effect of dependence on statistics for determination of change, Journal of Statistical Planning and Inference 60 (1997) 291–310.
[15] R. Garcia, E. Ghysels, Structural change and asset pricing in emerging markets, Journal of International Money and Finance 17 (1998) 455–473.
[16] L. Horváth, P.S. Kokoszka, J. Steinebach, Testing for changes in multivariate dependent observations with applications to temperature changes, Journal
of Multivariate Analysis 68 (1999) 96–119.
[17] P.S. Kokoszka, R. Leipus, Change-point estimation in ARCH models, Bernoulli 6 (2000) 513–539.
[18] I. Berkes, R. Gabrys, L. Horváth, P. Kokoszka, Detecting changes in the mean of functional observations, Journal of the Royal Statistical Society, Series B (2009) (in press).
[19] J. Indritz, Methods in Analysis, Macmillan, 1963.
[20] D. Bosq, Linear Processes in Function Spaces, Springer, New York, 2000.
[21] J. Dauxois, A. Pousse, Y. Romain, Asymptotic theory for principal component analysis of a vector random function, Journal of Multivariate Analysis 12
(1982) 136–154.
[22] M. Csörgő, L. Horváth, Limit Theorems in Change-Point Analysis, Wiley, Chichester, 1997.
[23] J.O. Ramsay, B.W. Silverman, Functional Data Analysis, Springer, New York, 2005.
[24] G.R. Shorack, J.A. Wellner, Empirical Processes with Applications to Statistics, Wiley, New York, 1986.
