Functional Canonical Analysis for Square Integrable Stochastic...


Page 1: Functional Canonical Analysis for Square Integrable Stochastic …anson.ucdavis.edu/~mueller/jmva2000-0132revised.pdf · 2002-09-22 · spaces of square integrable functions on T1

Functional Canonical Analysis for Square Integrable Stochastic Processes¹

May 30, 2002 (Revision)

Guozhong He², Hans-Georg Müller³ and Jane-Ling Wang³

¹ Research supported by NSF Grants DMS 98-03627, DMS 99-71602, and NIH Grant 99-SC-NIH-1028.

² The California Diabetes Control Program, University of California, San Francisco/DHS, 601 North 7th Street, MS 675, Sacramento, CA 94234, USA.

³ Department of Statistics, University of California, One Shields Avenue, Davis, CA 95616, USA.

We study the extension of canonical correlation from pairs of random vectors to the

case where a data sample consists of pairs of square integrable stochastic processes. Basic

questions concerning the definition and existence of functional canonical correlation are

addressed and sufficient criteria for the existence of functional canonical correlation are

presented. Various properties of functional canonical analysis are discussed. We consider

a canonical decomposition, in which the original processes are approximated by means of

their canonical components.

Key words: Canonical correlation, canonical decomposition, covariance operator,

functional data analysis, Hilbert-Schmidt operator, inverse problem

1. Introduction

Increasingly, data are collected in the form of random functions or curves. Such curve

data may be generated by densely spaced repeated measurements, for example in longitudinal

studies, or by automatic recordings of a quantity over time. This type of data is becoming

more prevalent throughout the sciences and in financial markets, as automated on-line data collection facilities become ubiquitous. Functional data analysis (FDA) is

concerned with data for which the ith observation consists of one or several infinite

dimensional objects such as curves or surfaces (see the book by Ramsay and Silverman

(1997) for an excellent overview). Typically, these objects are considered to be random

elements of some functional space. Many research questions and statistical modeling issues

are indeed best described in functional terms. This motivates the extension of classical

concepts of multivariate data analysis (MDA), such as principal components analysis,

canonical correlation analysis, and linear modeling, to the infinite-dimensional functional

domain.

In this paper, we consider the situation where a data sample consists of pairs of observed

functions. The study of the dependence between the two functions recorded for a sample of

subjects is then often of interest. We thus aim at extending methods for analyzing the linear

correlation between paired observations from the multivariate to the infinite-dimensional

case.

Several approaches have been developed previously for extending multivariate canonical

correlation (Hotelling, 1936) to the functional case. In early work on this problem, Hannan

(1961) and Brillinger (1975) described canonical analysis for multivariate stationary time

series. By invoking stationarity, the problem in this setting may be reduced to classical

multivariate canonical analysis. A theoretical approach based on angles between subspaces

of functions was developed by Dauxois and Nkiet (1977). A sample version of smoothed

functional canonical correlation was defined in Leurgans, Moyeed and Silverman (1993),

who demonstrated the need for regularization in functional canonical correlation analysis.

They implemented regularization via modified smoothing splines and demonstrated their

technique with an application to the study of human gait movement data; compare also

Olshen et al. (1989). For the related question of extending principal components from

multivariate to functional data, we refer to Rice and Silverman (1991). A different and

promising approach aiming at covariance rather than correlation for pairs of random curves

was proposed in Service, Rice, and Chavez (1998). Regularization for canonical correlation

amounts to restricting the dimension of the problem and can be achieved via a judicious choice of the roughness penalty for smoothing splines as in Leurgans et al. (1993), or by alternative approaches that avoid the inversion problem, as in Service et al. (1998).

A third approach that will be discussed below is to approximate processes by finite

expansions, for example in terms of eigenfunctions, and to apply canonical correlation

analysis to the resulting finite-dimensional principal components.

In this paper, we address two issues: First, the problem of defining functional canonical correlation in infinite-dimensional space and of identifying conditions under which it is well defined. Second, the representation of pairs of square integrable processes in terms of

canonical basis functions.

The paper is organized as follows: In Section 2, we provide basic notation and introduce

functional canonical correlation based on a classical definition. An alternative definition is the subject of Section 3. The main results on existence can be found in Section 4, and a

functional canonical representation is established in Section 5, including a discussion of

special cases and examples. Proofs and auxiliary results are collected in Section 6, and some

pertinent facts from functional analysis in an appendix.

2. Canonical Correlation for Random Vectors and Random Functions

Suppose we observe a sample of bivariate processes (X, Y), where $X \in L^2(T_1)$ and $Y \in L^2(T_2)$ are jointly distributed $L^2$-processes,

$$E\|Z\|^2 = E[\langle Z, Z\rangle] = E\int_T (Z(s))^2\,ds < \infty, \quad \text{for } Z = X \text{ or } Z = Y.$$

Here, $T_1$ and $T_2$ are index sets (intervals or countable sets), and $L^2(T_1)$, $L^2(T_2)$ are two Hilbert spaces of square integrable functions on $T_1$ and $T_2$ with respect to measures $\mu_1$ and $\mu_2$ (usually Lebesgue measure or counting measure), with scalar products $\langle u, v\rangle = \int u(s)v(s)\,d\mu_i(s)$, for i = 1, 2. Canonical correlation as defined for finite-dimensional vectors, $X \in \mathbb{R}^{k_1}$, $Y \in \mathbb{R}^{k_2}$, and formally for stochastic processes $X \in L^2(T_1)$, $Y \in L^2(T_2)$, is characterized as follows: Let $H_i = \mathbb{R}^{k_i}$ in the vector case, $H_i = L^2(T_i)$ in the functional case.

Then the first canonical correlation ρ1 and associated weight functions or vectors u1 and v1

are defined as

(1) $$\rho_1 = \sup_{u \in H_1,\, v \in H_2} \mathrm{Cov}(\langle u, X\rangle, \langle v, Y\rangle) = \mathrm{Cov}(\langle u_1, X\rangle, \langle v_1, Y\rangle),$$

where u and v are subject to

(2) $$\mathrm{Var}(\langle u, X\rangle) = 1, \quad \mathrm{Var}(\langle v, Y\rangle) = 1.$$

The k-th canonical correlation and weight functions $\rho_k, u_k, v_k$ for (X, Y), for k > 1, are defined as

(3) $$\rho_k = \sup_{u \in H_1,\, v \in H_2} \mathrm{Cov}(\langle u, X\rangle, \langle v, Y\rangle) = \mathrm{Cov}(\langle u_k, X\rangle, \langle v_k, Y\rangle),$$

where u and v are subject to (2), and the k-th pair of canonical variables

(4) $$(U_k, V_k) \text{ is uncorrelated with the } (k-1) \text{ pairs } \{(U_i, V_i),\ i = 1, \ldots, k-1\},$$

where $U_k = \langle u_k, X\rangle$ and $V_k = \langle v_k, Y\rangle$. We shall call $(\rho_k, u_k, v_k, U_k, V_k)$ the k-th canonical components.

For any X and Y, we say that X and Y are uncorrelated if all their canonical correlations are zero. Equivalently, Corr(X, Y) = 0 if and only if $\rho_1 = 0$, since $\rho_1 \ge \rho_2 \ge \ldots \ge 0$.

Regarding the cases where X, Y are stochastic processes, we first consider the special case

where processes X and Y can be represented by a finite number of orthonormal basis

functions. For such finite-dimensional processes, functional canonical analysis is equivalent

to the usual multivariate canonical analysis for the random coefficient vectors. To see this,

let

$$X(t) = \mu_X(t) + \sum_{i=1}^{k_1} \xi_i\,\theta_i(t),\ t \in T_1, \qquad Y(t) = \mu_Y(t) + \sum_{i=1}^{k_2} \zeta_i\,\varphi_i(t),\ t \in T_2,$$

where $\{\theta_i\}$ and $\{\varphi_i\}$ are the first $k_1$, respectively $k_2$, elements of orthonormal bases of $L^2(T_1)$ and $L^2(T_2)$, respectively, and $\{\xi_i\}$ and $\{\zeta_i\}$ are random variables with zero means and finite variances. We adopt the notation

$$\theta(t) = (\theta_1(t), \ldots, \theta_{k_1}(t))^T, \quad \varphi(t) = (\varphi_1(t), \ldots, \varphi_{k_2}(t))^T, \quad 1 < k_1, k_2 < \infty,$$

$$\xi = (\xi_1, \ldots, \xi_{k_1})^T, \quad \zeta = (\zeta_1, \ldots, \zeta_{k_2})^T,$$

with

$$E[\xi] = 0,\ \mathrm{Var}[\xi] = R_{11},\ E[\zeta] = 0,\ \mathrm{Var}[\zeta] = R_{22},\ \mathrm{Cov}[\xi, \zeta] = E[\xi\zeta^T] = R_{12}.$$

Without loss of generality we assume $\mu_X(t) = 0$, $\mu_Y(t) = 0$. Then processes X and Y can be written in vector form as

(5) $$X(t) = \xi^T\theta(t), \quad Y(t) = \zeta^T\varphi(t).$$

As demonstrated in the following theorem, canonical correlations for X and Y in this case are

the same as the canonical correlations for the random vectors ξ and ζ.

Theorem 2.1 The i-th canonical component of $(\xi, \zeta)$, defined by $(\sigma_i, u_i, v_i)$, is related to the i-th canonical component of $(X(t), Y(t))$, $(\rho_i, u_i(t), v_i(t))$, through

$$u_i(t) = u_i^T\theta(t), \quad v_i(t) = v_i^T\varphi(t), \quad \rho_i = \sigma_i.$$

Canonical correlation analysis for finite dimensional processes is therefore equivalent to

multivariate canonical correlation. In this simple situation, functional canonical correlation

analysis is then well defined. Difficulties arise however in the more realistic situation where

processes X and Y are genuinely infinite-dimensional, for the following reasons: First, the

definition in this case requires that there are countably many canonical correlations. This

means that the cross-correlation operator R defined in (10) below must be compact.

Secondly, the operator R involves inverse operators. Inversion of functional operators in

functional space is delicate as a compact operator is not invertible in infinite-dimensional

spaces. Thirdly, the canonical weight functions, ui and vi, may not be square integrable. We address these genuine difficulties in Section 4. An alternative characterization of canonical correlation for the infinite-dimensional case, which is well known for the multivariate case and is useful for our investigation, is studied in the next section.

3. Alternative Characterization of Canonical Correlation

Consider the case of random vectors, i.e., $H_1 = \mathbb{R}^{k_1}$, $H_2 = \mathbb{R}^{k_2}$, and $r_{XX} = \mathrm{Cov}(X) = E(X - EX)(X - EX)^T$, $r_{YY} = \mathrm{Cov}(Y)$ and $r_{XY} = \mathrm{Cov}(X, Y) = E(X - EX)(Y - EY)^T$. The $(k_1 \times k_1)$ respectively $(k_2 \times k_2)$ covariance matrices $r_{XX}$, $r_{YY}$ are symmetric and nonnegative definite. It is then well known that

(6) $$\rho_k = \sup_{u \in H_1,\ \langle u, r_{XX}u\rangle = 1,\ v \in H_2,\ \langle v, r_{YY}v\rangle = 1} \langle u, r_{XY}v\rangle = \langle u_k, r_{XY}v_k\rangle,$$

where for $U_k = \langle u_k, X\rangle$, $V_k = \langle v_k, Y\rangle$, the pairs $(U_k, V_k)$ are uncorrelated with $(U_i, V_i)$ for $i = 1, \ldots, k-1$, $k \le \min(k_1, k_2)$, and $u_k$, $v_k$ are the k-th weight functions.

Extending this characterization of canonical correlation to the functional case, where

$H_1 = L^2(T_1)$, $H_2 = L^2(T_2)$, we define the covariance functions

$$r_{XX}(s, t) = \mathrm{Cov}[X(s), X(t)], \quad s, t \in T_1;$$

$$r_{YY}(s, t) = \mathrm{Cov}[Y(s), Y(t)], \quad s, t \in T_2;$$

$$r_{XY}(s, t) = \mathrm{Cov}[X(s), Y(t)], \quad s \in T_1,\ t \in T_2,$$

and the covariance operator $R_{XX}: L^2(T_1) \to L^2(T_1)$,

(7) $$R_{XX}u(s) = \int_{T_1} r_{XX}(s, t)\,u(t)\,dt, \quad u \in L^2(T_1),$$

and analogously operators $R_{YY}: L^2(T_2) \to L^2(T_2)$, and $R_{XY}: L^2(T_2) \to L^2(T_1)$. Operators $R_{XX}$ and $R_{YY}$ are compact, self-adjoint, and nonnegative definite, and $R_{XY}$ is compact.
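The action of a covariance operator as in (7) can be explored numerically. The following is a minimal sketch, assuming purely for illustration (this choice is not from the paper) the Brownian-motion covariance r(s, t) = min(s, t) on [0, 1] and a midpoint-rule discretization of the integral; the helper names are hypothetical. For this kernel the largest eigenvalue of the operator is known to be 4/π², which the discretized operator should approximately reproduce.

```python
import math

def brownian_cov(s, t):
    # Illustrative covariance function (an assumption, not from the paper).
    return min(s, t)

def covariance_operator_matrix(r, n):
    """Midpoint-rule discretization of (R u)(s) = int_0^1 r(s, t) u(t) dt."""
    h = 1.0 / n
    grid = [(i + 0.5) * h for i in range(n)]
    return [[r(s, t) * h for t in grid] for s in grid]

def top_eigenvalue(matrix, iterations=60):
    """Largest eigenvalue of a symmetric nonnegative definite matrix
    via power iteration."""
    n = len(matrix)
    u = [1.0 / math.sqrt(n)] * n
    value = 0.0
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, u)) for row in matrix]
        value = math.sqrt(sum(x * x for x in w))
        u = [x / value for x in w]
    return value

K = covariance_operator_matrix(brownian_cov, 100)
lambda_1 = top_eigenvalue(K)
print(lambda_1, 4 / math.pi**2)
```

The discretized matrix is symmetric because the kernel is symmetric, mirroring the self-adjointness of the operator.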

Since

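$$\mathrm{Cov}(\langle u, X\rangle, \langle v, Y\rangle) = E\{[\langle u, X\rangle - E(\langle u, X\rangle)][\langle v, Y\rangle - E(\langle v, Y\rangle)]\} = E\{[\langle u, X - E(X)\rangle][\langle v, Y - E(Y)\rangle]\} = \langle u, R_{XY}v\rangle,$$

and

$$\mathrm{Var}(\langle u, X\rangle) = \langle u, R_{XX}u\rangle, \quad \mathrm{Var}(\langle v, Y\rangle) = \langle v, R_{YY}v\rangle,$$

a characterization of the k-th canonical correlation and weight functions analogous to (6) is given by:

(8) $$\rho_k = \sup_{u \in L^2(T_1),\ \langle u, R_{XX}u\rangle = 1,\ v \in L^2(T_2),\ \langle v, R_{YY}v\rangle = 1} \langle u, R_{XY}v\rangle = \langle u_k, R_{XY}v_k\rangle,$$

where, in addition, for k > 1,

(9) $$(U_k, V_k) \text{ is uncorrelated with } (U_i, V_i) \text{ for } i = 1, \ldots, k-1.$$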

The canonical components are solely determined by the covariance functions of processes

X and Y, and are not affected by their means. We therefore assume throughout the rest of the

paper that the means vanish,

$$E[X(t)] = 0,\ t \in T_1, \qquad E[Y(s)] = 0,\ s \in T_2.$$


4. Existence of functional canonical correlations and functional canonical weight functions

Intuitively, maximizing the r.h.s. of (8), given the constraints (9), is equivalent to an

eigenanalysis of the cross-correlation operator R of X and Y ,

(10) $$R = R_{XX}^{-1/2}\, R_{XY}\, R_{YY}^{-1/2}.$$

That this in fact holds under certain assumptions on the processes is one of the basic

results below (Theorem 4.8). Compactness of the cross-correlation operator R is therefore

a natural condition in order to guarantee that functional canonical correlations exist and are

interpretable. The basic problem is that, unlike the usual situation in the finite dimensional

case, the square roots of covariance operators of L2-processes are not invertible. In infinite-

dimensional spaces canonical correlation corresponds to an inverse problem.

Our approach is to consider a subset of L2, on which the inverse of a compact operator

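can be defined. Following Conway (1985), the range of $R_{XX}^{1/2}$, given by

(11) $$F_{XX} = \{R_{XX}^{1/2}h : h \in L^2(T_1)\},$$

is characterized by

$$F_{XX} = \Big\{f \in L^2(T_1) : \sum_{i=1}^{\infty} \lambda_{Xi}^{-1}\,|\langle f, \theta_i\rangle|^2 < \infty,\ f \perp \ker(R_{XX})\Big\},$$

where $\{\lambda_{Xi}, \theta_i\}$ are the non-zero eigenvalues and eigenvectors of $R_{XX}$, and $\ker(R_{XX}) = \{h \in L^2(T_1) : R_{XX}h = 0\}$. Defining

(12) $$F_{XX}^{-1} = \Big\{h \in L^2(T_1) : h = \sum_{i=1}^{\infty} \lambda_{Xi}^{-1/2}\,\langle f, \theta_i\rangle\,\theta_i,\ f \in F_{XX}\Big\},$$

we find that $R_{XX}^{1/2}$ is a one-to-one mapping from the vector space $F_{XX}^{-1} \subset L^2(T_1)$ onto the vector space $F_{XX}$. Thus, restricting the domain of the operator $R_{XX}^{1/2}$ to the subset $F_{XX}^{-1}$, we can define its inverse for $f \in F_{XX}$ as

$$R_{XX}^{-1/2}f = \sum_{i=1}^{\infty} \lambda_{Xi}^{-1/2}\,\langle f, \theta_i\rangle\,\theta_i.$$

Then $R_{XX}^{-1/2}$ satisfies the usual properties of an inverse in the sense that

$$R_{XX}^{1/2}R_{XX}^{-1/2}f = f, \text{ for all } f \in F_{XX}, \quad \text{and} \quad R_{XX}^{-1/2}R_{XX}^{1/2}h = h, \text{ for all } h \in F_{XX}^{-1}.$$

Similarly, we define subspaces $F_{YY}$, $F_{YY}^{-1} \subset L^2(T_2)$ for $R_{YY}^{1/2}$, and define its inverse as

(13) $$R_{YY}^{-1/2}f = \sum_{i=1}^{\infty} \lambda_{Yi}^{-1/2}\,\langle f, \varphi_i\rangle\,\varphi_i, \quad \text{for } f \in F_{YY}.$$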

This process is reminiscent of finding a generalized inverse of a matrix.


Denote the adjoint operator of an operator A by A*. The following condition will ensure

that dom R = $F_{YY}$ and dom R* = $F_{XX}$, where dom refers to the domain on which these operators are defined.

Condition 4.1 For $F_{XX}$ and $F_{YY}$ defined as above, let

$$R_{XY}R_{YY}^{-1/2}(F_{YY}) \subset F_{XX}; \qquad R_{YX}R_{XX}^{-1/2}(F_{XX}) \subset F_{YY}.$$

In order to find sufficient assumptions on processes X and Y which imply that Condition

4.1 holds, we will use the Karhunen-Loève representations of square integrable processes

given by

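(14) $$X(t) = \mu_X(t) + \sum_{i=1}^{\infty} \xi_i\theta_i(t),\ t \in T_1, \quad \text{and} \quad Y(s) = \mu_Y(s) + \sum_{i=1}^{\infty} \zeta_i\varphi_i(s),\ s \in T_2.$$

Here, the $\xi_i$ are uncorrelated r.v.'s with $E\xi_i = 0$, $E\xi_i^2 = \lambda_{Xi}$, and the $\zeta_i$ are uncorrelated r.v.'s with $E\zeta_i = 0$, $E\zeta_i^2 = \lambda_{Yi}$, such that $\sum_i \lambda_{Xi} < \infty$, $\sum_i \lambda_{Yi} < \infty$. The functions $\{\theta_i, i = 1, 2, \ldots\}$ and $\{\varphi_i, i = 1, 2, \ldots\}$ are eigenfunctions of the covariance operators $R_{XX}$ and $R_{YY}$, and as such are orthonormal.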

Proposition 4.2 Let X and Y be L2 - processes with Karhunen -Loève expansions as given

by (14). Then Condition 4.1 holds if

(15) $$\sum_{i,j=1}^{\infty} (E[\xi_i^2]\,E[\zeta_j^2])^{-1}(E[\xi_i\zeta_j])^2 < \infty, \quad \text{i.e.,} \quad \sum_{i,j=1}^{\infty} \mathrm{Corr}^2(\xi_i, \zeta_j) < \infty.$$

We note that this condition is sufficient, but not necessary.

Let

R0 = R*R,

and $\lambda_1 \ge \lambda_2 \ge \ldots > 0$ be the (positive) eigenvalues of R0 with corresponding orthonormal eigenfunctions $q_1, q_2, \ldots$, where $q_i \in F_{YY}$. Define

(16) $$p_i = Rq_i / \sqrt{\lambda_i}, \quad i = 1, 2, \ldots$$

It is well known that canonical correlations and weight functions are obtained in the finite dimensional case by

(17) $$\rho_i = \sqrt{\lambda_i}, \quad u_i = R_{XX}^{-1/2}p_i, \quad \text{and} \quad v_i = R_{YY}^{-1/2}q_i, \quad \text{for } i \ge 1.$$

However, in the infinite-dimensional case, the weight functions are not well defined in $L^2$ whenever $p_i \notin F_{XX}$ or $q_i \notin F_{YY}$. The following examples illustrate this problem. We use here and in the following the tensor notation, where an operator $\theta \otimes \varphi : H \to H$ is given by

(18) $$(\theta \otimes \varphi)(h) = \langle h, \theta\rangle\,\varphi, \quad \text{for } h \in H.$$

Example 4.3 Consider the case where X and Y have the same eigenfunctions in their

Karhunen-Loève expansions, and suppose that coefficients of different indices are

uncorrelated (this example was suggested by a referee). In this case,

$$R_{XX} = \sum_{i=1}^{\infty} \lambda_{Xi}\,\theta_i \otimes \theta_i, \quad R_{YY} = \sum_{i=1}^{\infty} \lambda_{Yi}\,\theta_i \otimes \theta_i, \quad R_{XY} = \sum_{i=1}^{\infty} E(\xi_i\zeta_i)\,\theta_i \otimes \theta_i,$$

and

$$R = R_{XX}^{-1/2}R_{XY}R_{YY}^{-1/2} = \sum_{i=1}^{\infty} \frac{E(\xi_i\zeta_i)}{(E\xi_i^2\,E\zeta_i^2)^{1/2}}\,\theta_i \otimes \theta_i.$$

In order for R to be a well-defined Hilbert-Schmidt operator, we require

$$\sum_{i=1}^{\infty} \mathrm{Corr}^2(\xi_i, \zeta_i) < \infty,$$

i.e., (15) is required already for the existence of the operator R and well-defined canonical correlations $\rho_i = \mathrm{Corr}(\xi_i, \zeta_i)$. If (15) is not satisfied, then R is not a Hilbert-Schmidt operator and the canonical correlations are not all well-defined. If (15) is satisfied, then R is a Hilbert-Schmidt operator. If, in addition, (21) below is also satisfied, then the canonical weight functions will be $u_i = \lambda_{Xi}^{-1/2}\theta_i$, $v_i = \lambda_{Yi}^{-1/2}\theta_i$. That these are well-defined will be ensured by Theorem 4.8.

Example 4.4 Assume processes X and Y have Karhunen-Loève expansions (14) where

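the random variables $\xi_i, \zeta_j$ satisfy

(19) $$\lambda_{Xi} = E[\xi_i^2] = \frac{1}{i^2}, \quad \lambda_{Yj} = E[\zeta_j^2] = \frac{1}{j^2},$$

and let

(20) $$E[\xi_i\zeta_j] = \frac{1}{(i+1)^2(j+1)^2}, \quad \text{for } i, j \ge 1.$$

We show in Section 6 that equations (19) and (20) can be satisfied by a pair of processes with appropriate operators $R_{XX}$, $R_{YY}$, and $R_{XY}$. Then

$$\mathrm{Corr}(\xi_i, \zeta_j) = \frac{E[\xi_i\zeta_j]}{\sqrt{E[\xi_i^2]\,E[\zeta_j^2]}} = \frac{ij}{(i+1)^2(j+1)^2}, \quad \text{for } i, j \ge 1,$$

with

$$\sum_{i,j=1}^{\infty} \mathrm{Corr}^2(\xi_i, \zeta_j) = \sum_{i,j=1}^{\infty} \frac{i^2j^2}{(i+1)^4(j+1)^4} = \Big(\sum_{i=1}^{\infty} \frac{i^2}{(i+1)^4}\Big)^2 < \Big(\sum_{i=2}^{\infty} \frac{1}{i^2}\Big)^2 = \Big(\frac{\pi^2}{6} - 1\Big)^2.$$

Defining

$$c^2 = \sum_{i=1}^{\infty} \frac{i^2}{(i+1)^4}, \quad p = \frac{1}{c}\sum_{i=1}^{\infty} \frac{i}{(i+1)^2}\,\theta_i, \quad \text{and} \quad q = \frac{1}{c}\sum_{j=1}^{\infty} \frac{j}{(j+1)^2}\,\varphi_j,$$

and observing $\|p\| = \|q\| = 1$, and $c^2 < 1$,

$$R = \sum_{i,j=1}^{\infty} \mathrm{Corr}(\xi_i, \zeta_j)\,\theta_i \otimes \varphi_j = \sum_{i,j=1}^{\infty} \frac{ij}{(i+1)^2(j+1)^2}\,\theta_i \otimes \varphi_j = \Big(\sum_{i=1}^{\infty} \frac{i}{(i+1)^2}\,\theta_i\Big) \otimes \Big(\sum_{j=1}^{\infty} \frac{j}{(j+1)^2}\,\varphi_j\Big) = c^2\,p \otimes q.$$

Since $R^*R = c^4\,q \otimes q$, we have

$$\lambda_1 = c^4, \quad \rho_1 = \sqrt{\lambda_1} = c^2, \quad p_1 = p, \quad \text{and} \quad q_1 = q.$$

However,

$$\lim_{n \to \infty} \sum_{i=1}^{n} \frac{1}{\lambda_{Xi}}\langle p, \theta_i\rangle^2 = \lim_{n \to \infty} \sum_{i=1}^{n} \frac{1}{i^{-2}}\Big(\frac{1}{c^2}\,\frac{i^2}{(i+1)^4}\Big) = \infty,$$

which means that $p_1 \notin F_{XX}$, i.e., $u_1$ is not well defined within $L^2$.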
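The quantities in Example 4.4 can be checked with partial sums. The sketch below is only an illustration under the setup of (19) and (20): it approximates c² by truncating the series (the truncation points are arbitrary choices, and the function names are hypothetical) and evaluates partial sums of the series that decides whether p belongs to F_XX, which grow without bound.

```python
import math

def c_squared(n_terms):
    """Partial sum of c^2 = sum_{i>=1} i^2 / (i+1)^4."""
    return sum(i**2 / (i + 1) ** 4 for i in range(1, n_terms + 1))

def membership_partial_sum(n_terms, c2):
    """Partial sum of sum_i <p, theta_i>^2 / lambda_Xi, using
    lambda_Xi = 1/i^2 and <p, theta_i> = i / (c (i+1)^2)."""
    return sum(i**4 / (c2 * (i + 1) ** 4) for i in range(1, n_terms + 1))

c2 = c_squared(200_000)
bound = math.pi**2 / 6 - 1          # upper bound used in the example
partial_sums = [membership_partial_sum(n, c2) for n in (10, 100, 1000, 10000)]

print("c^2 ~", c2, " bound:", bound)
print("partial sums:", partial_sums)
```

The truncated value of c² is about 0.323, below the bound π²/6 − 1 and below 1, while the partial sums increase roughly linearly in the number of terms, in line with the conclusion that p₁ lies outside F_XX.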

The following condition is seen to guarantee the existence of well-defined weight

functions.

Condition 4.5 Let X and Y be L2-processes which satisfy

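(21) $$\text{(a)} \sum_{i,j=1}^{\infty} \frac{E^2[\xi_i\zeta_j]}{\lambda_{Xi}^2\,\lambda_{Yj}} < \infty, \quad \text{and} \quad \text{(b)} \sum_{i,j=1}^{\infty} \frac{E^2[\xi_i\zeta_j]}{\lambda_{Xi}\,\lambda_{Yj}^2} < \infty.$$

The following example demonstrates that Condition 4.5 is not satisfied by the processes X and Y of Example 4.4.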

Example 4.6 Let X and Y be the processes defined in Example 4.4. Since $i/(i+1) \ge 1/2$ for $i \ge 1$,

$$\frac{E^2[\xi_i\zeta_j]}{\lambda_{Xi}^2\,\lambda_{Yj}} = \frac{i^4j^2}{(i+1)^4(j+1)^4} \ge \frac{1}{2^8\,j^2}, \quad \text{for } i, j \ge 1,$$

and

$$\frac{E^2[\xi_i\zeta_j]}{\lambda_{Xi}\,\lambda_{Yj}^2} = \frac{i^2j^4}{(i+1)^4(j+1)^4} \ge \frac{1}{2^8\,i^2}, \quad \text{for } i, j \ge 1.$$

The first lower bound does not depend on i and the second does not depend on j, so both double sums in (21) diverge, and Condition 4.5 is not satisfied for these processes.

Example 4.7 Processes X, Y which do satisfy Condition 4.5 are given via

$$\lambda_{Xi} = E[\xi_i^2] = \frac{1}{i^2}, \quad \lambda_{Yj} = E[\zeta_j^2] = \frac{1}{j^2},$$

and

$$E[\xi_i\zeta_j] = \frac{1}{(i+1)^3(j+1)^3}, \quad \text{for } i, j \ge 1.$$

One can show (Example 4.4) that such a pair of processes X, Y exists. Then

$$\frac{E^2[\xi_i\zeta_j]}{\lambda_{Xi}^2\,\lambda_{Yj}} = \frac{i^4j^2}{(i+1)^6(j+1)^6} < \frac{1}{(i+1)^2(j+1)^4}$$

and

$$\frac{E^2[\xi_i\zeta_j]}{\lambda_{Xi}\,\lambda_{Yj}^2} = \frac{i^2j^4}{(i+1)^6(j+1)^6} < \frac{1}{(i+1)^4(j+1)^2},$$

so that Condition 4.5 is satisfied.
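The contrast between Examples 4.6 and 4.7 can be made visible with partial sums of the series in (21)(a). This is a small illustrative sketch (the function name and truncation points are assumptions): with eigenvalues 1/i² and 1/j², the double sum factorizes into a sum over i times a sum over j, and the exponent `power` selects the cross-moments of Example 4.4 (power 2) or Example 4.7 (power 3).

```python
def condition_21a_partial_sum(n, power):
    """Partial sum over 1 <= i, j <= n of
    E^2[xi_i zeta_j] / (lambda_Xi^2 lambda_Yj), with lambda_Xi = 1/i^2,
    lambda_Yj = 1/j^2, E[xi_i zeta_j] = 1 / ((i+1)^power (j+1)^power).
    The double sum factorizes into a product of two single sums."""
    sum_i = sum(i**4 / (i + 1) ** (2 * power) for i in range(1, n + 1))
    sum_j = sum(j**2 / (j + 1) ** (2 * power) for j in range(1, n + 1))
    return sum_i * sum_j

for n in (100, 1000, 10000):
    print(n,
          condition_21a_partial_sum(n, power=2),   # Example 4.4: keeps growing
          condition_21a_partial_sum(n, power=3))   # Example 4.7: stabilizes
```

For power 2 the partial sums grow roughly in proportion to n, whereas for power 3 they settle quickly, consistent with the bounds given in the two examples.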

Canonical correlations and weight functions exist under Condition 4.5, according to the

following central result.

Theorem 4.8 Assume that $L^2$-processes X and Y satisfy Condition 4.5. Let $(\lambda_i, q_i)$, $i \ge 1$, be the i-th non-zero eigenvalue and eigenfunction of the operator R0 = R*R, where $R = R_{XX}^{-1/2}R_{XY}R_{YY}^{-1/2}$ (see (10)). Defining $p_i = Rq_i/\sqrt{\lambda_i}$, the following holds:

(a) $p_i \in F_{XX}$ and $q_i \in F_{YY}$, $i \ge 1$;

(b) $\rho_i = \sqrt{\lambda_i}$, $u_i = R_{XX}^{-1/2}p_i$, and $v_i = R_{YY}^{-1/2}q_i$;

(c) $\mathrm{Corr}(U_i, U_j) = \langle u_i, R_{XX}u_j\rangle = \langle p_i, p_j\rangle = \delta_{ij}$;

(d) $\mathrm{Corr}(V_i, V_j) = \langle v_i, R_{YY}v_j\rangle = \langle q_i, q_j\rangle = \delta_{ij}$;

(e) $\mathrm{Corr}(U_i, V_j) = \langle u_i, R_{XY}v_j\rangle = \langle p_i, Rq_j\rangle = \rho_i\delta_{ij}$.

According to Theorem 4.8, the usual properties of canonical correlations and canonical

weights known from multivariate analysis (see (17)) extend to functional canonical analysis

for L2-processes, if Condition 4.5 is satisfied.
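In the finite-dimensional case, where all inverses exist, relations (16), (17) and properties (c) to (e) can be verified directly with matrices. The following is a hedged sketch using NumPy, with a randomly generated joint covariance matrix standing in for the covariance structure of the coefficient vectors; the function names and the test matrix are illustrative assumptions, not part of the paper.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

def canonical_components(Rxx, Rxy, Ryy):
    """Return (rho, U, V): canonical correlations and weight vectors
    (as columns), following (10), (16) and (17)."""
    Rxx_is, Ryy_is = inv_sqrt(Rxx), inv_sqrt(Ryy)
    R = Rxx_is @ Rxy @ Ryy_is                  # cross-correlation matrix, cf. (10)
    lam, Q = np.linalg.eigh(R.T @ R)           # eigenpairs of R0 = R*R
    order = np.argsort(lam)[::-1]              # sort eigenvalues in decreasing order
    lam, Q = lam[order], Q[:, order]
    keep = lam > 1e-12                         # retain non-zero eigenvalues only
    lam, Q = lam[keep], Q[:, keep]
    P = R @ Q / np.sqrt(lam)                   # p_i = R q_i / sqrt(lambda_i), cf. (16)
    return np.sqrt(lam), Rxx_is @ P, Ryy_is @ Q   # cf. (17)

rng = np.random.default_rng(0)
k1, k2 = 4, 3
A = rng.standard_normal((k1 + k2, k1 + k2))
S = A @ A.T + 0.5 * np.eye(k1 + k2)            # joint covariance (illustrative)
Rxx, Rxy, Ryy = S[:k1, :k1], S[:k1, k1:], S[k1:, k1:]

rho, U, V = canonical_components(Rxx, Rxy, Ryy)
print("canonical correlations:", rho)
print("U' Rxx U:\n", np.round(U.T @ Rxx @ U, 6))   # identity, cf. (c)
print("U' Rxy V:\n", np.round(U.T @ Rxy @ V, 6))   # diag(rho), cf. (e)
```

In infinite dimensions the same formulas require Condition 4.5, since the inverse square roots are only defined on the subspaces F_XX and F_YY.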

5. Canonical decomposition for pairs of random processes

Functional principal components analysis is based on the expansion of an L2-process in

terms of the eigenfunctions of its covariance operator, according to the Karhunen-Loève

Theorem, extending principal components in multivariate analysis to function space. A

similar extension of the finite-dimensional case to the case of L2-processes was considered

by Leurgans et al. (1993), Section 4.3, motivated by expanding (X, Y) as $(\sum_i U_iu_i, \sum_i V_iv_i)$, in

terms of the canonical weight functions ui, vi. According to Theorem 5.1 below, an expansion

of this type cannot be expected to converge in L2.

An expansion of pairs of processes in terms of canonical weight functions is desirable,

as it provides a natural approximation and description of pairs of square integrable processes.

We discuss here the feasibility of this expansion for the functional case.

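Since $R_{XX}^{1/2}u_i = p_i$, $R_{YY}^{1/2}v_i = q_i$, we have

$$R_{XX}\sum_i U_iu_i = R_{XX}\sum_i \langle R_{XX}^{-1/2}p_i, X\rangle\,R_{XX}^{-1/2}p_i = R_{XX}^{1/2}\sum_i \langle p_i, R_{XX}^{-1/2}X\rangle\,p_i = X$$

and $R_{YY}\sum_i V_iv_i = Y$ for the finite dimensional case. This heuristic leads to the following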

extension for the infinite-dimensional case:

Theorem 5.1 (Canonical Decomposition) Let $\{(\rho_i, u_i, v_i, U_i, V_i),\ i \ge 1\}$ be the canonical components of X and Y with $u_i \in L^2(T_1)$ and $v_i \in L^2(T_2)$ for all i, and $p_i$, $q_i$ defined as in (16). For $k \ge 1$, define projections $P_k = P_{\mathrm{span}\{p_1, \ldots, p_k\}}$ and $Q_k = P_{\mathrm{span}\{q_1, \ldots, q_k\}}$, where span{p1,..., pk} and span{q1,..., qk} are the closed linear spaces generated by {p1,..., pk} and {q1,..., qk}. Define $\overline{F}_{XX}$, $\overline{F}_{YY}$ to be the closures of subspaces $F_{XX}$, $F_{YY}$ (see (11)) and $P_{\overline{F}_{XX}}$, $P_{\overline{F}_{YY}}$ to be the projection operators to the subspaces. Then

(a) $X = X_{c,k} + X_{c,k}^{\perp}$, $Y = Y_{c,k} + Y_{c,k}^{\perp}$, where

$$X_{c,k} = R_{XX}^{1/2}P_kR_{XX}^{-1/2}X = \sum_{i=1}^{k} U_iR_{XX}u_i, \quad \text{and} \quad X_{c,k}^{\perp} = (P_{\overline{F}_{XX}} - R_{XX}^{1/2}P_kR_{XX}^{-1/2})X,$$

$$Y_{c,k} = R_{YY}^{1/2}Q_kR_{YY}^{-1/2}Y = \sum_{i=1}^{k} V_iR_{YY}v_i, \quad \text{and} \quad Y_{c,k}^{\perp} = (P_{\overline{F}_{YY}} - R_{YY}^{1/2}Q_kR_{YY}^{-1/2})Y;$$

(b) $(X_{c,k}, Y_{c,k})$ and $(X, Y)$ share the same first k canonical components;

(c) The j-th canonical components of $(X_{c,k}^{\perp}, Y_{c,k}^{\perp})$ are the same as the (j+k)-th canonical components of $(X, Y)$, $j \ge 1$;

(d) $(X_{c,k}, Y_{c,k})$ and $(X_{c,k}^{\perp}, Y_{c,k}^{\perp})$ are uncorrelated, that is,

$$\mathrm{Corr}(X_{c,k}, X_{c,k}^{\perp}) = 0, \quad \mathrm{Corr}(Y_{c,k}, Y_{c,k}^{\perp}) = 0, \quad \mathrm{Corr}(X_{c,k}, Y_{c,k}^{\perp}) = 0, \quad \mathrm{Corr}(Y_{c,k}, X_{c,k}^{\perp}) = 0;$$

(e) If $k \to \infty$, and $P_{\infty} = P_{\mathrm{span}\{p_i :\ i \ge 1\}}$ (with $Q_{\infty}$ defined analogously from the $q_i$), then

$$X = X_{c,\infty} + X_{c,\infty}^{\perp} = \sum_{i=1}^{\infty} U_iR_{XX}u_i + (P_{\overline{F}_{XX}} - R_{XX}^{1/2}P_{\infty}R_{XX}^{-1/2})X,$$

$$Y = Y_{c,\infty} + Y_{c,\infty}^{\perp} = \sum_{i=1}^{\infty} V_iR_{YY}v_i + (P_{\overline{F}_{YY}} - R_{YY}^{1/2}Q_{\infty}R_{YY}^{-1/2})Y.$$

Further, $(X_{c,\infty}, Y_{c,\infty})$ and $(X, Y)$ share the same canonical components, $\mathrm{Corr}(X_{c,\infty}^{\perp}, Y_{c,\infty}^{\perp}) = 0$, and $(X_{c,\infty}, Y_{c,\infty})$ and $(X_{c,\infty}^{\perp}, Y_{c,\infty}^{\perp})$ are uncorrelated;

(f) If span$\{p_i ;\ i \ge 1\} = \overline{F}_{XX}$ and span$\{q_i ;\ i \ge 1\} = \overline{F}_{YY}$, then $X = X_{c,\infty}$ and $Y = Y_{c,\infty}$.

The proof is in Section 6.

The canonical decomposition introduced in Theorem 5.1 may be applied to derive

estimation procedures for functional data analysis of pairs of curve data by approximating

pairs of processes with a few significant canonical components, similar to approximating

random vectors or random functions through a few principal components. The proposed

canonical decomposition may also be useful in functional linear model settings. Such

applications are described in He, Müller, and Wang (2001).

We conclude this section with two examples of applications of this canonical

decomposition. The first example concerns the construction of bivariate processes with

prescribed canonical components. Such constructions are useful for the study of functional

canonical correlation and dependence between stochastic processes, and especially for Monte

Carlo simulations.

Example 5.2 Let $\{p_i\} \subset L^2(T_1)$, $\{q_i\} \subset L^2(T_2)$ be two given orthonormal systems. Assume $\{U_i\}$ and $\{V_i\}$ are two sets of independent random variables, such that

$$EU_i = 0,\ EU_i^2 = 1,\ EV_i = 0, \text{ and } EV_i^2 = 1, \quad \text{for } i \ge 1,$$

and

$$EU_iV_j = \rho_i\delta_{ij}, \text{ for } i, j \ge 1, \quad \text{where } \sum_i \rho_i^2 < \infty.$$

Further, let $\{\lambda_{1i}\}$ and $\{\lambda_{2i}\}$ be two positive and decreasing sequences such that $\sum_i \lambda_{1i} < \infty$ and $\sum_i \lambda_{2i} < \infty$. We now obtain random processes X and Y which have the given canonical components $(\rho_i, U_i, V_i)$, for $i \ge 1$, as well as covariance operators $R_{11} = \sum_i \lambda_{1i}\,p_i \otimes p_i$ and $R_{22} = \sum_i \lambda_{2i}\,q_i \otimes q_i$, by setting

(23) $$X = \sum_i U_i\sqrt{\lambda_{1i}}\,p_i, \quad Y = \sum_i V_i\sqrt{\lambda_{2i}}\,q_i.$$

In order to verify this, observe

$$r_{XX}(s, t) = E[X(s)X(t)] = \sum_{i,j} E[U_iU_j]\sqrt{\lambda_{1i}\lambda_{1j}}\,p_i(s)p_j(t) = \sum_i \lambda_{1i}\,p_i(s)p_i(t),$$

and conclude that $R_{XX} = R_{11}$. Similarly, $R_{YY} = R_{22}$. Furthermore,

$$r_{XY}(s, t) = E[X(s)Y(t)] = \sum_i \rho_i\sqrt{\lambda_{1i}\lambda_{2i}}\,p_i(s)q_i(t),$$

leading to

$$R_{XY} = R_{XX}^{1/2}\Big[\sum_i \rho_i\,(p_i \otimes q_i)\Big]R_{YY}^{1/2} = R_{XX}^{1/2}\,R\,R_{YY}^{1/2},$$

where

$$R = \sum_i \rho_i\,(p_i \otimes q_i).$$

This shows that the processes (23) have the required properties. Obviously, $p_i \in H_1$ and $q_i \in H_2$, $i \ge 1$, and thus the canonical weight functions are

$$u_i = R_{11}^{-1/2}p_i = p_i/\sqrt{\lambda_{1i}}, \quad v_i = R_{22}^{-1/2}q_i = q_i/\sqrt{\lambda_{2i}}.$$

Hence, comparing (23) and Theorem 5.1 (e), we have

$$X_{c,\infty} = X \quad \text{and} \quad Y_{c,\infty} = Y.$$

Finally, we note that expansion (23) coincides with Karhunen-Loève expansions (14) for processes X, Y with $\xi_i = U_i\sqrt{\lambda_{1i}}$, and $\zeta_i = V_i\sqrt{\lambda_{2i}}$, for the case where $p_i$, $q_i$ are the eigenfunctions of X, Y, respectively.
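Since the construction (23) is intended for Monte Carlo work, a small simulation sketch may be helpful. Everything specific below is an assumption made for illustration: three components, Gaussian scores, the orthonormal systems √2 sin(iπt) and √2 cos(iπt) on [0, 1], the particular values of ρ and the eigenvalues, and a midpoint grid for the inner products. The sketch generates curves according to (23), recovers the canonical variables through the weight functions u_i = p_i/√λ_1i and v_i = q_i/√λ_2i, and compares sample correlations with the prescribed ρ_i.

```python
import numpy as np

rng = np.random.default_rng(1)
n_grid, n_samples = 100, 20_000
t = (np.arange(n_grid) + 0.5) / n_grid             # midpoint grid on [0, 1]
h = 1.0 / n_grid                                   # quadrature weight

rho = np.array([0.9, 0.5, 0.2])                    # prescribed canonical correlations
lam1 = np.array([1.0, 0.5, 0.25])                  # eigenvalues of R11 (assumed)
lam2 = np.array([2.0, 0.4, 0.1])                   # eigenvalues of R22 (assumed)
idx = np.arange(1, 4)
p = np.sqrt(2) * np.sin(np.outer(idx, np.pi * t))  # rows p_i, orthonormal on [0, 1]
q = np.sqrt(2) * np.cos(np.outer(idx, np.pi * t))  # rows q_i, orthonormal on [0, 1]

# Scores with E U_i V_j = rho_i * delta_ij and unit variances.
U = rng.standard_normal((n_samples, 3))
W = rng.standard_normal((n_samples, 3))
V = rho * U + np.sqrt(1 - rho**2) * W

X = (U * np.sqrt(lam1)) @ p                        # expansion (23), one curve per row
Y = (V * np.sqrt(lam2)) @ q

u = p / np.sqrt(lam1)[:, None]                     # weight functions u_i
v = q / np.sqrt(lam2)[:, None]                     # weight functions v_i
U_hat = (X @ u.T) * h                              # U_i = <u_i, X> by midpoint rule
V_hat = (Y @ v.T) * h                              # V_i = <v_i, Y> by midpoint rule

sample_rho = np.array(
    [np.corrcoef(U_hat[:, i], V_hat[:, i])[0, 1] for i in range(3)]
)
print("prescribed:", rho)
print("sample    :", np.round(sample_rho, 3))
```

On the midpoint grid the chosen sine and cosine systems are exactly orthonormal with respect to the quadrature rule, so the recovered scores agree with the generated ones up to rounding, and the sample correlations differ from ρ only by Monte Carlo error.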

Next, we consider processes with finite basis expansions. This illustrates the finite-

dimensional special case of the general result.

Example 5.3 (Canonical Decomposition for Processes with Finite Base Functions) Let

X and Y be a pair of random processes which have a representation (14) with finitely many

basis functions. Then the canonical decomposition for X and Y is equivalent to the canonical


decomposition for the corresponding vectors of random coefficients through

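$$X(t) = X_{c,k}(t) + X_{c,k}^{\perp}(t) = \xi_{c,k}^T\,\theta(t) + \xi_{c,k}^{\perp T}\,\theta(t),$$

$$Y(t) = Y_{c,k}(t) + Y_{c,k}^{\perp}(t) = \zeta_{c,k}^T\,\varphi(t) + \zeta_{c,k}^{\perp T}\,\varphi(t),$$

for $1 \le k \le \min(k_1, k_2)$. Here

$$\xi = \xi_{c,k} + \xi_{c,k}^{\perp} \quad \text{and} \quad \zeta = \zeta_{c,k} + \zeta_{c,k}^{\perp}$$

are the canonical decompositions for $\xi$ and $\zeta$. Letting $R_{11} = \mathrm{Cov}(\xi_{c,k})$, $R_{22} = \mathrm{Cov}(\zeta_{c,k})$ and denoting the canonical components of the random vectors $(\xi_{c,k}, \zeta_{c,k})$ by $(\rho_i, u_i, v_i)$, i = 1, ..., k, the corresponding approximations for X, Y become

$$X_{c,k}(t) = \sum_{i=1}^{k} (u_i^T\xi)(u_i^TR_{11}\theta)(t), \quad \text{and} \quad Y_{c,k}(t) = \sum_{i=1}^{k} (v_i^T\zeta)(v_i^TR_{22}\varphi)(t).$$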

6. Auxiliary results and proofs

We first establish several auxiliary results that will be needed for the proofs in this

section. The definition and properties of Hilbert-Schmidt operators are included in the

Appendix.

Proposition 6.1 If processes X and Y satisfy (15), then there exists a Hilbert-Schmidt

operator $A: L^2(T_2) \to L^2(T_1)$ such that $A|_{F_{YY}} = R$ and $A^*|_{F_{XX}} = R^*$.

Proof of Proposition 6.1 Define an operator by the infinite matrix $(r_{ij})_{i,j \ge 1}$ on the closed subspaces span$\{\theta_i\}$ and span$\{\varphi_i\}$, with

(24) $$r_{ij} = \frac{E[\xi_i\zeta_j]}{\sqrt{E[\xi_i^2]\,E[\zeta_j^2]}}, \quad \text{for } i, j \ge 1.$$

The inequality (15) implies that A is a Hilbert-Schmidt operator from the closed subspace span$\{\theta_i : i \ge 1\} \subset L^2(T_1)$ into $L^2(T_2)$. We extend dom(A) to $L^2(T_1)$ such that $A|_{\mathrm{span}\{\theta_i,\ i \ge 1\}^{\perp}} = 0$, thus ensuring that the extended operator A is a Hilbert-Schmidt operator on $L^2(T_1)$. The proof for A* is similar. □

Proposition 6.2

(a) Let Xn and Yn be versions of processes X and Y which are truncated at the n-th

component, that is

(25) $$X_n(s) = \sum_{i=1}^{n} \xi_i\theta_i(s),\ s \in T_1, \quad \text{and} \quad Y_n(s) = \sum_{i=1}^{n} \zeta_i\varphi_i(s),\ s \in T_2,$$

and let $R_n = (r_{ij})_{i,j = 1, \ldots, n}$ be the cross-correlation matrix for $X_n$ and $Y_n$. If $\lim_{n \to \infty} \sum_{i,j=1}^{n} r_{ij}^2 < \infty$, then $R_n \to R$, with respect to the linear operator norm, where R is a Hilbert-Schmidt operator.

(b) A process X has a finite basis expansion if X = Xn (25) for some n. If X or Y have

finite basis expansions, then R is a Hilbert-Schmidt operator.

(c) If (15) holds, then R* , RR*, R*R are Hilbert-Schmidt operators.

Proof of Proposition 6.2 (a), (b), and (c) can be verified by checking inequality (15) of

Proposition 4.2. �

Proposition 6.3 Assume inequality (15) holds.

(a) Let $\{(\lambda_i, q_i) ; i \ge 1\}$ be the eigenvalues and eigenfunctions for the operator $R_0 = R^*R$. Then $\{(\lambda_i, p_i) ; i \ge 1\}$, defined by (16), is the set of eigenvalues and eigenfunctions for the operator $RR^*$ and is an orthonormal system in $F_{XX}$;

(b) Let $\rho_i = \sqrt{\lambda_i}$, then $Rq_i = \rho_i p_i$, and $R^*p_i = \rho_i q_i$.

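Proof of Proposition 6.3

(a)
$$RR^*(p_i) = RR^*\big((1/\sqrt{\lambda_i})\,Rq_i\big) = (1/\sqrt{\lambda_i})\,R(R_0 q_i) = (1/\sqrt{\lambda_i})\,R(\lambda_i q_i) = \lambda_i\big((1/\sqrt{\lambda_i})\,Rq_i\big) = \lambda_i p_i.$$

Therefore for any $(\lambda_i, q_i)$, there exists a corresponding eigenvalue and eigenfunction $(\lambda_i, p_i)$ of $RR^*$, and vice versa. The $\{p_i\}$ are orthonormal, as

$$\langle p_i, p_j\rangle = \big\langle (1/\sqrt{\lambda_i})\,Rq_i,\ (1/\sqrt{\lambda_j})\,Rq_j \big\rangle = (1/\sqrt{\lambda_i\lambda_j})\,\langle R^*Rq_i, q_j\rangle = (1/\sqrt{\lambda_i\lambda_j})\,\langle \lambda_i q_i, q_j\rangle = (1/\sqrt{\lambda_i\lambda_j})\,\lambda_i\delta_{ij} = \delta_{ij}.$$

(b) $R^*p_i = R^*(Rq_i/\rho_i) = (\rho_i^2/\rho_i)\,q_i = \rho_i q_i$ and $Rq_i = R(R^*p_i/\rho_i) = \rho_i^2 p_i/\rho_i = \rho_i p_i$. $\square$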
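As a finite-dimensional illustration of Proposition 6.3, the following sketch replaces the operator $R$ by an arbitrary real matrix (an assumption made purely for illustration; the random matrix, its size and the variable names are not part of the paper). It constructs $p_i = Rq_i/\sqrt{\lambda_i}$ from the eigenpairs of $R^*R$, as in the proof above, and leaves the identities of parts (a) and (b) to be checked numerically.

```python
import numpy as np

# Hypothetical stand-in for R: a 5 x 3 real matrix, so that R* is the transpose.
rng = np.random.default_rng(1)
R = rng.standard_normal((5, 3))

# Eigenpairs (lambda_i, q_i) of R0 = R*R, sorted in decreasing order.
lam, Q = np.linalg.eigh(R.T @ R)
lam, Q = lam[::-1], Q[:, ::-1]

# p_i = R q_i / sqrt(lambda_i), column by column, and rho_i = sqrt(lambda_i).
rho = np.sqrt(lam)
P = (R @ Q) / rho

# Quantities appearing in Proposition 6.3.
gram_P = P.T @ P              # should be the identity (orthonormality of p_i)
RRt_P = R @ R.T @ P           # should equal P scaled columnwise by lambda_i
Rt_P = R.T @ P                # should equal Q scaled columnwise by rho_i
```

In this matrix setting the construction is the singular value decomposition of $R$, with $\rho_i$ the singular values and $p_i$, $q_i$ the left and right singular vectors.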

Proposition 6.4 (Spectral decomposition of R) Let $R = R_{XX}^{-1/2} R_{XY} R_{YY}^{-1/2}$ be the correlation operator for $X$ and $Y$ which satisfy inequality (15), and let $\rho_i^2$, $p_i$ and $q_i$ be the $i$-th eigenvalue and the corresponding eigenvectors of $RR^*$ and $R^*R$, respectively. Then, the integral kernel of $R$ (see (A1) in Appendix) is decomposed as

$$\mathrm{KER}(R)(s, t) = \sum_{i=1}^{\infty} \rho_i\, p_i(s)\, q_i(t).$$

Proof of Proposition 6.4 From Proposition 6.1, $R = R_{XX}^{-1/2} R_{XY} R_{YY}^{-1/2} : F_{YY} \to F_{XX}$ is a Hilbert-Schmidt operator, and therefore has an $L^2$ integral kernel. Defining an integral operator $A: F_{YY} \to F_{XX}$, with integral kernel $\mathrm{KER}(A) = \sum_{i=1}^{\infty} \rho_i\, p_i(s)\, q_i(t)$, we show that $A = R$.

First, for any $v \in \mathrm{span}\{q_i,\, i \ge 1\}^{\perp}$,

$$(Av)(s) = \int \mathrm{KER}(A)(s,t)\, v(t)\, dt = \sum_{i=1}^{\infty} \rho_i\, p_i(s) \int q_i(t)\, v(t)\, dt = 0.$$

Furthermore,

$$\|Rv\|^2 = \langle Rv, Rv\rangle = \langle v, R^*Rv\rangle = \langle v, R_0 v\rangle = \Big\langle v, \sum_{i=1}^{\infty} \lambda_i P_i v \Big\rangle,$$

where $R_0$ is a compact self-adjoint operator with spectral decomposition $R_0 = \sum_i \lambda_i P_i$. Here $P_i$ is the projection operator from $F_{YY}$ to the eigenspace defined by span{eigenfunctions corresponding to $\lambda_i$}. Thus, $P_i v = 0$, $i \ge 1$. This implies $Rv = 0$. Next, for any $j \ge 1$,

$$(Aq_j)(s) = \sum_{i=1}^{\infty} \rho_i\, p_i(s) \int q_i(t)\, q_j(t)\, dt = \rho_j\, p_j(s) = (Rq_j)(s).$$

This shows that for $v \in \mathrm{span}\{q_i,\, i \ge 1\}$, one has $Av = Rv$, which concludes the proof. $\square$

Proposition 6.5 Assume $L^2$-processes $X$ and $Y$ satisfy Condition 4.5. Then, $p_i \in F_{XX}$ and $q_i \in F_{YY}$, for $i \ge 1$.

Proof of Proposition 6.5 Write

$$R = \sum_{i,j=1}^{\infty} r_{ij}\, \theta_i \otimes \phi_j,$$

where $r_{ij} = E[\xi_i \zeta_j]\,(\lambda_{Xi}\lambda_{Yj})^{-1/2}$, for $i, j \ge 1$. Since Condition 4.5 implies (15), $R$ is a Hilbert-Schmidt operator and can be written as

$$R = \sum_{i=1}^{\infty} \rho_i\, p_i \otimes q_i,$$

where $(\rho_i^2, p_i)$ and $(\rho_i^2, q_i)$ are the eigenvalues and eigenvectors of $RR^*$ and $R^*R$, respectively (Conway, 1985). Then, for any fixed $k \ge 1$,

$$Rq_k = \rho_k p_k = \sum_{i,j=1}^{\infty} r_{ij}\, \langle q_k, \phi_j\rangle\, \theta_i.$$

From the definition of $F_{XX}$,

$$p_k \in F_{XX} \quad \text{iff} \quad \sum_{i=1}^{\infty} \lambda_{Xi}^{-1} \Big( \sum_{j=1}^{\infty} r_{ij}\, \langle q_k, \phi_j\rangle \Big)^2 < \infty.$$

The right hand side is true because

$$\sum_{i=1}^{\infty} \lambda_{Xi}^{-1} \Big( \sum_{j=1}^{\infty} r_{ij}\, \langle q_k, \phi_j\rangle \Big)^2 = \sum_{i=1}^{\infty} \lambda_{Xi}^{-2} \Big( \sum_{j=1}^{\infty} \lambda_{Yj}^{-1/2}\, E[\xi_i \zeta_j]\, \langle q_k, \phi_j\rangle \Big)^2 \le \sum_{i=1}^{\infty} \lambda_{Xi}^{-2} \Big( \sum_{j=1}^{\infty} \lambda_{Yj}^{-1}\, E^2[\xi_i \zeta_j] \sum_{j=1}^{\infty} \langle q_k, \phi_j\rangle^2 \Big) < \infty.$$

The Cauchy inequality and the equality $\sum_j \langle q_k, \phi_j\rangle^2 = 1$ are used in the derivations of the last two inequalities. By using Condition 4.5 (a), we show $p_k \in F_{XX}$. Similarly we prove $q_k \in F_{YY}$ when Condition 4.5 (b) is satisfied. $\square$

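Proof of Theorem 2.1 The covariance function for $X$ is

$$r_{XX}(s, t) = E[X(s)X(t)] = \boldsymbol{\theta}^T(s)\, E[\boldsymbol{\xi}\boldsymbol{\xi}^T]\, \boldsymbol{\theta}(t) = \boldsymbol{\theta}^T(s)\, R_{11}\, \boldsymbol{\theta}(t).$$

Similarly, we have $r_{YY}(s, t) = \boldsymbol{\phi}^T(s)\, R_{22}\, \boldsymbol{\phi}(t)$, and $r_{XY}(s, t) = \boldsymbol{\theta}^T(s)\, R_{12}\, \boldsymbol{\phi}(t)$. Because the covariance matrices $R_{11}$ and $R_{22}$ for the random coefficient vectors are finite dimensional, we may assume, without loss of generality, that they are full rank matrices. Any given $u \in L^2(T_1)$ can be written as the sum of two components, with $u_0 \in \mathrm{span}\{\theta_1, \ldots, \theta_{k_1}\}$ and $u_1 \in \mathrm{span}\{\theta_1, \ldots, \theta_{k_1}\}^{\perp}$, such that

$$u(t) = u_0(t) + u_1(t) = \mathbf{u}^T\boldsymbol{\theta}(t) + u_1(t), \quad \text{where } \mathbf{u} \in \mathbb{R}^{k_1}.$$

Then,

$$\langle u, X\rangle = \langle u_0 + u_1, \boldsymbol{\xi}^T\boldsymbol{\theta}\rangle = \langle \mathbf{u}^T\boldsymbol{\theta}, \boldsymbol{\xi}^T\boldsymbol{\theta}\rangle = \mathbf{u}^T \langle \boldsymbol{\theta}, \boldsymbol{\theta}^T\rangle\, \boldsymbol{\xi} = \mathbf{u}^T I_{k_1} \boldsymbol{\xi} = \mathbf{u}^T\boldsymbol{\xi},$$

where $I_{k_1}$ is the $k_1 \times k_1$ identity matrix. Similarly, for any given $v \in L^2(T_2)$, $\langle v, Y\rangle = \mathbf{v}^T\boldsymbol{\zeta}$, where $\mathbf{v} \in \mathbb{R}^{k_2}$. Hence,

$$E[\langle u, X\rangle] = \mathbf{u}^T E[\boldsymbol{\xi}] = 0,$$
$$E[\langle v, Y\rangle] = \mathbf{v}^T E[\boldsymbol{\zeta}] = 0,$$
$$\mathrm{Var}(\langle u, X\rangle) = E[\langle u, X\rangle^2] = \mathbf{u}^T E[\boldsymbol{\xi}\boldsymbol{\xi}^T]\, \mathbf{u} = \mathbf{u}^T R_{11} \mathbf{u},$$
$$\mathrm{Var}(\langle v, Y\rangle) = E[\langle v, Y\rangle^2] = \mathbf{v}^T E[\boldsymbol{\zeta}\boldsymbol{\zeta}^T]\, \mathbf{v} = \mathbf{v}^T R_{22} \mathbf{v},$$
$$\mathrm{Cov}[\langle u, X\rangle, \langle v, Y\rangle] = E[\langle u, X\rangle \langle v, Y\rangle] = \mathbf{u}^T E[\boldsymbol{\xi}\boldsymbol{\zeta}^T]\, \mathbf{v} = \mathbf{u}^T R_{12} \mathbf{v}.$$

Consider the first canonical correlation for $X$ and $Y$,

$$\rho_1 = \sup_{u \in L^2(T_1),\, v \in L^2(T_2)} \mathrm{Cov}(\langle u, X\rangle, \langle v, Y\rangle) =: \mathrm{Cov}(\langle u_1, X\rangle, \langle v_1, Y\rangle),$$

subject to

$$\mathrm{Var}(\langle u, X\rangle) = 1, \quad \text{and} \quad \mathrm{Var}(\langle v, Y\rangle) = 1.$$

This is seen to be equivalent to

$$\rho_1 = \sup_{\mathbf{u} \in \mathbb{R}^{k_1},\, \mathbf{v} \in \mathbb{R}^{k_2}} \mathbf{u}^T R_{12} \mathbf{v} =: \mathbf{u}_1^T R_{12} \mathbf{v}_1,$$

subject to

$$\mathbf{u}^T R_{11} \mathbf{u} = 1, \quad \mathbf{v}^T R_{22} \mathbf{v} = 1.$$

This is exactly the definition of the first canonical correlation between the random vectors $\boldsymbol{\xi}$ and $\boldsymbol{\zeta}$. On the other hand, if we start with the first canonical components for $\boldsymbol{\xi}$ and $\boldsymbol{\zeta}$, given by $(\rho_1, \mathbf{u}_1, \mathbf{v}_1)$, we obtain the first canonical components for $X(t)$ and $Y(t)$ by

$$(\rho_1, u_1(t), v_1(t)) = (\rho_1, \mathbf{u}_1^T\boldsymbol{\theta}(t), \mathbf{v}_1^T\boldsymbol{\phi}(t)).$$

Similarly, we can extend this to the second canonical component, and so on. $\square$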
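The reduction to the multivariate problem can be sketched numerically. The code below is a minimal illustration, assuming a hypothetical joint covariance matrix for the coefficient vectors $\boldsymbol{\xi}$ and $\boldsymbol{\zeta}$ (the dimensions, the random seed and the function names are illustrative choices, not taken from the paper). It computes the canonical correlations and weight vectors from $R_{11}$, $R_{12}$, $R_{22}$ through the singular value decomposition of $R_{11}^{-1/2} R_{12} R_{22}^{-1/2}$, which is the standard route in multivariate canonical correlation analysis.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T

def canonical_correlations(R11, R12, R22):
    """Canonical correlations rho and weight vectors (columns of U and V)."""
    R11_mh, R22_mh = inv_sqrt(R11), inv_sqrt(R22)
    P, rho, Qt = np.linalg.svd(R11_mh @ R12 @ R22_mh, full_matrices=False)
    U = R11_mh @ P       # u_i = R11^{-1/2} p_i
    V = R22_mh @ Qt.T    # v_i = R22^{-1/2} q_i
    return rho, U, V

# Hypothetical joint covariance of (xi, zeta) with k1 = 3 and k2 = 2.
rng = np.random.default_rng(0)
k1, k2 = 3, 2
G = rng.standard_normal((k1 + k2, k1 + k2))
S = G @ G.T + np.eye(k1 + k2)
R11, R12, R22 = S[:k1, :k1], S[:k1, k1:], S[k1:, k1:]

rho, U, V = canonical_correlations(R11, R12, R22)
```

Under the finite basis expansion, the weight functions would then be obtained as $u_i(t) = \mathbf{u}_i^T\boldsymbol{\theta}(t)$ and $v_i(t) = \mathbf{v}_i^T\boldsymbol{\phi}(t)$, with $\mathbf{u}_i$ and $\mathbf{v}_i$ the columns of `U` and `V`.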

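Proof of Proposition 4.2 To prove the first inclusion, i.e., $R_{XY} R_{YY}^{-1/2}(F_{YY}) \subset F_{XX}$, we note that $E[\xi_i^2] = \lambda_{Xi}$, and $E[\zeta_j^2] = \lambda_{Yj}$. From (12) and (14),

$$R_{YY}^{-1/2} v = \sum_{i=1}^{\infty} \lambda_{Yi}^{-1/2}\, \langle v, \phi_i\rangle\, \phi_i, \quad \text{for } v \in F_{YY}, \text{ and}$$

$$R_{XY} R_{YY}^{-1/2} v = \sum_{i=1}^{\infty} \Big( \sum_{j=1}^{\infty} \lambda_{Yj}^{-1/2}\, E[\xi_i \zeta_j]\, \langle v, \phi_j\rangle \Big) \theta_i, \quad \text{for } v \in F_{YY}.$$

Thus if (15) holds, then

$$\sum_i \lambda_{Xi}^{-1} \Big( \sum_j \lambda_{Yj}^{-1/2}\, E[\xi_i \zeta_j]\, \langle v, \phi_j\rangle \Big)^2 \le \sum_i \lambda_{Xi}^{-1} \Big( \sum_j \lambda_{Yj}^{-1}\, (E[\xi_i \zeta_j])^2\, |\langle v, \phi_j\rangle|^2 \Big)$$
$$\le \sum_{i,j} \lambda_{Xi}^{-1} \lambda_{Yj}^{-1}\, (E[\xi_i \zeta_j])^2 \sum_j |\langle v, \phi_j\rangle|^2$$
$$\le \sum_{i,j} \mathrm{Corr}^2(\xi_i, \zeta_j)\, \|v\|^2 < \infty.$$

The second inclusion follows similarly. $\square$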

Proof for Example 4.3 Set

$$a_i^2 = \frac{1}{i^2}, \quad c_i = \frac{1}{(i+1)^2}, \quad \text{for } i = 1, 2, \ldots$$

and define the operators

$$R_1 = \sum_{i=1}^{\infty} a_i^2\, \theta_i \otimes \theta_i, \quad R_2 = \sum_{i=1}^{\infty} a_i^2\, \phi_i \otimes \phi_i, \quad \text{and} \quad R_3 = \sum_{i,j=1}^{\infty} c_i c_j\, \theta_i \otimes \phi_j.$$

The property that $\sum_i a_i^2 < \infty$ implies that $R_1$ and $R_2$ are self-adjoint and positive definite operators, and the property that $\sum_{i,j} c_i^2 c_j^2 < \infty$ implies that $R_3$ is a Hilbert-Schmidt operator. By using (19), we show that $R_1 = R_{XX}$, $R_2 = R_{YY}$. It remains to show that $R_3$ can be interpreted as a cross-covariance operator $R_{XY}$. For any $n \ge 1$, define

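$$A_k = \mathrm{Diag}(a_1^2, \ldots, a_k^2), \quad B_{nk} = (c_i c_j)_{i = 1, \ldots, n,\ j = 1, \ldots, k} = C_n C_k^T, \quad \text{for } k = 1, \ldots, n,$$

where $C_l = (c_1, \ldots, c_l)^T$, and

$$M_{nk} = \begin{pmatrix} A_n & B_{nk} \\ B_{nk}^T & A_k \end{pmatrix} = \begin{pmatrix} A_n & C_n C_k^T \\ C_k C_n^T & A_k \end{pmatrix}, \quad \text{for } k = 1, \ldots, n.$$

Then

$$M_{nn} = \begin{pmatrix} A_n & C_n C_n^T \\ C_n C_n^T & A_n \end{pmatrix} \to M_{\infty} = \begin{pmatrix} R_1 & R_3 \\ R_3^* & R_2 \end{pmatrix}, \quad \text{as } n \to \infty.$$

We want to show that, for $n \ge 1$, $M_{nn}$ is positive definite, so that $M_{\infty}$ is also positive definite, and therefore defines a covariance operator for the pair of $L^2$-processes $X$ and $Y$ such that $R_3 = R_{XY}$. It will suffice to show that $\mathrm{Det}(M_{nk}) > 0$, for $k = 1, \ldots, n$. As for any $n \ge 1$,

$$\sum_{i=1}^{n} \frac{c_i^2}{a_i^2} = \sum_{i=1}^{n} \frac{i^2}{(i+1)^4} < \sum_{i=1}^{\infty} \frac{1}{(i+1)^2} = \frac{\pi^2}{6} - 1 < 1,$$

we find indeed

$$\mathrm{Det}(M_{nk}) = \mathrm{Det}(A_n)\, \mathrm{Det}(A_k - B_{nk}^T A_n^{-1} B_{nk})$$
$$= \Big( \prod_{i=1}^{n} a_i^2 \Big)\, \mathrm{Det}(A_k - C_k C_n^T A_n^{-1} C_n C_k^T)$$
$$= \Big( \prod_{i=1}^{n} a_i^2 \Big)\, \mathrm{Det}\Big(A_k - \Big(\sum_{i=1}^{n} \frac{c_i^2}{a_i^2}\Big) C_k C_k^T\Big)$$
$$= \Big( \prod_{i=1}^{n} a_i^2 \Big)\, \mathrm{Det}(A_k)\, \mathrm{Det}\Big(1 - \Big(\sum_{i=1}^{n} \frac{c_i^2}{a_i^2}\Big) C_k^T A_k^{-1} C_k\Big)$$
$$= \Big( \prod_{i=1}^{n} a_i^2 \Big) \Big( \prod_{i=1}^{k} a_i^2 \Big) \Big[1 - \Big(\sum_{i=1}^{n} \frac{c_i^2}{a_i^2}\Big) \Big(\sum_{i=1}^{k} \frac{c_i^2}{a_i^2}\Big)\Big] > 0. \quad \square$$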

Proof of Theorem 4.8 Part (a) follows from Proposition 6.5. The proofs for parts (b)-(e) can be found in He, Müller, and Wang (2000). $\square$

For the proof of Theorem 5.1, we need the following additional auxiliary results.

Lemma 6.6 Assume Condition 4.5 is satisfied. Then, for $k \ge 1$,

(a) $\langle u, X_{c,k}\rangle = \langle u, X\rangle$ and $\langle u, X^{\perp}_{c,k}\rangle = 0$, if $R_{XX}^{1/2} u \in \mathrm{span}\{p_i,\, i = 1, \ldots, k\}$;

(b) $\langle v, Y_{c,k}\rangle = \langle v, Y\rangle$ and $\langle v, Y^{\perp}_{c,k}\rangle = 0$, if $R_{YY}^{1/2} v \in \mathrm{span}\{q_i,\, i = 1, \ldots, k\}$;

(c) $\langle u, X_{c,k}\rangle = 0$ and $\langle u, X^{\perp}_{c,k}\rangle = \langle u, X\rangle$, if $R_{XX}^{1/2} u \in \mathrm{span}\{p_i,\, i = 1, \ldots, k\}^{\perp}$;

(d) $\langle v, Y_{c,k}\rangle = 0$ and $\langle v, Y^{\perp}_{c,k}\rangle = \langle v, Y\rangle$, if $R_{YY}^{1/2} v \in \mathrm{span}\{q_i,\, i = 1, \ldots, k\}^{\perp}$.

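Proof. We only prove (a) and (c).

(a) $R_{XX}^{1/2} u \in \mathrm{span}\{p_i,\, i = 1, \ldots, k\}$ is equivalent to $u \in \mathrm{span}\{u_i,\, i = 1, \ldots, k\}$, and, for $i \le k$,

$$\langle u_i, X_{c,k}\rangle = \Big\langle R_{XX}^{-1/2} p_i,\ \sum_{j=1}^{k} U_j R_{XX}^{1/2} p_j \Big\rangle = \sum_{j=1}^{k} U_j \langle p_i, p_j\rangle = U_i = \langle u_i, X\rangle,$$

$$\langle u_i, X^{\perp}_{c,k}\rangle = \langle u_i, X - X_{c,k}\rangle = \langle u_i, X\rangle - \langle u_i, X_{c,k}\rangle = 0.$$

(c) Let $R_{XX}^{1/2} u \in \mathrm{span}\{p_1, \ldots, p_k\}^{\perp}$. Then,

$$\langle u, X_{c,k}\rangle = \Big\langle u,\ \sum_{j=1}^{k} U_j R_{XX} u_j \Big\rangle = \sum_{j=1}^{k} U_j \langle R_{XX}^{1/2} u, R_{XX}^{1/2} u_j\rangle = \sum_{j=1}^{k} U_j \langle R_{XX}^{1/2} u, p_j\rangle = 0,$$

$$\langle u, X^{\perp}_{c,k}\rangle = \langle u, X - X_{c,k}\rangle = \langle u, X\rangle - \langle u, X_{c,k}\rangle = \langle u, X\rangle. \quad \square$$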

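Lemma 6.7 For $u \in F_{XX}^{-1}$, $v \in F_{YY}^{-1}$:

$R_{XX}^{1/2} u \in \mathrm{span}\{p_1, \ldots, p_k\}^{\perp}$ iff $\langle u, X\rangle$ and $\langle u_i, X\rangle$ are uncorrelated for $1 \le i \le k$; and

$R_{YY}^{1/2} v \in \mathrm{span}\{q_1, \ldots, q_k\}^{\perp}$ iff $\langle v, Y\rangle$ and $\langle v_i, Y\rangle$ are uncorrelated for $1 \le i \le k$.

Proof. Consequence of

$$\mathrm{Cov}(\langle u, X\rangle, \langle u_i, X\rangle) = E[\langle u, X\rangle \langle u_i, X\rangle] = \langle u, R_{XX} u_i\rangle = \langle R_{XX}^{1/2} u, R_{XX}^{1/2} u_i\rangle = \langle R_{XX}^{1/2} u, p_i\rangle. \quad \square$$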

Proof of Theorem 5.1

(a) Note that for $i \ge 1$, $f \in F_{XX}$,

$$(p_i \otimes p_i)\, R_{XX}^{-1/2}(f) = \langle p_i, R_{XX}^{-1/2}(f)\rangle\, p_i = \langle R_{XX}^{-1/2}(p_i), f\rangle\, p_i = \big(R_{XX}^{-1/2}(p_i) \otimes p_i\big)(f) = (u_i \otimes p_i)(f),$$

that is, $(p_i \otimes p_i)\, R_{XX}^{-1/2} = u_i \otimes p_i$ on $F_{XX}$. Hence,

$$P_k R_{XX}^{-1/2} = \sum_{i=1}^{k} (p_i \otimes p_i)\, R_{XX}^{-1/2} = \sum_{i=1}^{k} u_i \otimes p_i \quad \text{on } F_{XX}.$$

Since $u_i \in L^2(T_1)$, $i \ge 1$, the operator $\sum_{i=1}^{k} u_i \otimes p_i$ has the $L^2$ kernel $\sum_{i=1}^{k} p_i(s) u_i(t)$, and therefore is a Hilbert-Schmidt operator on $L^2(T_1)$. As, for all $f \in F_{XX}$,

$$\| P_k R_{XX}^{-1/2} f \| = \Big\| \sum_{i=1}^{k} (u_i \otimes p_i) f \Big\| \le \Big\| \sum_{i=1}^{k} u_i \otimes p_i \Big\|\, \|f\|,$$

$P_k R_{XX}^{-1/2}$ can be extended to a bounded operator on $F_{XX}$, and further to a Hilbert-Schmidt operator on $L^2(T_1)$. Now

$$P_k R_{XX}^{-1/2} X = \sum_{i=1}^{k} \langle u_i, X\rangle\, p_i = \sum_{i=1}^{k} U_i R_{XX}^{1/2} u_i,$$

and this implies

$$X_{c,k} = R_{XX}^{1/2} P_k R_{XX}^{-1/2} X = \sum_{i=1}^{k} U_i R_{XX} u_i, \quad X^{\perp}_{c,k} = X - X_{c,k}.$$

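(b) For $u \in F_{XX}$, write $u = P_k R_{XX}^{1/2} u + (P_{F_{XX}} - P_k R_{XX}^{1/2}) u = u_1 + u_2$, with $u_1 \in \mathrm{span}\{p_1, \ldots, p_k\}$, $u_2 \in \mathrm{span}\{p_1, \ldots, p_k\}^{\perp}$. Similarly, for $v \in F_{YY}$, write $v = v_1 + v_2$. From Lemma 6.6,

$$\langle u, X_{c,k}\rangle = \langle u_1, X_{c,k}\rangle = \langle u_1, X\rangle, \quad \text{and} \quad \langle v, Y_{c,k}\rangle = \langle v_1, Y_{c,k}\rangle = \langle v_1, Y\rangle,$$

and

$$\sup_{u \in L^2(T_1),\, v \in L^2(T_2)} \mathrm{Cov}(\langle u, X_{c,k}\rangle, \langle v, Y_{c,k}\rangle) = \sup_{u_1 \in \mathrm{span}\{u_1, \ldots, u_k\},\, v_1 \in \mathrm{span}\{v_1, \ldots, v_k\}} \mathrm{Cov}(\langle u_1, X\rangle, \langle v_1, Y\rangle) \le \sup_{u \in L^2(T_1),\, v \in L^2(T_2)} \mathrm{Cov}(\langle u, X\rangle, \langle v, Y\rangle). \tag{26}$$

For $i = 1, \ldots, k$,

$$\langle u_i, X_{c,k}\rangle = \sum_{j=1}^{k} U_j \langle u_i, R_{XX} u_j\rangle = \sum_{j=1}^{k} U_j \delta_{ij} = U_i, \quad \langle v_i, Y_{c,k}\rangle = V_i,$$

and

$$\mathrm{Cov}(\langle u_i, X_{c,k}\rangle, \langle v_i, Y_{c,k}\rangle) = E[U_i V_i] = \rho_i = \sup_{u \in L^2(T_1),\, v \in L^2(T_2)} \mathrm{Cov}(\langle u, X\rangle, \langle v, Y\rangle).$$

This implies

$$\mathrm{Var}(\langle u_i, X_{c,k}\rangle) = 1 \quad \text{and} \quad \mathrm{Var}(\langle v_i, Y_{c,k}\rangle) = 1,$$

and for $i > 1$, $j = 1, \ldots, i-1$,

$$\mathrm{Cov}(\langle u_i, X_{c,k}\rangle, \langle u_j, X_{c,k}\rangle) = 0 \quad \text{and} \quad \mathrm{Cov}(\langle v_i, Y_{c,k}\rangle, \langle v_j, Y_{c,k}\rangle) = 0.$$

Hence, $(\rho_i, u_i, v_i)$, $i = 1, \ldots, k$, are the $i$-th canonical correlation and weight functions for $X_{c,k}$ and $Y_{c,k}$, and are identical to those for $X$ and $Y$.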

(c) For $u \in F_{XX}$ and $v \in F_{YY}$, let $u = u_1 + u_2$ and $v = v_1 + v_2$ as above. Then, from Lemma 6.6,

$\langle u, X^{\perp}_{c,k}\rangle = \langle u_2, X^{\perp}_{c,k}\rangle = \langle u_2, X\rangle$, which is uncorrelated with $U_i$, $i = 1, \ldots, k$,

$\langle v, Y^{\perp}_{c,k}\rangle = \langle v_2, Y^{\perp}_{c,k}\rangle = \langle v_2, Y\rangle$, which is uncorrelated with $V_i$, $i = 1, \ldots, k$.

From constraint (4) and the facts that $u_2$ is uncorrelated with $\mathrm{span}\{u_1, \ldots, u_k\}$, and $v_2$ is uncorrelated with $\mathrm{span}\{v_1, \ldots, v_k\}$, we have

$$\mathrm{Can\,Corr}(X^{\perp}_{c,k}, Y^{\perp}_{c,k})_i \le \mathrm{Can\,Corr}(X, Y)_{i+k}, \quad \text{for } i \ge 1.$$

On the other hand, we have that $\{(u_{i+k}, v_{i+k}) : i \ge 1\}$ satisfy constraints (2) and (4) for $(X^{\perp}_{c,k}, Y^{\perp}_{c,k})$, so that

$$\mathrm{Can\,Corr}(X^{\perp}_{c,k}, Y^{\perp}_{c,k})_i \ge \mathrm{Cov}(\langle u_{i+k}, X^{\perp}_{c,k}\rangle, \langle v_{i+k}, Y^{\perp}_{c,k}\rangle) = \rho_{i+k} = \mathrm{Can\,Corr}(X, Y)_{i+k}.$$

Using a similar argument as for the proof of (b), we have that $\{(\rho_{i+k}, u_{i+k}, v_{i+k}) : i \ge 1\}$, the $(i+k)$-th canonical correlation and weight functions for $X$ and $Y$, are the $i$-th canonical correlation and weight functions for $X^{\perp}_{c,k}$ and $Y^{\perp}_{c,k}$.

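(d) We only provide the proof of the third equality, since the proofs of the other equalities are similar. For $u \in F_{XX}$ and $v \in F_{YY}$, let $u = u_1 + u_2$ and $v = v_1 + v_2$ as in the proof for (b). Then,

$$\langle u, X_{c,k}\rangle = \langle u_1, X\rangle, \quad \langle v, Y^{\perp}_{c,k}\rangle = \langle v_2, Y\rangle,$$

and

$$\mathrm{Cov}(\langle u, X_{c,k}\rangle, \langle v, Y^{\perp}_{c,k}\rangle) = E[\langle u_1, X\rangle \langle v_2, Y\rangle] = \langle p_1, Rq_2\rangle,$$

where $p_1 = R_{XX}^{1/2} u_1 \in \mathrm{span}\{p_1, \ldots, p_k\}$ and $q_2 = (I_{F_{YY}} - R_{YY}^{1/2}) v_2 \in \mathrm{span}\{q_1, \ldots, q_k\}^{\perp}$. From Proposition 6.3 (b), $Rq_2 \in \mathrm{span}\{p_1, \ldots, p_k\}^{\perp}$, and $\langle p_1, Rq_2\rangle = 0$.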

(e) From (a),

$$X_{c,k} = \sum_{i=1}^{k} U_i R_{XX}^{1/2} p_i, \quad \text{for } k \ge 1.$$

Now the fact that

$$\sum_{i=1}^{k} E\| U_i R_{XX}^{1/2} p_i \|^2 = \sum_{i=1}^{k} E[U_i^2]\, \| R_{XX}^{1/2} p_i \|^2 = \sum_{i=1}^{k} \langle p_i, R_{XX} p_i\rangle$$

and

$$\langle p_i, R_{XX} p_i\rangle = \sum_{j=1}^{\infty} \lambda_{Xj} \langle p_i, \theta_j\rangle^2$$

imply for all $k \ge 1$,

$$\sum_{i=1}^{k} E\| U_i R_{XX}^{1/2} p_i \|^2 = \sum_{i=1}^{k} \sum_{j=1}^{\infty} \lambda_{Xj} \langle p_i, \theta_j\rangle^2 = \sum_{j=1}^{\infty} \lambda_{Xj} \Big( \sum_{i=1}^{k} \langle p_i, \theta_j\rangle^2 \Big) = \sum_{j=1}^{\infty} \lambda_{Xj} \| P_k \theta_j \|^2 \le \sum_{j=1}^{\infty} \lambda_{Xj} \| \theta_j \|^2 = \sum_{j=1}^{\infty} \lambda_{Xj} < \infty.$$

Hence,

$$X_{c,k} = R_{XX}^{1/2} P_k R_{XX}^{-1/2} X \to R_{XX}^{1/2} P_{\infty} R_{XX}^{-1/2} X = X_{c,\infty} = \sum_{i=1}^{\infty} U_i R_{XX} u_i, \quad \text{as } k \to \infty,$$

where the convergence is in the mean squared norm $E\| \cdot \|^2_{L^2}$. This implies (e).

(f) If $\mathrm{span}\{p_i : i \ge 1\} = F_{XX}$, then $P_{\infty} = P_{F_{XX}}$, and $X^{\perp}_{c,\infty} = 0$. $\square$
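For intuition, the canonical decomposition can be mimicked for random vectors, where $R_{XX}$, $R_{YY}$ and $R_{XY}$ become covariance matrices. The sketch below is a finite-dimensional illustration under assumed inputs (the joint covariance matrix, the dimensions and all variable names are hypothetical). It forms the linear maps $X \mapsto X_{c,k} = R_{XX}^{1/2} P_k R_{XX}^{-1/2} X$ and $Y \mapsto Y_{c,k}$, and computes the cross-covariances between $X_{c,k}$ and $Y^{\perp}_{c,k}$ and between $X_{c,k}$ and $Y_{c,k}$.

```python
import numpy as np

def sym_power(S, power):
    """Power of a symmetric positive definite matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * w**power) @ V.T

# Hypothetical joint covariance of (X, Y) with dim X = 4, dim Y = 3.
rng = np.random.default_rng(0)
p, q, k = 4, 3, 2
G = rng.standard_normal((p + q, p + q))
S = G @ G.T + np.eye(p + q)
R11, R12, R22 = S[:p, :p], S[:p, p:], S[p:, p:]

R11_h, R11_mh = sym_power(R11, 0.5), sym_power(R11, -0.5)
R22_h, R22_mh = sym_power(R22, 0.5), sym_power(R22, -0.5)

# R = R11^{-1/2} R12 R22^{-1/2}; columns of P and rows of Qt play the roles of p_i, q_i.
P, rho, Qt = np.linalg.svd(R11_mh @ R12 @ R22_mh)
Pk = P[:, :k] @ P[:, :k].T   # projection onto span{p_1, ..., p_k}
Qk = Qt[:k].T @ Qt[:k]       # projection onto span{q_1, ..., q_k}

Tx = R11_h @ Pk @ R11_mh     # X_{c,k} = Tx X
Ty = R22_h @ Qk @ R22_mh     # Y_{c,k} = Ty Y

# Cov(X_{c,k}, Y_perp_{c,k}) with Y_perp_{c,k} = (I - Ty) Y.
cov_Xc_Yperp = Tx @ R12 @ (np.eye(q) - Ty).T
# Cov(X_{c,k}, Y_{c,k}) and the expression built from the first k canonical triples.
cov_Xc_Yc = Tx @ R12 @ Ty.T
expected_Xc_Yc = R11_h @ (P[:, :k] * rho[:k]) @ Qt[:k] @ R22_h
```

In this setting `cov_Xc_Yperp` vanishes up to rounding, matching the uncorrelatedness argued in part (d), and `cov_Xc_Yc` depends only on the first $k$ canonical correlations and components.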

Appendix

We compile here some facts from functional analysis, which are listed without proof.

Details can be found in texts such as Conway (1985) or Dunford and Schwartz (1988).

We use the tensor product notation (18) throughout. Let $H_1$, $H_2$ be Hilbert spaces. We denote the set of bounded linear operators $A : H_1 \to H_2$ by $B(H_1, H_2)$. A linear operator is compact if the set $\{Ah \mid h \in H_1, \|h\| \le 1\}$ has a compact closure in $H_2$. The kernel of a linear operator $A$ is defined as $\ker(A) = \{h \in H_1 \mid Ah = 0\}$. A compact self-adjoint operator can be expressed as

$$A = \sum_i \mu_i\, e_i \otimes e_i,$$

for a sequence $\{\mu_i\}$ of real numbers and an orthonormal basis $\{e_i\}$ of $\ker(A)^{\perp}$. For an operator $A: L^2 \to L^2$, and a subspace $H \subset L^2$, denote by $A|_H$ the restriction to subspace $H$.

For $k \in L^2(T_1 \times T_2)$, the integral operator $A: L^2(T_1) \to L^2(T_2)$ defined by

$$(Af)(t) = \int_{T_1} k(s, t)\, f(s)\, ds, \quad f \in L^2(T_1),$$

is a compact operator. We define the integral kernel of $A$ as $\mathrm{KER}(A)(s, t) = k(s, t)$. The covariance operator $R_{XX}$ (7) is a compact self-adjoint nonnegative operator, since $\mathrm{KER}(R_{XX})(s, t) = \mathrm{KER}(R_{XX})(t, s)$.

A bounded linear operator is a Hilbert-Schmidt operator if there exists an orthonormal basis $\{e_i\}$ of $H_1$ such that $\sum_i \|Ae_i\|^2 < \infty$. Properties of Hilbert-Schmidt operators are as follows:

If $M_1$ and $M_2$ are measurable subsets of $\mathbb{R}^p$ and $\mathbb{R}^q$ respectively, the operator $A \in B(L^2(M_1), L^2(M_2))$ is a Hilbert-Schmidt operator if and only if there exists a $k \in L^2(M_1 \times M_2)$ such that

$$(Af)(s) = \int_{M_1} k(s, t)\, f(t)\, dt, \quad \text{for } f \in L^2(M_1), \tag{A1}$$

i.e., $A$ is an integral operator. Every Hilbert-Schmidt operator is compact. An operator $A \in B(H_1, H_2)$ is a Hilbert-Schmidt operator if and only if the adjoint operator $A^*$ of $A$ is a Hilbert-Schmidt operator.

Let $\{e_i : i \ge 1\}$ and $\{e_i' : i \ge 1\}$ be orthonormal bases for Hilbert spaces $H_1$ and $H_2$, respectively, and let $(a_{ij})_{i,j \ge 1}$ be an (infinite) matrix with $a_{ij} \in \mathbb{R}$. We define an operator $A_{HS} : H_1 \to H_2$ by

$$A_{HS} h = \sum_{i=1}^{\infty} \Big( \sum_{j=1}^{\infty} a_{ij} \langle e_j, h\rangle \Big) e_i',$$

for $h \in \{u \in H_1 : \sum_{j=1}^{\infty} a_{ij} \langle e_j, u\rangle \text{ exists for } i \ge 1, \text{ and } \sum_{i=1}^{\infty} |\sum_{j=1}^{\infty} a_{ij} \langle e_j, u\rangle|^2 < \infty\}$. This is a Hilbert-Schmidt operator, if $\sum_{i,j=1}^{\infty} |a_{ij}|^2 = C^2 < \infty$.
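The summability condition on the matrix $(a_{ij})$ can be illustrated with two hypothetical choices of coefficients, neither of which is taken from the paper: $a_{ij} = 1/(ij)$, for which $\sum_{i,j} |a_{ij}|^2 = (\pi^2/6)^2$, and the identity matrix $a_{ij} = \delta_{ij}$, for which the partial sums grow without bound, so the identity on an infinite-dimensional space is bounded but not Hilbert-Schmidt. The sketch below computes truncated sums for both.

```python
import math

def hs_partial_sum(a, n):
    """Sum of a(i, j)^2 over 1 <= i, j <= n."""
    return sum(a(i, j) ** 2 for i in range(1, n + 1) for j in range(1, n + 1))

def a_decay(i, j):
    # Hypothetical square-summable coefficients a_ij = 1 / (i j).
    return 1.0 / (i * j)

def a_identity(i, j):
    # Coefficients of the identity operator, a_ij = delta_ij.
    return 1.0 if i == j else 0.0

sizes = (10, 100, 400)
limit = (math.pi ** 2 / 6) ** 2
decay_sums = [hs_partial_sum(a_decay, n) for n in sizes]
identity_sums = [hs_partial_sum(a_identity, n) for n in sizes]
```

The partial sums in `decay_sums` increase towards `limit`, while those in `identity_sums` equal the truncation size $n$.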

Acknowledgments

We wish to thank a referee for constructive comments and helpful suggestions.

References

Ash, R. B. and Gardner, M. F. (1975). Topics in Stochastic Processes. New York: Academic Press.

Brillinger, D. R. (1975). Time Series: Data Analysis and Theory. New York: Holt, Rinehart and Winston.

Conway, J. B. (1985). A Course in Functional Analysis. New York: Springer-Verlag.

Dauxois, J. and Nkiet, G. M. (1997). Canonical analysis of two Euclidean subspaces and its applications. Linear Algebra and its Appl. 264, 355-388.

Dunford, N. and Schwartz, J. T. (1988). Linear Operators, Parts I and II. New York: Wiley.

Hannan, E. J. (1961). The general theory of canonical correlation and its relation to functional analysis. J. Australian Math. Soc. 2, 229-242.

He, G., Müller, H.-G. and Wang, J.-L. (2000). Extending correlation and regression from multivariate to functional data. Asymptotics in Statistics and Probability, ed. M. Puri, pp. 197-210.

He, G., Müller, H.-G. and Wang, J.-L. (2002). Functional Linear Modelling and Canonical Regression Analysis. Manuscript.

Hotelling, H. (1936). Relations between two sets of variates. Biometrika 28, 321-377.

Leurgans, S. E., Moyeed, R. A. and Silverman, B. W. (1993). Canonical correlation analysis when the data are curves. J. Royal Statist. Soc. B 55, 725-740.

Olshen, R. A., Biden, E. N., Wyatt, M. P. and Sutherland, D. H. (1989). Gait analysis and the bootstrap. Ann. Statist. 17, 1419-1440.

Ramsay, J. O. and Silverman, B. W. (1997). Functional Data Analysis. New York: Springer.

Rice, J. A. and Silverman, B. W. (1991). Estimating the mean and covariance structure nonparametrically when the data are curves. J. Royal Statist. Soc. B 53, 233-243.

Service, S. K., Rice, J. A. and Chavez, F. P. (1998). Relationship between physical and biological variables during the upwelling period in Monterey Bay, CA. Deep-Sea Research II, 1669-1685.