Estimation of Covariance Matrices
Roman Vershynin
University of Michigan
Covariance matrix
Basic problem in multivariate statistics: by sampling from a high-dimensional distribution, determine its covariance structure.
Principal Component Analysis (PCA): detect the principal axes along which most dependence occurs.
Covariance matrix
\[
\Sigma_n = \frac{1}{n} \sum_{k=1}^{n} X_k X_k^{\mathsf{T}}.
\]
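As a quick sketch (using NumPy; the choices of n and p are illustrative, not from the talk), this estimator can be computed either as a literal sum of outer products or in an equivalent vectorized form:

```python
import numpy as np

# Sample covariance of n isotropic Gaussian samples in R^p, computed both as a
# literal sum of outer products and in equivalent vectorized form.
rng = np.random.default_rng(0)
n, p = 2000, 5                                    # illustrative sizes
X = rng.standard_normal((n, p))                   # rows are the sample points X_k

Sigma_n = sum(np.outer(x, x) for x in X) / n      # (1/n) sum_k X_k X_k^T
Sigma_n_fast = X.T @ X / n                        # same matrix, vectorized

assert np.allclose(Sigma_n, Sigma_n_fast)
print(np.round(Sigma_n, 2))                       # should be close to I
```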
\[
\Sigma_n = \frac{1}{n} \sum_{k=1}^{n} X_k X_k^{\mathsf{T}} = \frac{1}{n} A^{\mathsf{T}} A,
\]
where \(A\) is the \(n \times p\) matrix whose rows are the sample points \(X_k\).
The desired estimate \(\|\Sigma_n - I\| \le \varepsilon\) is equivalent to saying that \(\frac{1}{\sqrt{n}} A\) is an almost isometric embedding \(\mathbb{R}^p \to \mathbb{R}^n\):
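A minimal numerical sketch of this equivalence (NumPy; sample sizes are illustrative): for Gaussian data, applying \(\frac{1}{\sqrt{n}} A\) to a unit vector should nearly preserve its norm.

```python
import numpy as np

# For Gaussian samples forming the rows of A, (1/sqrt(n)) A should act as a
# near-isometry R^p -> R^n: it almost preserves the norm of any unit vector.
rng = np.random.default_rng(1)
n, p = 5000, 10                                   # illustrative sizes
A = rng.standard_normal((n, p))                   # rows are the sample points X_k

x = rng.standard_normal(p)
x /= np.linalg.norm(x)                            # a unit vector in R^p

stretch = np.linalg.norm(A @ x) / np.sqrt(n)
print(f"||(1/sqrt(n)) A x|| = {stretch:.3f}  (an exact isometry would give 1)")
```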
\[
s_{\min}(A) \ge \sqrt{n} - \sqrt{p}, \qquad s_{\max}(A) \le \sqrt{n} + \sqrt{p}.
\]
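An empirical check of these singular value bounds, not a proof (NumPy; dimensions chosen for illustration):

```python
import numpy as np

# Empirical check, not a proof: the extreme singular values of an n x p
# Gaussian matrix concentrate around sqrt(n) -/+ sqrt(p).
rng = np.random.default_rng(2)
n, p = 4000, 100                                  # illustrative sizes
A = rng.standard_normal((n, p))

s = np.linalg.svd(A, compute_uv=False)            # singular values, descending
lo, hi = np.sqrt(n) - np.sqrt(p), np.sqrt(n) + np.sqrt(p)
print(f"s_min = {s[-1]:.1f}, s_max = {s[0]:.1f}, predicted band = [{lo:.1f}, {hi:.1f}]")
```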
\[
\|\Sigma_n - I\| \le C \sqrt{\frac{p}{n}} + C\, \frac{p}{n}.
\]
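The bound suggests the operator-norm error should shrink at roughly the rate \(\sqrt{p/n}\); a quick simulation (NumPy; the parameters are illustrative) is consistent with this:

```python
import numpy as np

# Simulation of the bound ||Sigma_n - I|| <= C sqrt(p/n) + C p/n for isotropic
# Gaussian samples: the operator-norm error shrinks roughly like sqrt(p/n).
rng = np.random.default_rng(3)
p = 20
errs = []
for n in (200, 2000, 20000):
    X = rng.standard_normal((n, p))
    err = np.linalg.norm(X.T @ X / n - np.eye(p), ord=2)
    errs.append(err)
    print(f"n = {n:6d}: error = {err:.3f}, sqrt(p/n) = {np.sqrt(p / n):.3f}")
```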
Beyond sub-gaussian
\[
\|\Sigma_n\| = \sup_{x \in S^{p-1}} \frac{1}{n} \sum_{k=1}^{n} \langle X_k, x \rangle^2 = O(1).
\]
For each fixed \(x\), writing \(Z_k = \langle X_k, x \rangle^2\), this is a law of large numbers:
\[
\frac{1}{n} \sum_{k=1}^{n} Z_k \approx 1 \quad \text{for } n \ge n_0.
\]
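A quick illustration of this law of large numbers (NumPy; parameters are illustrative):

```python
import numpy as np

# For a fixed unit vector x and isotropic Gaussian samples X_k, the variables
# Z_k = <X_k, x>^2 have mean 1, so their average tends to 1 as n grows.
rng = np.random.default_rng(6)
n, p = 10000, 8                                   # illustrative sizes
X = rng.standard_normal((n, p))
x = rng.standard_normal(p)
x /= np.linalg.norm(x)                            # a fixed unit vector

Z = (X @ x) ** 2
print(f"(1/n) sum Z_k = {Z.mean():.3f}  (population mean is 1)")
```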
Covariance graph
\[
\|M \circ \Sigma_n - M \circ \Sigma\| \lesssim \left( \frac{\|M\|_{1,2}}{\sqrt{n}} + \frac{\|M\|}{n} \right) \|\Sigma\|,
\]
where \(\|M\|_{1,2} = \max_j \bigl( \sum_i m_{ij}^2 \bigr)^{1/2}\) is the \(\ell_1 \to \ell_2\) operator norm.
This result is quite general: it applies to arbitrary Gaussian distributions (no covariance structure assumed) and arbitrary mask matrices M.
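A small sketch of the quantities involved (NumPy; the tridiagonal mask M is an arbitrary illustrative choice, not from the talk): \(\|M\|_{1,2}\) is simply the largest Euclidean column norm, and the masked estimator is the entrywise (Hadamard) product.

```python
import numpy as np

# ||M||_{1,2} = max_j (sum_i m_ij^2)^{1/2} is the largest Euclidean column
# norm of M; the masked estimator is the Hadamard product M o Sigma_n.
rng = np.random.default_rng(4)
p, n = 6, 500                                     # illustrative sizes
X = rng.standard_normal((n, p))
Sigma_n = X.T @ X / n

# Illustrative mask: a 0/1 tridiagonal band (keep only entries near the diagonal).
M = (np.abs(np.subtract.outer(np.arange(p), np.arange(p))) <= 1).astype(float)

masked = M * Sigma_n                              # Hadamard product M o Sigma_n
norm_12 = np.sqrt((M ** 2).sum(axis=0)).max()     # ||M||_{1,2}
print(f"||M||_1,2 = {norm_12:.3f}")               # sqrt(3) for this band
```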
Gaussian Chaos
Since we don't know how to answer this question, the proof of the estimation theorem takes a different route, through estimating a Gaussian chaos.
A Gaussian chaos arises naturally when one tries to compute the operator norm of a sample covariance matrix \(\Sigma_n = \frac{1}{n} \sum_{k=1}^{n} X_k X_k^{\mathsf{T}}\):
\[
\|\Sigma_n\| = \sup_{x \in S^{p-1}} \langle \Sigma_n x, x \rangle
= \sup_{x \in S^{p-1}} \sum_{i,j=1}^{p} \Sigma_n(i,j)\, x_i x_j
= \sup_{x \in S^{p-1}} \frac{1}{n} \sum_{k,i,j} X_{ki} X_{kj}\, x_i x_j.
\]
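The chain of identities can be checked numerically for a single vector x (NumPy sketch; parameters are illustrative):

```python
import numpy as np

# Check: <Sigma_n x, x> equals the chaos (1/n) sum_{k,i,j} X_ki X_kj x_i x_j,
# and its supremum over the unit sphere is the top eigenvalue of Sigma_n.
rng = np.random.default_rng(5)
n, p = 300, 4                                     # illustrative sizes
X = rng.standard_normal((n, p))
Sigma_n = X.T @ X / n

x = rng.standard_normal(p)
x /= np.linalg.norm(x)                            # a point on the sphere S^{p-1}

quad = x @ Sigma_n @ x                            # <Sigma_n x, x>
chaos = np.einsum("ki,kj,i,j->", X, X, x, x) / n  # the Gaussian chaos form
top = np.linalg.eigvalsh(Sigma_n)[-1]             # sup over the sphere

assert np.isclose(quad, chaos)
print(f"<Sigma_n x, x> = {quad:.4f} <= ||Sigma_n|| = {top:.4f}")
```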