Publication: On the principal components of sample covariance matrices
Abstract
We introduce a class of $M \times M$ sample covariance matrices $Q$ which subsumes and generalizes several previous models. The associated population covariance matrix $\Sigma = \mathbb{E} Q$ is assumed to differ from the identity by a matrix of bounded rank. All quantities except the rank of $\Sigma - I_M$ may depend on $M$ in an arbitrary fashion. We investigate the principal components, i.e. the top eigenvalues and eigenvectors, of $Q$. We derive precise large deviation estimates on the generalized components of the outlier and non-outlier eigenvectors. Our results also hold near the so-called BBP transition, where outliers are created or annihilated, and for degenerate or near-degenerate outliers. We believe the obtained rates of convergence to be optimal. In addition, we derive the asymptotic distribution of the generalized components of the non-outlier eigenvectors. A novel observation arising from our results is that, unlike the eigenvalues, the eigenvectors of the principal components contain information about the subcritical spikes of $\Sigma$. The proofs use several results on the eigenvalues and eigenvectors of the uncorrelated matrix $Q$, satisfying $\mathbb{E} Q = I_M$, as input: the isotropic local Marchenko–Pastur law established in Bloemendal et al. (Electron J Probab 19:1–53, 2014), level repulsion, and quantum unique ergodicity of the eigenvectors. The latter is a special case of a new universality result for the joint eigenvalue–eigenvector distribution.
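The BBP transition mentioned in the abstract can be illustrated numerically with a minimal sketch. It does not use the general model of the paper. It assumes the simplest classical special case: Gaussian data, a single spike $\Sigma = I_M + d\, e_1 e_1^{T}$, and aspect ratio $\phi = M/N$. In this convention (which may differ from the paper's normalization) the classical predictions are a threshold $d > \sqrt{\phi}$, an outlier near $(1+d)(1+\phi/d)$, and a squared overlap with $e_1$ near $(1-\phi/d^2)/(1+\phi/d)$. The function name `top_eigenpair_stats` and all parameter values are hypothetical choices made for this illustration.

```python
import numpy as np


def top_eigenpair_stats(M, N, d, seed=0):
    """Simulate Q = (1/N) * Sigma^{1/2} X X^T Sigma^{1/2} with
    Sigma = I_M + d * e_1 e_1^T (assumed rank-one Gaussian spiked model).

    Returns the largest eigenvalue of Q and the squared overlap of its
    eigenvector with the spike direction e_1.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((M, N))
    # Sigma^{1/2} is diagonal here, so it only rescales the first coordinate.
    X[0, :] *= np.sqrt(1.0 + d)
    Q = X @ X.T / N
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(Q)
    top_value = eigenvalues[-1]
    top_vector = eigenvectors[:, -1]
    return top_value, top_vector[0] ** 2


if __name__ == "__main__":
    M, N = 400, 1600
    phi = M / N
    bulk_edge = (1.0 + np.sqrt(phi)) ** 2  # right edge of the Marchenko-Pastur bulk
    print(f"phi = {phi}, bulk edge = {bulk_edge}, spike threshold = {np.sqrt(phi)}")

    for d in (0.2, 2.0):  # one subcritical and one supercritical spike
        lam, overlap = top_eigenpair_stats(M, N, d, seed=0)
        if d > np.sqrt(phi):
            # Classical large-M predictions above the threshold.
            predicted_lam = (1.0 + d) * (1.0 + phi / d)
            predicted_overlap = (1.0 - phi / d**2) / (1.0 + phi / d)
        else:
            # Below the threshold the top eigenvalue sticks to the bulk edge
            # and the overlap vanishes in the limit.
            predicted_lam = bulk_edge
            predicted_overlap = 0.0
        print(
            f"d = {d}: top eigenvalue {lam:.3f} (predicted {predicted_lam:.3f}), "
            f"squared overlap {overlap:.3f} (predicted {predicted_overlap:.3f})"
        )
```

With these assumed parameters the supercritical spike should produce an eigenvalue clearly separated from the bulk edge and an eigenvector strongly aligned with $e_1$, while the subcritical spike should leave the top eigenvalue at the edge. The small but nonzero overlap in the subcritical case is the kind of eigenvector information the abstract refers to, although this sketch makes no attempt to reproduce the paper's quantitative estimates.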