PCA: Principal Component Analysis

Principal Component Analysis (PCA) is similar to ICA, but with the additional assumption that the signal sources are Gaussian. Under the Gaussian assumption, mutual independence is equivalent to being mutually uncorrelated. Therefore, second-order statistics are sufficient.
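A short check of this equivalence, assuming the sources are zero-mean and jointly Gaussian with covariance matrix $\Sigma$ (symbols introduced here only for illustration): if the sources are uncorrelated, then $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2)$ and the joint density factorizes,

$p(x) = \frac{1}{\sqrt{(2\pi)^N \det\Sigma}} \exp\left(-\frac{1}{2} x^T \Sigma^{-1} x\right) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left(-\frac{x_i^2}{2\sigma_i^2}\right)$,

which is exactly the condition for mutual independence.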

We again use the cocktail party as an example. For a set of independent sources, $x(t) = \{x_1(t), x_2(t), \ldots , x_N(t)\}$, the mixing system produces the output set $y(t) = \{y_1(t), y_2(t), \ldots , y_M(t)\}$,

$y(t)=Hx(t) + N(t)$,

where $H$ is the $M \times N$ linear mixing matrix and $N(t)$ is the noise term for each channel (not to be confused with the number of sources, $N$).
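The mixing model can be simulated with a few lines of NumPy. This is only a minimal sketch: the helper name `mix_sources`, the two test waveforms, the random mixing matrix, and the noise level are all assumptions chosen for illustration, with sources stored as rows so that the product matches $Hx(t)$.

```python
import numpy as np

def mix_sources(x, H, noise_std=0.0, rng=None):
    """Return y = H x + noise for sources x of shape (n_sources, n_samples).

    Hypothetical helper: the noise is white Gaussian with standard
    deviation noise_std, drawn independently for every channel and sample.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = noise_std * rng.standard_normal((H.shape[0], x.shape[1]))
    return H @ x + noise

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)

# Two illustrative sources (N = 2): a sine wave and a square wave.
x = np.vstack([
    np.sin(2 * np.pi * 5 * t),
    np.sign(np.sin(2 * np.pi * 3 * t)),
])

# Assumed mixing matrix for M = 3 observed channels.
H = rng.standard_normal((3, 2))

y = mix_sources(x, H, noise_std=0.05, rng=rng)
print(y.shape)
```

Each row of `y` is one observed channel, a noisy linear combination of the two sources.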

Mathematically, PCA is a method that transforms a given multichannel data sequence into a set of uncorrelated (and, under the Gaussian assumption, independent) components by projecting it onto a set of orthogonal basis vectors in the signal space. The question is how to obtain such an orthogonal basis, and singular value decomposition provides the solution.

Singular Value Decomposition (SVD)

Singular value decomposition is an important tool for signal analysis. For any real matrix $Y$, we can always find matrices $U$, $V$ and $S$ such that

$Y=USV^{T}$,

where,

1. $S$ is a diagonal matrix with the singular values in descending order (the singular spectrum),
2. The columns of $V$ are the eigenvectors of $C=Y^T Y$,
3. $U$ is the matrix of projections of $Y$ onto the eigenvectors of $C$, each normalized by its singular value ($YV = US$); its columns are the source estimates.
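The three properties above can be checked numerically. The sketch below uses `numpy.linalg.svd` on a random test matrix; the matrix size and the convention that rows are time samples and columns are channels (so that $C=Y^T Y$ is a channel-by-channel matrix) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.standard_normal((200, 4))  # 200 samples, 4 channels (assumed layout)

# NumPy returns the singular values as a 1-D array and V already transposed.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
V = Vt.T
C = Y.T @ Y

# The factorization reproduces Y.
reconstruction_ok = np.allclose(Y, U @ np.diag(s) @ Vt)

# Property 1: singular values come in descending order.
descending_ok = bool(np.all(np.diff(s) <= 0))

# Property 2: columns of V are eigenvectors of C, with eigenvalues s**2.
eigen_ok = np.allclose(C @ V, V @ np.diag(s**2))

# Property 3: projecting Y onto V gives U scaled by the singular values.
projection_ok = np.allclose(Y @ V, U @ np.diag(s))

print(reconstruction_ok, descending_ok, eigen_ok, projection_ok)
```

Note that the eigenvalues of $C$ are the squares of the singular values, not the singular values themselves.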

PCA orders the sources by their eigenvalues, from largest to smallest. The number of sources is determined by a threshold set according to the signal noise floor. The following describes the PCA algorithm implemented with the SVD approach.

1. Find the $N$ eigenvalues of the square data correlation matrix $C=Y^T Y$ that lie above a preset threshold (these eigenvalues are the squares of the singular values in $S$). The smaller eigenvalues are contributions from the noise and should be zeroed out.
2. Find the eigenvectors corresponding to the $N$ largest eigenvalues and use them as the columns of $V$.
3. Use the first $N$ columns of $U$, the constructed $V$, and the $N$ eigenvalues to derive the new set of $N$ signals.
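The steps above can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than a reference implementation: the function name `pca_svd_denoise`, the threshold value, the mixing matrix, and the synthetic sinusoidal sources are all hypothetical, and the data matrix is laid out with samples in rows and channels in columns.

```python
import numpy as np

def pca_svd_denoise(Y, threshold):
    """PCA via SVD (hypothetical helper, not from the original text).

    Y has shape (n_samples, n_channels). threshold is compared with the
    eigenvalues of C = Y^T Y, i.e. the squared singular values of Y.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    eigvals = s**2                         # eigenvalues of C, descending
    n = int(np.sum(eigvals > threshold))   # step 1: count dominant eigenvalues
    V_n = Vt[:n, :].T                      # step 2: matching eigenvectors as columns
    components = U[:, :n] * s[:n]          # step 3: projections of Y onto V_n
    Y_denoised = components @ V_n.T        # data rebuilt without the noise subspace
    return components, Y_denoised, n

# Synthetic example: two sinusoidal sources mixed into four noisy channels.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
X = np.column_stack([np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 3 * t)])
H = np.array([[1.0, 0.5], [0.5, 1.0], [1.0, -1.0], [0.2, 0.8]])  # assumed mixing
Y_clean = X @ H.T
Y = Y_clean + 0.01 * rng.standard_normal(Y_clean.shape)

# Threshold picked by hand, well above the noise eigenvalues of this example.
components, Y_denoised, n = pca_svd_denoise(Y, threshold=1.0)
print(n, components.shape)
```

In practice the threshold would be set from an estimate of the noise floor. Here it is simply placed between the large signal eigenvalues and the small noise eigenvalues of the synthetic data.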

The reconstructed signals are the dominant components, which are mutually orthogonal. The remaining eigenvalues are contributions from noise and are excluded from the reconstructed signal. Therefore, SVD-based PCA achieves two goals at once: it separates the independent components, and it removes all of the noise that lies in the noise subspace.