For each matrix $A \in \mathbb{R}^{m \times n}$, there exists a padded diagonal matrix $\Sigma \in \mathbb{R}^{m \times n}$ and orthogonal matrices $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ such that $A$ can be expressed as
$$A = U \Sigma V^\top,$$
where
$$\Sigma_{ii} = \sigma_i \text{ with } \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_{\min(m,n)} \ge 0, \qquad \Sigma_{ij} = 0 \text{ for } i \ne j.$$
One can often prune columns of $U$ or $V$ corresponding to zero elements in $\Sigma$. For example, if $m > n$, then we can have
$$A = \hat{U} \hat{\Sigma} V^\top,$$
where $\hat{U} \in \mathbb{R}^{m \times n}$ is the first $n$ columns of $U$, and $\hat{\Sigma} \in \mathbb{R}^{n \times n}$ is the first $n$ rows (the non-zero rows) of $\Sigma$.
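To make this concrete, here is a minimal NumPy sketch (the random matrix and its dimensions are arbitrary illustrative choices) that computes both the full and the thin SVD and checks that each reconstructs $A$:

```python
import numpy as np

# Arbitrary example matrix with m > n (illustrative choice)
rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))

# Full SVD: U is m x m, Vt is n x n, s holds the min(m, n) singular values
U, s, Vt = np.linalg.svd(A, full_matrices=True)
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)          # padded diagonal matrix
assert np.allclose(A, U @ Sigma @ Vt)

# Thin SVD: keep only the first n columns of U and the non-zero rows of Sigma
U_hat, Sigma_hat = U[:, :n], Sigma[:n, :]
assert np.allclose(A, U_hat @ Sigma_hat @ Vt)
```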
SVD and Eigendecomposition
The SVD of $A$ is intimately related to the eigendecomposition of the product of $A$ with its transpose:
$$A^\top A = V \Sigma^\top \Sigma V^\top, \qquad A A^\top = U \Sigma \Sigma^\top U^\top,$$
where $\Sigma^\top \Sigma = \operatorname{diag}(\sigma_1^2, \dots, \sigma_n^2)$ if $m \ge n$.
One can also note that in PCA, we perform an eigendecomposition of the covariance matrix $X X^\top$ (up to centering and scaling). Therefore, we can also apply SVD to the data matrix $X$ to identify the principal eigenvectors: the left singular vectors of $X$ are exactly the eigenvectors of $X X^\top$.
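As a sanity check of both relations, the following sketch (random matrices chosen for illustration, and data points stored as columns of $X$ by assumption) verifies that the eigenvalues of $A^\top A$ are the squared singular values of $A$, and that the top left singular vector of the centered data matrix matches the principal eigenvector of its covariance:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))

# Eigenvalues of A^T A equal the squared singular values of A
s = np.linalg.svd(A, compute_uv=False)
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]    # sorted in descending order
assert np.allclose(eigvals, s**2)

# PCA via SVD: with data points as columns of X, the left singular
# vectors of the centered X are eigenvectors of the covariance X X^T
X = rng.standard_normal((4, 100))
Xc = X - X.mean(axis=1, keepdims=True)
U, _, _ = np.linalg.svd(Xc, full_matrices=False)
C = Xc @ Xc.T / Xc.shape[1]
w, V = np.linalg.eigh(C)
top = V[:, np.argmax(w)]
assert np.isclose(abs(top @ U[:, 0]), 1.0)     # same direction up to sign
```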
SVD and Frobenius Norm
The squared Frobenius norm of a matrix $A$ is the sum of its squared singular values:
$$\|A\|_F^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 = \sum_{i=1}^{\min(m,n)} \sigma_i^2.$$
proof
Using the properties of the trace,
$$\|A\|_F^2 = \operatorname{tr}(A^\top A) = \operatorname{tr}(V \Sigma^\top \Sigma V^\top) = \operatorname{tr}(\Sigma^\top \Sigma V^\top V) = \operatorname{tr}(\Sigma^\top \Sigma) = \sum_{i=1}^{\min(m,n)} \sigma_i^2.$$
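A quick numerical check of this identity, using an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))

s = np.linalg.svd(A, compute_uv=False)
frob_sq = np.sum(A**2)                          # squared Frobenius norm
assert np.isclose(frob_sq, np.sum(s**2))        # sum of squared singular values
assert np.isclose(frob_sq, np.trace(A.T @ A))   # trace identity used in the proof
```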
SVD and Spectral Norm
The spectral norm of a matrix $A$ equals its largest singular value $\sigma_1$:
$$\|A\|_2 = \max_{\|x\|_2 = 1} \|A x\|_2 = \sigma_1.$$
proof
For any unit vector $x$, $\|A x\|_2^2 = x^\top A^\top A x = x^\top V \Sigma^\top \Sigma V^\top x = \sum_i \sigma_i^2 (v_i^\top x)^2 \le \sigma_1^2$, with equality for $x = v_1$.
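The same fact can be checked numerically; `np.linalg.norm(A, 2)` returns the spectral norm of a 2-D array:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))

s = np.linalg.svd(A, compute_uv=False)
assert np.isclose(np.linalg.norm(A, 2), s[0])   # spectral norm = largest singular value

# The maximum is attained at the top right singular vector v_1
_, _, Vt = np.linalg.svd(A)
v1 = Vt[0]
assert np.isclose(np.linalg.norm(A @ v1), s[0])
```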
SVD for Low-Rank Approximation
eckart-young theorem
Given $A \in \mathbb{R}^{m \times n}$ with SVD $A = U \Sigma V^\top$. Then for all $k \le \operatorname{rank}(A)$:
$$A_k := U \Sigma_k V^\top \in \operatorname*{argmin}_{\operatorname{rank}(B) \le k} \|A - B\|_F,$$
where $\Sigma_k = \operatorname{diag}(\sigma_1, \dots, \sigma_k, 0, \dots, 0)$ is rectangular and padded according to the dimensions of $U$ and $V$.
The theorem also gives the following corollary:
corollary
The squared error of low-rank approximations can be expressed as
$$\|A - A_k\|_F^2 = \sum_{i=k+1}^{\operatorname{rank}(A)} \sigma_i^2.$$
In addition, $A_k$ also minimizes the loss defined by the spectral norm:
$$\|A - A_k\|_2 = \sigma_{k+1} = \min_{\operatorname{rank}(B) \le k} \|A - B\|_2.$$
SVD can be seen as an additive superposition of rank-1 matrices:
$$A = \sum_{i=1}^{\operatorname{rank}(A)} \sigma_i u_i v_i^\top.$$
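The sketch below builds the rank-$k$ truncation from the SVD and verifies both error formulas as well as the rank-1 superposition view (the matrix and the choice $k = 2$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 6))
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k truncation: keep the k largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Equivalent view: additive superposition of k rank-1 matrices
A_k_sum = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(k))
assert np.allclose(A_k, A_k_sum)

# Squared Frobenius error = sum of the discarded squared singular values
assert np.isclose(np.sum((A - A_k)**2), np.sum(s[k:]**2))

# Spectral-norm error = (k+1)-th singular value
assert np.isclose(np.linalg.norm(A - A_k, 2), s[k])
```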
SVD for Matrix Completion
For fully observed matrices, the SVD is computable with $O(\min\{m n^2, m^2 n\})$ complexity, which is an important example of a solvable non-convex problem. However, in the case of incomplete observations, SVD is in general not directly applicable to compute low-rank approximations. Low-rank matrix approximation with a weighted Frobenius norm is defined as follows:
$$\min_{\operatorname{rank}(B) \le k} \sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} \, (a_{ij} - b_{ij})^2,$$
where the weights $w_{ij} \in \{0, 1\}$ indicate observed entries in the collaborative filtering setting, or they can be arbitrary positive values. In both cases, the problem has been shown to be NP-hard, even for $k = 1$.
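To illustrate the objective (not to solve the NP-hard problem), the sketch below evaluates the weighted loss for a rank-$k$ candidate produced by a simple heuristic: zero-fill the unobserved entries and truncate the SVD. The mask construction and all names here are illustrative assumptions, and the heuristic gives only a baseline, not a minimizer.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, k = 20, 15, 3

# Ground-truth low-rank matrix and a binary observation mask W
# (collaborative-filtering style: w_ij = 1 if entry (i, j) is observed)
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))
W = (rng.random((m, n)) < 0.5).astype(float)

def weighted_loss(A, B, W):
    """Weighted squared Frobenius loss: sum_ij w_ij * (a_ij - b_ij)^2."""
    return np.sum(W * (A - B) ** 2)

def rank_k_truncation(M, k):
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Heuristic candidate: zero-fill missing entries, then truncate the SVD.
# This is NOT guaranteed to minimize the weighted loss.
B = rank_k_truncation(W * A, k)
print("weighted loss of zero-fill + truncated SVD:", weighted_loss(A, B, W))
```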