Rank 1 Matrix Approximation

Outer Product Models

The outer product model is the simplest matrix model that couples the entries within each row and each column. Given a matrix $A \in \mathbb{R}^{n \times m}$, we would like to approximate it with the outer product of two vectors $u \in \mathbb{R}^n$ and $v \in \mathbb{R}^m$, such that the squared error is minimized:

$$\ell(u, v) = \frac{1}{2} \left\| \Pi_\Omega\!\left(A - uv^T\right) \right\|_F^2,$$

where

$$\Pi_\Omega(R) = \Omega \circ R,$$

with $\Omega$ the binary indicator matrix of observed entries and $\circ$ the element-wise (Hadamard) product. We note that $uv^T$ is a rank-1 matrix.
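For concreteness, here is a minimal NumPy sketch of this masked objective; the sizes, the random mask, and the name `masked_loss` are illustrative choices rather than anything fixed by the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 4
A = rng.standard_normal((n, m))
Omega = (rng.random((n, m)) < 0.7).astype(float)  # binary mask of observed entries
u = rng.standard_normal(n)
v = rng.standard_normal(m)

def masked_loss(A, Omega, u, v):
    """l(u, v) = 1/2 * ||Pi_Omega(A - u v^T)||_F^2."""
    R = Omega * (A - np.outer(u, v))  # Pi_Omega zeroes out unobserved entries
    return 0.5 * np.sum(R ** 2)

print(masked_loss(A, Omega, u, v))
```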

Gradients

If $A$ is fully observed, the objective becomes

$$\ell(u, v) = \frac{1}{2} \left\| A - uv^T \right\|_F^2.$$

Taking partial derivatives with respect to $u$ and $v$, and letting $R = uv^T - A$, we have

$$\nabla_u \ell = Rv, \qquad \nabla_v \ell = R^T u.$$
Derivation

Using the chain rule,

$$\nabla_u \ell = \frac{\partial R}{\partial u} \cdot \frac{\partial}{\partial R} \frac{1}{2}\|R\|_F^2 = \sum_{ij} \frac{\partial r_{ij}}{\partial u} \, \frac{\partial}{\partial r_{ij}} \frac{1}{2}\|R\|_F^2.$$

The derivative of the Frobenius norm is

$$\frac{\partial}{\partial r_{ij}} \frac{1}{2}\|R\|_F^2 = r_{ij}.$$

Also,

$$\frac{\partial r_{ij}}{\partial u} = \frac{\partial \left(u_i v_j - a_{ij}\right)}{\partial u} = \mathrm{vec}_i^n[v_j],$$

where

$$\mathrm{vec}_i^n[v_j] := [\,0, \ldots, \underbrace{v_j}_{i\text{-th dimension}}, \ldots, 0\,]^T \in \mathbb{R}^n.$$

Therefore,

$$\nabla_u \ell = \sum_{ij} r_{ij} \, \mathrm{vec}_i^n[v_j] = \sum_i \mathrm{vec}_i^n\Big[\sum_j r_{ij} v_j\Big] = Rv.$$

Similarly, we have

$$\nabla_v \ell = R^T u.$$
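These gradient formulas can be sanity-checked with finite differences; the following is a sketch with arbitrary sizes, seed, and step size.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 4
A = rng.standard_normal((n, m))
u = rng.standard_normal(n)
v = rng.standard_normal(m)

R = np.outer(u, v) - A                 # R = u v^T - A
grad_u, grad_v = R @ v, R.T @ u        # analytic gradients derived above

def loss(u, v):
    return 0.5 * np.linalg.norm(np.outer(u, v) - A, "fro") ** 2

# Forward differences along each coordinate direction.
eps = 1e-6
num_grad_u = np.array([(loss(u + eps * np.eye(n)[i], v) - loss(u, v)) / eps
                       for i in range(n)])
num_grad_v = np.array([(loss(u, v + eps * np.eye(m)[j]) - loss(u, v)) / eps
                       for j in range(m)])
print(np.allclose(grad_u, num_grad_u, atol=1e-4),
      np.allclose(grad_v, num_grad_v, atol=1e-4))  # True True
```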

Convexity

Given the gradients, the Hessian matrix follows directly:

$$H(\ell(u,v)) = \begin{bmatrix} v^Tv \, I_{n \times n} & 2uv^T - A \\ \left(2uv^T - A\right)^T & u^Tu \, I_{m \times m} \end{bmatrix}.$$
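As a quick check of this block structure, one can compare the assembled Hessian against a finite-difference Hessian of the loss over the stacked variable $x = (u, v)$; the stacking, sizes, and tolerances below are incidental choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 4, 3
A = rng.standard_normal((n, m))
u = rng.standard_normal(n)
v = rng.standard_normal(m)

# Analytic Hessian, assembled block by block from the formula above.
C = 2 * np.outer(u, v) - A
H = np.block([[(v @ v) * np.eye(n), C],
              [C.T, (u @ u) * np.eye(m)]])

# Finite-difference Hessian: differentiate the gradient numerically.
def grad(x):
    u, v = x[:n], x[n:]
    R = np.outer(u, v) - A
    return np.concatenate([R @ v, R.T @ u])

x = np.concatenate([u, v])
eps = 1e-6
H_num = np.stack([(grad(x + eps * np.eye(n + m)[i]) - grad(x)) / eps
                  for i in range(n + m)])
print(np.allclose(H, H_num, atol=1e-4))  # True
```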

The Hessian matrix at the origin is

$$H(\ell(0,0)) = \begin{bmatrix} 0_{n \times n} & -A \\ -A^T & 0_{m \times m} \end{bmatrix},$$

which is not positive semi-definite unless $A$ is the zero matrix: the eigenvalues of this block matrix are $\pm\sigma_i(A)$, the signed singular values of $A$ (padded with zeros when $n \neq m$). Therefore the problem is non-convex for all dimensions $m, n \geq 1$.
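A short numerical sketch of this eigenvalue picture (sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 4, 3
A = rng.standard_normal((n, m))

# Hessian at the origin: zero diagonal blocks, -A off-diagonal.
H0 = np.block([[np.zeros((n, n)), -A],
               [-A.T, np.zeros((m, m))]])
eigs = np.linalg.eigvalsh(H0)
sigmas = np.linalg.svd(A, compute_uv=False)
print(eigs)    # contains -sigma_i(A) < 0, so H0 is indefinite
print(sigmas)
```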

Solving with Lagrange Multiplier

Using the property that

$$\|R\|_F^2 = \operatorname{tr}\!\left(R^T R\right),$$

we can rewrite the objective using properties of the trace (invariance under transposition and cyclic permutation):

$$\begin{aligned}
\ell(u,v) &= \frac{1}{2}\|R\|_F^2 = \frac{1}{2}\operatorname{tr}\!\left(R^T R\right) \\
&= \frac{1}{2}\operatorname{tr}\!\left(vu^Tuv^T - A^Tuv^T - vu^TA + A^TA\right) \\
&= \frac{1}{2}\operatorname{tr}\!\left(vu^Tuv^T\right) - \operatorname{tr}\!\left(vu^TA\right) + \text{const} \\
&= \frac{1}{2}\operatorname{tr}\!\left(u^Tu \, v^Tv\right) - \operatorname{tr}\!\left(u^TAv\right) + \text{const} \\
&= \frac{1}{2}\|u\|^2\|v\|^2 - u^TAv + \text{const}.
\end{aligned}$$
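A one-line numeric check of the final identity, with the constant made explicit as $\frac{1}{2}\|A\|_F^2$ (sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 4))
u, v = rng.standard_normal(5), rng.standard_normal(4)

lhs = 0.5 * np.linalg.norm(A - np.outer(u, v), "fro") ** 2
rhs = (0.5 * (u @ u) * (v @ v) - u @ A @ v
       + 0.5 * np.linalg.norm(A, "fro") ** 2)  # const = 1/2 ||A||_F^2
print(np.isclose(lhs, rhs))  # True
```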

We can solve for unit-norm $u$ and $v$, and then rescale:

$$\operatorname*{arg\,max}_{u, v} \; u^TAv, \quad \text{s.t. } \|u\| = \|v\| = 1.$$

Using Lagrange multipliers,

$$\mathcal{L} = u^TAv - \lambda u^Tu - \mu v^Tv.$$

Taking partial derivatives,

$$\frac{\partial \mathcal{L}}{\partial u} = Av - 2\lambda u = 0 \quad \Longrightarrow \quad u = \frac{Av}{\|Av\|}.$$

Similarly,

$$v = \frac{A^Tu}{\|A^Tu\|}.$$

Therefore, substituting each condition into the other,

$$u \propto AA^Tu, \qquad v \propto A^TAv,$$

which implies $u$ should be proportional to the principal eigenvector of $AA^T$, and $v$ to the principal eigenvector of $A^TA$; equivalently, $u$ and $v$ are the leading left and right singular vectors of $A$, which the SVD computes directly.
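The following sketch runs both routes: the alternating updates $u = Av/\|Av\|$, $v = A^Tu/\|A^Tu\|$ from above (which amount to power iteration on $AA^T$), and the direct SVD; the iteration count and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 4))

# Alternating updates u = Av/||Av||, v = A^T u/||A^T u||.
v = rng.standard_normal(4)
v /= np.linalg.norm(v)
for _ in range(200):
    u = A @ v
    u /= np.linalg.norm(u)
    v = A.T @ u
    v /= np.linalg.norm(v)

# Direct route: leading singular vectors from the SVD.
U, S, Vt = np.linalg.svd(A)
u_svd, v_svd = U[:, 0], Vt[0]

# Both give the same best rank-1 approximation sigma_1 * u v^T
# (u, v match the singular vectors up to a common sign flip,
#  and u^T A v converges to sigma_1, the optimal rescaling).
err = np.linalg.norm(S[0] * np.outer(u_svd, v_svd)
                     - (u @ A @ v) * np.outer(u, v))
print(err)  # ~0
```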