
Friday, May 16, 2014

Polar Decomposition of Matrices with an Application in Deriving SVD (to be added in my linear algebra notes)

In this post I want to record a standard result in linear algebra that I hadn't paid attention to before---the polar decomposition. An immediate consequence that I came up with is the SVD theorem, which I prove as two consequences: one for square matrices and one for general matrices.

In what follows we say that a matrix $A$ is positive if $x^*Ax$ is always nonnegative, and strictly positive if $(x,y)\mapsto y^*Ax$ forms an inner product.

Digression. When we say that an operator $T:H\to H$ on a complex Hilbert space is positive: $\langle Tx,x\rangle\ge 0$ for every $x$, it is necessarily (i.e., can be proved to be) self-adjoint! The key is that we can always recover the value $\langle Tx,y\rangle$ by just knowing $\langle Tx,x\rangle$ for every $x\in H$, while this may not happen in real Hilbert spaces.
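
The mechanism behind this recovery (spelled out here for completeness; it is only implicit above) is the complex polarization identity
$$\langle Tx,y\rangle=\frac{1}{4}\sum_{k=0}^{3} i^{k}\big\langle T(x+i^{k}y),\,x+i^{k}y\big\rangle.$$
Positivity makes every term $\langle Tu,u\rangle$ real, and applying the identity to $\langle x,Ty\rangle=\overline{\langle Ty,x\rangle}$ as well shows $\langle Tx,y\rangle=\langle x,Ty\rangle$, i.e., $T=T^*$. Over $\mathbb{R}$ the quadratic form $\langle Tx,x\rangle$ only determines the symmetric part of $T$: rotation of $\mathbb{R}^2$ by $90^\circ$ satisfies $\langle Tx,x\rangle=0\ge 0$ for all $x$ yet is not symmetric.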

Theorem (Polar Decomposition). Let $A$ be an $n\times n$ matrix over $F=\mathbb{R}$ or $\mathbb{C}$, then there is a unitary (orthogonal when $F=\mathbb{R}$) matrix $U$ and a positive matrix $P$ such that $A=UP$, where $P=\sqrt{A^*A}$, the unique positive square root of the positive matrix $A^*A$.

Proof. First we observe that $$\|\sqrt{A^*A}\,x\|^2=\langle\sqrt{A^*A}\,x,\sqrt{A^*A}\,x\rangle=\langle x,A^*Ax\rangle=\langle Ax,Ax\rangle=\|Ax\|^2,$$ therefore $\|\sqrt{A^*A}\,x\|=\|Ax\|$. Now we define an isometry $$S_1:\operatorname{range}\sqrt{A^*A}\to\operatorname{range}A;\qquad \sqrt{A^*A}\,x\mapsto Ax;$$ this map is automatically well-defined since it preserves norm. Namely, if $\sqrt{A^*A}\,x=\sqrt{A^*A}\,y$, then $\|\sqrt{A^*A}\,(x-y)\|=\|A(x-y)\|=0$, so $Ax=Ay$.

Now if we can extend $S_1$ to a unitary operator $S$ on $F^n$, then we will have, for every $x\in F^n$, $S(\sqrt{A^*A}\,x)=S_1(\sqrt{A^*A}\,x)=Ax$, and we will be done. Note that since $S_1$ is injective, $\dim\operatorname{range}\sqrt{A^*A}=\dim\operatorname{range}S_1$ necessarily. Recall that extending a unitary $T:U\to V$ (with $\dim U=\dim V$) "unitarily" to the whole vector space is the same as defining a unitary operator $T':U^\perp\to V^\perp$ and taking $T\oplus T'$ as the extension of $T$, which will be easily done in the next paragraph.

We let $\{u_1,\dots,u_k\}$ be an orthonormal basis of $(\operatorname{range}\sqrt{A^*A})^\perp$ and $\{v_1,\dots,v_k\}$ be that of $(\operatorname{range}S_1)^\perp$. We can then define $$S_2:(\operatorname{range}\sqrt{A^*A})^\perp\to(\operatorname{range}S_1)^\perp;\qquad a_1u_1+\cdots+a_ku_k\mapsto a_1v_1+\cdots+a_kv_k.$$ Finally we define $S=S_1\oplus S_2$. It is a direct verification that $S$ preserves length: for $u\in\operatorname{dom}S_1$ and $v\in\operatorname{dom}S_2$, we have $\|S(u+v)\|^2=\|S_1u+S_2v\|^2=\|S_1u\|^2+\|S_2v\|^2$ (Pythagoras, since $S_1u\in\operatorname{range}S_1$ and $S_2v\in(\operatorname{range}S_1)^\perp$). Since $u=\sqrt{A^*A}\,x$ for some $x\in F^n$ and $\|S_2v\|=\|v\|$, we have $$\|S(u+v)\|^2=\|S_1(\sqrt{A^*A}\,x)\|^2+\|v\|^2=\|Ax\|^2+\|v\|^2=\|\sqrt{A^*A}\,x\|^2+\|v\|^2=\|u\|^2+\|v\|^2=\|u+v\|^2,$$ as desired. Since preserving length is the same as preserving inner product, $S$ must be unitary.
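
(A numerical aside of mine, not part of the original proof.) The construction can be sanity-checked in a few lines of Python/NumPy: build $P=\sqrt{A^*A}$ from an eigendecomposition of $A^*A$ and, assuming $A$ is invertible so that $P$ is too, recover the unitary factor as $U=AP^{-1}$:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 4
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

    # P = sqrt(A^*A): diagonalize the positive matrix A^*A and take
    # square roots of its (nonnegative) eigenvalues.
    w, W = np.linalg.eigh(A.conj().T @ A)
    P = W @ np.diag(np.sqrt(np.clip(w, 0, None))) @ W.conj().T

    # Assuming A is invertible (almost surely true for a random A),
    # the unitary factor is forced: U = A P^{-1}.
    U = A @ np.linalg.inv(P)

    print(np.allclose(U.conj().T @ U, np.eye(n)))  # U is unitary
    print(np.allclose(U @ P, A))                   # A = UP

For singular $A$ one would instead extend the isometry $S_1$ exactly as in the proof above.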

Remark. A similar statement holds for every bounded linear operator $T:H\to H$ on a Hilbert space $H$, with the unitary $U$ here replaced by a partial isometry.

Consequence 1. For every $n\times n$ matrix $A$, the matrix $\sqrt{A^*A}$ is often called the positive part of $A$. Since this is analogous to the decomposition $z=e^{i\theta}|z|$ of a complex number, it is also common to denote $|A|=\sqrt{A^*A}$; therefore for every matrix $A$, there is a unitary matrix $U$ such that $A=U|A|$. With this notation, it can be easily checked that a matrix is normal if and only if $|A|=|A^*|$ (set $A=UP$, say).
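
To spell out the hinted computation (a detail added here): set $A=UP$ with $U$ unitary and $P=|A|$. Then
$$A^*A=PU^*UP=P^2,\qquad AA^*=UP^2U^*=(UPU^*)^2,$$
so by the uniqueness of positive square roots, $|A|=P$ and $|A^*|=\sqrt{AA^*}=UPU^*$. Hence $A$ is normal iff $P^2=(UPU^*)^2$, iff (uniqueness again) $P=UPU^*$, iff $|A|=|A^*|$.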

Consequence 2 (Singular Value Decomposition---SVD---of Square Matrices). The baby version of the SVD can also be obtained from the Polar Decomposition. Let $A$ be any $n\times n$ matrix over $F$; then $A=S|A|$ for some unitary (orthogonal when $F=\mathbb{R}$) matrix $S$. We may assume $A\neq 0$. Since $|A|$ is positive, by the Spectral Theorem there is a unitary (orthogonal when $F=\mathbb{R}$) matrix $V$ and a diagonal $\Sigma$ such that $|A|=V\Sigma V^*$, hence $$A=S[V\Sigma V^*]=(SV)\Sigma V^*=U\Sigma V^*,$$ where $U=SV$ is unitary (orthogonal when $F=\mathbb{R}$). If we further arrange the columns of $V$ so that $\Sigma=\operatorname{diag}(\sigma_1,\dots,\sigma_n)$ with $\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_n\ge 0$, then the decomposition above is exactly the baby version of the SVD (the general version, for matrices of arbitrary dimensions, is derived in Consequence 3).
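
(Again an illustration of mine, under the same invertibility assumption as before.) In the real case the recipe reads: take the polar decomposition $A=S|A|$, diagonalize $|A|$ with eigenvalues sorted decreasingly, and absorb $S$ into $U=SV$:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 4
    A = rng.standard_normal((n, n))  # F = R, so "unitary" means orthogonal

    # Polar decomposition A = S|A| (assumes A invertible).
    w, W = np.linalg.eigh(A.T @ A)
    absA = W @ np.diag(np.sqrt(np.clip(w, 0, None))) @ W.T
    S = A @ np.linalg.inv(absA)

    # Spectral theorem: |A| = V Sigma V^T, columns arranged so that
    # sigma_1 >= sigma_2 >= ... >= sigma_n >= 0.
    sig, V = np.linalg.eigh(absA)
    order = np.argsort(sig)[::-1]
    sig, V = sig[order], V[:, order]

    U = S @ V                                      # U = SV is orthogonal
    print(np.allclose(U @ np.diag(sig) @ V.T, A))  # A = U Sigma V^T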

Consequence 3 (SVD of General Matrices). The work below will be very messy. In order not to insert "(orthogonal when $F=\mathbb{R}$)" every time, let me use the term $F$-unitary, with the obvious meaning.

Now suppose that $A$ is an $m\times n$ matrix; w.l.o.g. let's assume $m>n$ (otherwise consider $A^T$). Extend $A$ to a square matrix by adjoining to it an $m\times(m-n)$ zero matrix $O$; then by Consequence 2 there are $m\times m$ $F$-unitary matrices $U$ and $V$ such that $$[A\ \ O]=U\Sigma V^*,\tag{1}$$ where $\Sigma=\operatorname{diag}(\sigma_1,\dots,\sigma_m)$ with $\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_m\ge 0$.

By our definition, for every $i>n$, $$0=U\Sigma V^*e_i=U\Big(\sum_{k=1}^{m}\sigma_k\,v_k^*(e_i)\,e_k\Big),$$ where $v_k$ is the $k$-th column of $V$, so $v_k^*(e_i)$ is the conjugate of the $i$-th coordinate of $v_k$; since $U$ is invertible, we have $\sigma_k v_k^*(e_i)=0$ for each $k=1,2,\dots,m$ and for each $i>n$. In particular, if $\operatorname{Rank}A=r$, then $\operatorname{Rank}[A\ \ O]=r$ as well, so exactly $r$ of the decreasing $\sigma_k$'s are nonzero, i.e., $\sigma_{r+1}=0$ (and then $\sigma_i=0$ for every $i>r$), and we then have $v_k^*(e_i)=0$ for every $k=1,2,\dots,r$ and $i>n$. In other words, $$v_1,\dots,v_r\in F^n\times\{0_{m-n}\}.\tag{2}$$
Now multiplying both sides of $(1)$ on the right by $\begin{bmatrix}I_{n\times n}\\ O\end{bmatrix}$, we have $$A=U\Sigma V^*_{m\times n},$$ where $V^*_{m\times n}$ is the left $m\times n$ part of the matrix $V^*$. Denote $D=\operatorname{diag}(\sigma_1,\dots,\sigma_n)$---the upper left $n\times n$ part of $\Sigma$ (recall $\sigma_i=0$ for $i>r$, and $r\le n$)---then $$A=U\begin{bmatrix}D&O_{n,m-n}\\ O_{m-n,n}&O_{m-n,m-n}\end{bmatrix}V^*_{m\times n}=U\begin{bmatrix}DV^*_{n\times n}\\ O_{m-n,n}\end{bmatrix}=U\begin{bmatrix}D\\ O\end{bmatrix}V^*_{n\times n},$$ where $V^*_{n\times n}$ denotes the upper left $n\times n$ part of $V^*$ (equivalently, $(V_{n\times n})^*$ for $V_{n\times n}$ the upper left $n\times n$ part of $V$).

We are almost there, except for the fact that $V_{n\times n}$ is not necessarily $F$-unitary. But we can always modify the last few columns of $V$, if necessary, to achieve this. For this, let's study the term $DV^*_{n\times n}$.

Since $\sigma_i=0$ for $i>r$, we have $$DV^*_{n\times n}=\big[\,\sigma_1v'_1\ \ \cdots\ \ \sigma_rv'_r\ \ 0\cdot v'_{r+1}\ \ \cdots\ \ 0\cdot v'_n\,\big]^*,$$
where $v'$ means the first $n$ coordinates of $v\in F^m$. We may replace $v_{r+1},\dots,v_n\in F^m$ freely so that $v_1,\dots,v_n$ form an orthonormal basis of $F^n\times\{0_{m-n}\}$ (recall $(2)$; the replaced columns are multiplied by $0$, so $DV^*_{n\times n}$ is unchanged); thus we may assume $V_{n\times n}$ itself is $F$-unitary, and we are done.
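
(A final illustrative check of mine, with numpy.linalg.svd standing in for Consequence 2, since the square SVD it returns already has unitary factors.) Pad a tall $A$ with zero columns, take the square SVD, and read off $A=U\begin{bmatrix}D\\ O\end{bmatrix}V^*_{n\times n}$; a random $A$ has full rank $r=n$, so by $(2)$ the truncated $V_{n\times n}$ is already $F$-unitary without any column replacement:

    import numpy as np

    rng = np.random.default_rng(2)
    m, n = 6, 3
    A = rng.standard_normal((m, n))

    # Pad A with an m x (m-n) zero block to make it square.
    padded = np.hstack([A, np.zeros((m, m - n))])

    # Square SVD of the padded matrix: [A O] = U Sigma V^*.
    U, sigma, Vh = np.linalg.svd(padded)

    # A = U [D; O] V_{nxn}^*, with D the upper-left n x n part of Sigma
    # and V_{nxn}^* the upper-left n x n part of V^*.
    D = np.diag(sigma[:n])
    Vh_nn = Vh[:n, :n]
    A_rebuilt = U @ np.vstack([D, np.zeros((m - n, n))]) @ Vh_nn

    print(np.allclose(A_rebuilt, A))                # A recovered
    print(np.allclose(Vh_nn @ Vh_nn.T, np.eye(n)))  # V_{nxn} is F-unitary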
