In this post I want to record a standard result in linear algebra to which I hadn't paid attention before: the polar decomposition. An immediate consequence is the singular value decomposition (SVD), which I derive below in two steps, one for square matrices and one for general matrices.
In what follows we say that a matrix $A$ is positive if $x\cdot Ax$ is always nonnegative, and strictly positive if $(x,y)\mapsto x\cdot Ay$ forms an inner product.
Digression. When we say that an operator $T:H\to H$ on a complex Hilbert space is positive, i.e., $\langle Tx,x\rangle\ge 0$ for every $x$, it is necessarily (i.e., can be proved to be) self-adjoint! The key is that we can always recover the value $\langle Tx,y\rangle$ just from knowing $\langle Tx,x\rangle$ for every $x\in H$, while this fails in real Hilbert spaces (e.g., $A=\begin{pmatrix}0&1\\-1&0\end{pmatrix}$ on $\mathbb{R}^2$ satisfies $x\cdot Ax=0$ for every $x$, yet $A$ is not symmetric).
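Concretely, the recovery is the polarization identity: with the inner product linear in the first slot, for any operator $T$ one has
$$\langle Tx,y\rangle=\frac{1}{4}\sum_{k=0}^{3} i^{k}\,\big\langle T(x+i^{k}y),\,x+i^{k}y\big\rangle,$$
so if $\langle Tx,x\rangle$ is real for every $x$, then the quadratic forms of $T$ and $T^*$ agree, and the identity forces $T=T^*$.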
Theorem (Polar Decomposition). Let $A$ be an $n\times n$ matrix over $F=\mathbb{R}$ or $\mathbb{C}$. Then there are a unitary (orthogonal when $F=\mathbb{R}$) matrix $U$ and a positive matrix $P$ such that $A=UP$, where $P=\sqrt{A^*A}$ is the unique positive square root of the positive matrix $A^*A$.
Proof. First we observe that
$$\|\sqrt{A^*A}\,x\|^2=\langle \sqrt{A^*A}\,x,\sqrt{A^*A}\,x\rangle=\langle x,A^*Ax\rangle=\langle Ax,Ax\rangle=\|Ax\|^2,$$
where the second equality uses the self-adjointness of $\sqrt{A^*A}$; therefore $\|\sqrt{A^*A}\,x\|=\|Ax\|$. Now we define an isometry
$$S_1:\operatorname{range}\sqrt{A^*A}\to\operatorname{range}A;\quad \sqrt{A^*A}\,x\mapsto Ax.$$
This map is well-defined precisely because it preserves norms: if $\sqrt{A^*A}\,x=\sqrt{A^*A}\,y$, then $\|A(x-y)\|=\|\sqrt{A^*A}\,(x-y)\|=0$, so $Ax=Ay$.
Now if we can extend $S_1$ to a unitary operator $S$ on $F^n$, then for every $x\in F^n$ we will have $S(\sqrt{A^*A}\,x)=S_1(\sqrt{A^*A}\,x)=Ax$, and we will be done. Note that since $S_1$ is injective, necessarily $\dim\operatorname{range}\sqrt{A^*A}=\dim\operatorname{range}S_1$. Recall that extending an isometry $T:U\to V$ between subspaces (with $\dim U=\dim V$) "unitarily" to the whole space is the same as defining a unitary operator $T':U^\perp\to V^\perp$ and taking $T\oplus T'$ as the extension, which is easily done in the next paragraph.
We let $\{u_1,\dots,u_k\}$ be an orthonormal basis of $(\operatorname{range}\sqrt{A^*A})^\perp$ and $\{v_1,\dots,v_k\}$ be one of $(\operatorname{range}S_1)^\perp$, and define
$$S_2:(\operatorname{range}\sqrt{A^*A})^\perp\to(\operatorname{range}S_1)^\perp;\quad a_1u_1+\cdots+a_ku_k\mapsto a_1v_1+\cdots+a_kv_k.$$
Finally we define $S=S_1\oplus S_2$. It is a direct verification that $S$ preserves length: for $u\in\operatorname{dom}S_1$ and $v\in\operatorname{dom}S_2$, we have
$$\|S(u+v)\|^2=\|S_1u+S_2v\|^2=\|S_1u\|^2+\|S_2v\|^2,$$
since $S_1u\in\operatorname{range}S_1$ and $S_2v\in(\operatorname{range}S_1)^\perp$ are orthogonal. Writing $u=\sqrt{A^*A}\,x$ for some $x\in F^n$ and using $\|S_2v\|=\|v\|$, we get
$$\|S(u+v)\|^2=\|S_1(\sqrt{A^*A}\,x)\|^2+\|v\|^2=\|Ax\|^2+\|v\|^2=\|\sqrt{A^*A}\,x\|^2+\|v\|^2=\|u\|^2+\|v\|^2=\|u+v\|^2,$$
as desired. Since preserving length is the same as preserving the inner product, $S$ must be unitary. ◼
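As a quick numerical sanity check, here is a minimal NumPy sketch of the decomposition in the generic case where $A$ is invertible (so the extension step is vacuous and $U=AP^{-1}$); the random matrix and its size are just placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# P = sqrt(A*A): the unique positive square root, via the spectral theorem
w, V = np.linalg.eigh(A.conj().T @ A)            # A*A is Hermitian positive
P = (V * np.sqrt(np.clip(w, 0, None))) @ V.conj().T

# a random complex Gaussian matrix is almost surely invertible,
# so here U is simply A P^{-1}; in general U comes from extending S_1
U = A @ np.linalg.inv(P)

assert np.allclose(U.conj().T @ U, np.eye(n))    # U is unitary
assert np.allclose(U @ P, A)                     # A = U P
```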
Remark. A similar statement holds for every bounded linear operator $T:H\to H$ on a Hilbert space $H$, with the unitary $U$ here replaced by a partial isometry.
Consequence 1. For every $n\times n$ matrix $A$, the matrix $\sqrt{A^*A}$ is often called the positive part of $A$. Since this is analogous to the polar decomposition $z=e^{i\theta}|z|$ of a complex number, it is also common to write $|A|=\sqrt{A^*A}$; thus for every matrix $A$ there is a unitary matrix $U$ such that $A=U|A|$. With this notation, it can be easily checked that a matrix $A$ is normal if and only if $|A|=|A^*|$ (set $A=UP$, say). ◼
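For the record, here is the check made explicit. Writing $A=U|A|$ with $U$ unitary,
$$A^*A=|A|^2,\qquad AA^*=U|A|^2U^*=(U|A|U^*)^2,$$
and $U|A|U^*$ is positive, so $|A^*|=\sqrt{AA^*}=U|A|U^*$. Hence $A$ is normal iff $|A|^2=U|A|^2U^*=|A^*|^2$, which by the uniqueness of positive square roots happens iff $|A|=|A^*|$.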
Consequence 2 (Singular Value Decomposition of Square Matrices). The baby version of the SVD can also be obtained from the Polar Decomposition. Let $A$ be any $n\times n$ matrix over $F$, so that $A=S|A|$ for some unitary (orthogonal when $F=\mathbb{R}$) matrix $S$. Since $|A|$ is positive, by the Spectral Theorem there are a unitary (orthogonal when $F=\mathbb{R}$) matrix $V$ and a diagonal matrix $\Sigma$ with nonnegative entries such that $|A|=V\Sigma V^*$, hence
$$A=S[V\Sigma V^*]=(SV)\Sigma V^*=U\Sigma V^*,$$
where $U=SV$ is unitary (orthogonal when $F=\mathbb{R}$). If we further arrange the columns of $V$ so that $\Sigma=\operatorname{diag}(\sigma_1,\dots,\sigma_n)$ with $\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_n\ge0$, then the decomposition above is exactly the baby version of the SVD (the full version, for matrices of arbitrary dimensions, is derived next). ◼
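The construction translates directly into a few lines of NumPy; as before, this is only a sketch for a generic (invertible) real $A$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))

# polar part |A| = sqrt(A^T A) and the orthogonal factor S of A = S|A|
w, W = np.linalg.eigh(A.T @ A)
absA = (W * np.sqrt(np.clip(w, 0, None))) @ W.T
S = A @ np.linalg.inv(absA)                  # generic case: A invertible

# spectral theorem on |A|; reorder so the eigenvalues decrease
sigma, V = np.linalg.eigh(absA)
order = np.argsort(sigma)[::-1]
sigma, V = sigma[order], V[:, order]

U = S @ V                                    # A = U diag(sigma) V^T
assert np.allclose(U @ np.diag(sigma) @ V.T, A)
assert np.allclose(U.T @ U, np.eye(n))
```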
Consequence 3 (SVD of General Matrices). The work below will be somewhat messy. In order not to insert "(orthogonal when $F=\mathbb{R}$)" every time, let me use the term $F$-unitary, with the obvious meaning.
Now suppose that $A$ is an $m\times n$ matrix; w.l.o.g. let's assume $m>n$ (otherwise consider $A^T$). Extend $A$ to a square matrix by adjoining to it an $m\times(m-n)$ zero matrix $O$; then by Consequence 2 there are $m\times m$ $F$-unitary matrices $U$ and $V$ such that
$$[A\;\;O]=U\Sigma V^*,$$
where $\Sigma=\operatorname{diag}(\sigma_1,\dots,\sigma_m)$ with $\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_m\ge0$.
By construction, for every $i>n$,
$$0=U\Sigma V^*e_i=U\Big(\sum_{k=1}^m \sigma_k\, v_k^*(e_i)\, e_k\Big),$$
where $v_k^*$ denotes the $k$-th row of $V^*$. Since $U$ is invertible, we have $\sigma_k v_k^*(e_i)=0$ for each $k=1,2,\dots,m$ and each $i>n$. In particular, if $\operatorname{Rank}A=r$, then $\operatorname{Rank}\Sigma=\operatorname{Rank}[A\;\;O]=r$, so $\sigma_{r+1}=0$ (and then $\sigma_i=0$ for every $i>r$), and we then have $v_k^*(e_i)=0$ for every $k=1,2,\dots,r$ and $i>n$. In other words, $v_1,\dots,v_r\in F^n\times\{0_{m-n}\}$.
Now multiplying both sides of $[A\;\;O]=U\Sigma V^*$ on the right by $\begin{bmatrix}I_{n\times n}\\O\end{bmatrix}$, we have
$$A=U\Sigma V^*_{m\times n},$$
where $V^*_{m\times n}$ is the left $m\times n$ part of the matrix $V^*$. Denote by $D=\operatorname{diag}(\sigma_1,\dots,\sigma_n)$ the upper left $n\times n$ part of $\Sigma$; then
$$A=U\begin{bmatrix}D&O_{n,m-n}\\O_{m-n,n}&O_{m-n,m-n}\end{bmatrix}V^*_{m\times n}=U\begin{bmatrix}DV^*_{n\times n}\\O_{m-n,n}\end{bmatrix}=U\begin{bmatrix}D\\O\end{bmatrix}V^*_{n\times n},$$
where $V^*_{n\times n}$ denotes the upper left $n\times n$ part of $V^*$.
We are almost there, except that $V^*_{n\times n}$ is not necessarily $F$-unitary. But we can always replace a few columns of $V$, if necessary, to achieve this. For this, let's study the term $DV^*_{n\times n}$.
Since $\sigma_i=0$ for $i>r$, we have
$$DV^*_{n\times n}=\begin{bmatrix}\sigma_1 v_1'^*\\\vdots\\\sigma_r v_r'^*\\0\cdot v_{r+1}'^*\\\vdots\\0\cdot v_n'^*\end{bmatrix},$$
where $v'$ denotes the first $n$ coordinates of $v\in F^m$. We may replace $v_{r+1},\dots,v_n\in F^m$ freely so that $v_1,\dots,v_n$ form an orthonormal basis of $F^n\times\{0_{m-n}\}$ (recall that $v_1,\dots,v_r\in F^n\times\{0_{m-n}\}$) without changing $DV^*_{n\times n}$; thus we may assume $V^*_{n\times n}$ itself is $F$-unitary, and we are done. ◼
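The bookkeeping above can be checked numerically. In the sketch below, np.linalg.svd of the padded square matrix stands in for Consequence 2, and a rank-deficient $A$ is chosen on purpose so that the column-replacement step at the end actually does something:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 5, 3, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank r < n

# pad A to the square matrix [A O] and apply the square SVD to it
B = np.hstack([A, np.zeros((m, m - n))])
U, sigma, Vh = np.linalg.svd(B)            # stand-in for Consequence 2

# A = U [D; O] V*_{n x n}, with D the upper-left n x n part of Sigma
D = np.diag(sigma[:n])
Vnn = Vh[:n, :n].copy()                    # upper-left n x n part of V*

# Vnn need not be unitary yet: replace its bottom n - r rows by an
# orthonormal completion of the top r rows (QR gives one such completion)
Q, _ = np.linalg.qr(Vnn[:r].T, mode="complete")
Vnn[r:] = Q[:, r:].T                       # these rows hit sigma_i = 0 anyway

assert np.allclose(Vnn @ Vnn.T, np.eye(n))                     # now F-unitary
assert np.allclose(U @ np.vstack([D, np.zeros((m - n, n))]) @ Vnn, A)
```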