Prepared a draft for my presentation:
http://ihome.ust.hk/~cclee/document/PRESENTSVD.pdf
The proof in the original text is too short and seemingly incomplete to me, so I modified the proof and filled in many details. Let me first describe terminology used in this notes, and let me for the moment describe everything in R instead of C. Let Rm×n denote the collection of m×n dimensional real matrices. For x∈Rk, write x=(x1,x2,…,xk) (i.e., xk is the kth coordinate of x), define ‖x‖2=√∑ki=1|xi|2.
We also define the (n−1)-sphere to be Sn−1:={x∈Rn:(x1)2+⋯+(xn)2=1}. For A∈Rm×n we define the operator norm
‖A‖2=supx∈Rn,‖x‖2=1‖Ax‖2.
We assume, and what we are trying to prove in my draft, that ASn−1={Ax:x∈Sn−1} is a ``hyperellipse" (or just an ellipse when n≤3). Suppose also that m≥n and A is of full rank, then it is legitimate to assume there are unit vectors u1,…,un which point in the direction of semi-axises of the hyperellipse.
We can describe a general construction for ui's: first find a vector in ASn−1 that has largest length, call it a∈Rm, then find another vector b in ASn−1∩(Ra)⊥ that has largest length, then find another such vector in ASn−1∩(Ra)⊥∩(Rb)⊥ and continue the process until we find n such vectors. This is indeed how we find semi-axises of an ellipse in R3 and R2.
Now in the way above, the length of semi-axis in the direction of ui, σi, is in nonincreasing order, i.e., σ1≥σ2≥⋯≥σn>0. Take the unit vector vi∈A−1(σiui), one gets A[v1⋯vn]⏟:=V=[u1⋯un][σ1⋱σn].
Let un+1,…,um be orthonormal basis in (spanR{u1,…,un})⊥, then RHS of the above equation becomes
[u1⋯um]⏟:=U[σ1⋱σnO]⏟:=Σ,
where O denotes a matrix with only 0 entries. One can rewrite the above as: AV=UΣ. It will be proved in my draft that indeed {vi} is orthonormal, hence we can conclude both V,U are unitary, and we arrive to the expression A=UΣV∗. Owing to this decomposition those ui's are called left-singular vector, vi's are called right-singular vector and ``diagonal" elements (i.e., if Σ=(dij), diagonal elements are dii's) are called singular values. These basically are all the motivation of the general result in my draft.
http://ihome.ust.hk/~cclee/document/PRESENTSVD.pdf
The proof in the original text is too short and seemingly incomplete to me, so I modified the proof and filled in many details. Let me first describe terminology used in this notes, and let me for the moment describe everything in R instead of C. Let Rm×n denote the collection of m×n dimensional real matrices. For x∈Rk, write x=(x1,x2,…,xk) (i.e., xk is the kth coordinate of x), define ‖x‖2=√∑ki=1|xi|2.
We also define the (n−1)-sphere to be Sn−1:={x∈Rn:(x1)2+⋯+(xn)2=1}. For A∈Rm×n we define the operator norm
‖A‖2=supx∈Rn,‖x‖2=1‖Ax‖2.
We assume, and what we are trying to prove in my draft, that ASn−1={Ax:x∈Sn−1} is a ``hyperellipse" (or just an ellipse when n≤3). Suppose also that m≥n and A is of full rank, then it is legitimate to assume there are unit vectors u1,…,un which point in the direction of semi-axises of the hyperellipse.
We can describe a general construction for ui's: first find a vector in ASn−1 that has largest length, call it a∈Rm, then find another vector b in ASn−1∩(Ra)⊥ that has largest length, then find another such vector in ASn−1∩(Ra)⊥∩(Rb)⊥ and continue the process until we find n such vectors. This is indeed how we find semi-axises of an ellipse in R3 and R2.
Now in the way above, the length of semi-axis in the direction of ui, σi, is in nonincreasing order, i.e., σ1≥σ2≥⋯≥σn>0. Take the unit vector vi∈A−1(σiui), one gets A[v1⋯vn]⏟:=V=[u1⋯un][σ1⋱σn].
Let un+1,…,um be orthonormal basis in (spanR{u1,…,un})⊥, then RHS of the above equation becomes
[u1⋯um]⏟:=U[σ1⋱σnO]⏟:=Σ,
where O denotes a matrix with only 0 entries. One can rewrite the above as: AV=UΣ. It will be proved in my draft that indeed {vi} is orthonormal, hence we can conclude both V,U are unitary, and we arrive to the expression A=UΣV∗. Owing to this decomposition those ui's are called left-singular vector, vi's are called right-singular vector and ``diagonal" elements (i.e., if Σ=(dij), diagonal elements are dii's) are called singular values. These basically are all the motivation of the general result in my draft.
No comments:
Post a Comment