\( \newcommand{\N}{\mathbb{N}} \newcommand{\R}{\mathbb{R}} \newcommand{\C}{\mathbb{C}} \newcommand{\Q}{\mathbb{Q}} \newcommand{\Z}{\mathbb{Z}} \newcommand{\P}{\mathcal P} \newcommand{\B}{\mathcal B} \newcommand{\F}{\mathbb{F}} \newcommand{\E}{\mathcal E} \newcommand{\brac}[1]{\left(#1\right)} \newcommand{\abs}[1]{\left|#1\right|} \newcommand{\matrixx}[1]{\begin{bmatrix}#1\end {bmatrix}} \newcommand{\vmatrixx}[1]{\begin{vmatrix} #1\end{vmatrix}} \newcommand{\lims}{\mathop{\overline{\lim}}} \newcommand{\limi}{\mathop{\underline{\lim}}} \newcommand{\limn}{\lim_{n\to\infty}} \newcommand{\limsn}{\lims_{n\to\infty}} \newcommand{\limin}{\limi_{n\to\infty}} \newcommand{\nul}{\mathop{\mathrm{Nul}}} \newcommand{\col}{\mathop{\mathrm{Col}}} \newcommand{\rank}{\mathop{\mathrm{Rank}}} \newcommand{\dis}{\displaystyle} \newcommand{\spann}{\mathop{\mathrm{span}}} \newcommand{\range}{\mathop{\mathrm{range}}} \newcommand{\inner}[1]{\langle #1 \rangle} \newcommand{\innerr}[1]{\left\langle #1 \right \rangle} \newcommand{\ol}[1]{\overline{#1}} \newcommand{\toto}{\rightrightarrows} \newcommand{\upto}{\nearrow} \newcommand{\downto}{\searrow} \newcommand{\qed}{\quad \blacksquare} \newcommand{\tr}{\mathop{\mathrm{tr}}} \newcommand{\bm}{\boldsymbol} \newcommand{\cupp}{\bigcup} \newcommand{\capp}{\bigcap} \newcommand{\sqcupp}{\bigsqcup} \newcommand{\re}{\mathop{\mathrm{Re}}} \newcommand{\im}{\mathop{\mathrm{Im}}} \newcommand{\comma}{\text{,}} \newcommand{\foot}{\text{。}} \)

Tuesday, February 14, 2012

Presentation material for SVD

Prepared a draft for my presentation:

http://ihome.ust.hk/~cclee/document/PRESENTSVD.pdf

The proof in the original text is too short and seemingly incomplete to me, so I modified the proof and filled in many details. Let me first describe the terminology used in these notes, and let me for the moment describe everything in $\mathbb R$ instead of $\mathbb C$. Let $\mathbb{R}^{m\times n}$ denote the collection of $m\times n$ real matrices. For $x\in \mathbb R^k$, write $x = (x^1,x^2,\dots,x^k)$ (i.e., $x^i$ is the $i$th coordinate of $x$), and define $\|x\|_2=\sqrt{\sum_{i=1}^k |x^i|^2}$.

We also define the $(n-1)$-sphere to be $S^{n-1}:=\{x\in \mathbb R^n:(x^1)^2+\cdots+(x^n)^2=1\}$. For $A\in \mathbb R^{m\times n}$ we define the operator norm
\[ \|A\|_2= \sup_{x\in \mathbb R^n, \|x\|_2=1}\|Ax\|_2.\]
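As a quick numerical illustration (a sketch of my own, not from the draft; the matrix and the sample count are arbitrary choices), one can approximate this supremum by evaluating $\|Ax\|_2$ over many random unit vectors and compare against `numpy.linalg.norm(A, 2)`, which returns the largest singular value of $A$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))   # an arbitrary real 5x3 matrix

# Crude estimate of sup_{||x||_2 = 1} ||Ax||_2 over random unit vectors.
X = rng.standard_normal((3, 100_000))
X /= np.linalg.norm(X, axis=0)    # normalize columns onto the unit sphere
estimate = np.linalg.norm(A @ X, axis=0).max()

exact = np.linalg.norm(A, 2)      # operator 2-norm = largest singular value
print(estimate, exact)            # estimate <= exact, and close to it
```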
What we assume here, and what I actually prove in my draft, is that $AS^{n-1}=\{Ax:x\in S^{n-1}\}$ is a ``hyperellipse'' (an ellipse when $n=2$, an ellipsoid when $n=3$). Suppose also that $m\ge n$ and that $A$ has full rank; then it is legitimate to assume there are unit vectors $u_1,\dots,u_n$ pointing in the directions of the semi-axes of the hyperellipse.

We can describe a general construction for the $u_i$'s: first find a vector of largest length in $AS^{n-1}$, call it $a\in \mathbb R^{m}$; then find a vector $b$ of largest length in $AS^{n-1} \cap (\mathbb R a)^\perp$; then find another such vector in $AS^{n-1} \cap (\mathbb R a)^\perp\cap (\mathbb R b)^\perp$, and continue the process until $n$ such vectors are found. This is exactly how one finds the semi-axes of an ellipse in $\mathbb R^2$ or of an ellipsoid in $\mathbb R^3$.
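This greedy construction can also be mimicked numerically. Below is a minimal sketch (my own code, not from the draft; the helper name `greedy_axes` is made up): power iteration on $A^{\top}A$ locates the unit vector $v$ maximizing $\|Av\|_2$, and subtracting the rank-one piece $\sigma uv^{\top}$ (deflation) restricts the next search to the orthogonal complement, so the axes come out in decreasing order of length. It assumes the singular values are distinct.

```python
import numpy as np

def greedy_axes(A, iters=500, seed=0):
    """Greedy semi-axis construction: repeatedly find the longest vector
    in the image of the unit sphere, then deflate so that the next search
    happens in the orthogonal complement.  Assumes distinct singular values."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    B = A.astype(float).copy()
    sigmas, us, vs = [], [], []
    for _ in range(n):
        v = rng.standard_normal(n)
        for _ in range(iters):            # power iteration on B^T B
            v = B.T @ (B @ v)
            v /= np.linalg.norm(v)
        sigma = np.linalg.norm(B @ v)     # length of this semi-axis
        u = (B @ v) / sigma               # direction of this semi-axis
        sigmas.append(sigma)
        us.append(u)
        vs.append(v)
        B -= sigma * np.outer(u, v)       # deflation: remove the found axis
    return np.array(sigmas), np.column_stack(us), np.column_stack(vs)
```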

By this construction, the lengths $\sigma_i$ of the semi-axes in the directions $u_i$ come in nonincreasing order, i.e., $\sigma_1\ge \sigma_2\ge\cdots\ge \sigma_n>0$. Taking a unit vector $v_i\in A^{-1}(\sigma_i u_i)$ (i.e., $Av_i=\sigma_i u_i$), one gets \[A\underbrace{\left[\begin{array}{c|c|c}v_1&\cdots&v_n\end{array}\right]}_{:=V}=\left[\begin{array}{c|c|c}u_1&\cdots&u_n\end{array}\right]\begin{bmatrix}\sigma_1&& \\ &\ddots & \\ & & \sigma_n\end{bmatrix}.\]

Let $u_{n+1},\dots, u_m$ be an orthonormal basis of $(\mathrm{span}_{\mathbb R}\{u_1,\dots,u_n\})^\perp$; then the RHS of the above equation becomes
\[\underbrace{\left[\begin{array}{c|c|c}u_1&\cdots&u_m\end{array}\right]}_{:= U}\underbrace{\left[\begin{array}{ccc}\sigma_1&& \\ &\ddots & \\ & & \sigma_n \\ \hline &\mathcal O& \end{array}\right]}_{:=\Sigma},\]
where $\mathcal O$ denotes a matrix with only zero entries. One can rewrite the above as $AV = U\Sigma$. It will be proved in my draft that $\{v_i\}$ is indeed orthonormal; hence both $V$ and $U$ are orthogonal (unitary, over $\mathbb C$), and we arrive at the expression $A=U\Sigma V^*$. Owing to this decomposition, the $u_i$'s are called left-singular vectors, the $v_i$'s are called right-singular vectors, and the ``diagonal'' entries (i.e., if $\Sigma=(d_{ij})$, the entries $d_{ii}$) are called singular values. This is essentially all the motivation behind the general result in my draft.
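To close the loop numerically (again a sketch under my own naming, reusing the hypothetical `greedy_axes` helper from above): stack the $v_i$ into $V$, extend the $u_i$ to an orthonormal basis of $\mathbb R^m$ to form $U$, build $\Sigma$, and check $AV=U\Sigma$ and $A=U\Sigma V^{\top}$ against `numpy.linalg.svd`:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))       # m = 5 >= n = 3, full rank a.s.
m, n = A.shape

sigmas, U_n, V = greedy_axes(A)       # sigma_i, u_1..u_n, v_1..v_n

# Extend u_1,...,u_n to an orthonormal basis of R^m: the last m - n columns
# of a complete QR factorization span (span{u_1,...,u_n})^perp.
Q, _ = np.linalg.qr(U_n, mode='complete')
U = np.hstack([U_n, Q[:, n:]])

Sigma = np.zeros((m, n))
np.fill_diagonal(Sigma, sigmas)       # the O block below stays zero

print(np.allclose(V.T @ V, np.eye(n), atol=1e-6))            # {v_i} orthonormal
print(np.allclose(A @ V, U_n @ np.diag(sigmas), atol=1e-6))  # AV = U Sigma
print(np.allclose(A, U @ Sigma @ V.T, atol=1e-6))            # A = U Sigma V^T
print(np.allclose(sigmas, np.linalg.svd(A, compute_uv=False), atol=1e-6))
```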
