Description: Class example of singular value decomposition

# Example of Singular Value Decomposition

This is a similar example to the one we did in class. Unfortunately the example we did in class did not save (the notebooks should save automatically but something went wrong). I do not remember the exact values I used in class, but I will try to make my matrix close to the one from class.

Start with a $3\times 4$ matrix of rank 2. The third row is simply the sum of the first and second:

In [1]:
A = [2 3 -1 5
3 7 4 -2
5 10 3 3]

3x4 Array{Int64,2}:
 2   3  -1   5
 3   7   4  -2
 5  10   3   3
In [2]:
rank(A)

2

Calculate the s.v.d. and save the results for future use:

In [3]:
(U,S,V) = svd(A, thin=false)

(
3x3 Array{Float64,2}:
 -0.267261   0.771517   0.57735
 -0.534522  -0.617213   0.57735
 -0.801784   0.154303  -0.57735,

[14.866068747318506, 6.244997998398398, 3.109246380423892e-16],

4x4 Array{Float64,2}:
 -0.413493   0.0741249  -0.23022   -0.877797
 -0.844963  -0.0741249   0.454084   0.272674
 -0.287647  -0.44475    -0.791292   0.305473
 -0.179779   0.889499   -0.338621   0.248609
)

The vector S contains the singular values:

In [4]:
S

3-element Array{Float64,1}: 14.8661 6.245 3.10925e-16

The third singular value is very close to 0; it is surely a rounding/approximation error, and it should really be 0. We can even fix it:

In [5]:
S[3] = 0; S

3-element Array{Float64,1}: 14.8661 6.245 0.0

To create the $\Sigma$ matrix, we can, for example, do this:

In [6]:
Σ = zeros(size(A))  # creates a zero matrix of the correct size
Σ[1:length(S),1:length(S)] = diagm(S)  # overwrites the upper-left 3 by 3 block with the diagonal matrix with diagonal S
Σ

3x4 Array{Float64,2}:
 14.8661  0.0    0.0  0.0
  0.0     6.245  0.0  0.0
  0.0     0.0    0.0  0.0

Verify that this really worked:

In [7]:
U*Σ*V'

3x4 Array{Float64,2}:
 2.0   3.0  -1.0   5.0
 3.0   7.0   4.0  -2.0
 5.0  10.0   3.0   3.0
In [8]:
A

3x4 Array{Int64,2}:
 2   3  -1   5
 3   7   4  -2
 5  10   3   3

They seem to be the same. Note that they are actually not exactly the same, due to rounding errors and numerical approximations. For example, while the first entry of the $U\Sigma V^t$ matrix shows as 2.0, it is actually a tiny bit off:

In [9]:
(U*Σ*V')[1,1]

1.9999999999999996

Because of that, we cannot actually compare the matrices:

In [10]:
U*Σ*V' == A

false
In [11]:
U*Σ*V' .== A

3x4 BitArray{2}:
 false  false  false   true
  true  false  false  false
 false  false  false  false

We can see that a few of the entries are exactly the same, but most of them are a little bit different.
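Instead of `==`, the comparison should be approximate, up to a small tolerance (in Julia this is what `isapprox` does). As an illustration, here is the same check sketched in Python/NumPy rather than the notebook's Julia:

```python
import numpy as np

# The same rank-2 matrix as above (third row = first row + second row).
A = np.array([[2, 3, -1, 5],
              [3, 7, 4, -2],
              [5, 10, 3, 3]], dtype=float)

# Full (not thin) SVD: U is 3x3, Vt is 4x4, s holds the singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Assemble the 3x4 Sigma with the singular values on its diagonal.
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.array_equal(U @ Sigma @ Vt, A))  # exact comparison: typically False
print(np.allclose(U @ Sigma @ Vt, A))     # comparison up to tolerance: True
```

The reconstruction agrees with $A$ in every entry once we allow for floating-point tolerance.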

## Fundamental spaces

The columns of $U$ and $V$ form bases of the four fundamental spaces of $A$.

From the number of non-zero singular values we see that the rank of $A$ is 2.
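That count can be done in code as well. A sketch in Python/NumPy (not the notebook's own Julia; the tolerance here is the usual heuristic, scaled by the largest singular value):

```python
import numpy as np

A = np.array([[2, 3, -1, 5],
              [3, 7, 4, -2],
              [5, 10, 3, 3]], dtype=float)

s = np.linalg.svd(A, compute_uv=False)  # singular values only, sorted descending

# Treat singular values below a small relative tolerance as zero;
# the numerical rank is then the number of "nonzero" ones.
tol = max(A.shape) * np.finfo(float).eps * s[0]
numerical_rank = int(np.sum(s > tol))
print(numerical_rank)  # 2
```

This is essentially what rank functions do internally: compute the singular values and count those above a tolerance.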

### Kernel

The last $4 - 2 = 2$ columns of $V$ form an orthonormal basis of the kernel of $A$:

In [12]:
K1 = V[:,3] # all rows, third column

4-element Array{Float64,1}: -0.23022 0.454084 -0.791292 -0.338621
In [13]:
K2 = V[:,4] # all rows, fourth column

4-element Array{Float64,1}: -0.877797 0.272674 0.305473 0.248609

Multiplying $A$ by $K_1$ and $K_2$ should give us zero vectors:

In [14]:
A*K1

3-element Array{Float64,1}: -2.22045e-16 -1.77636e-15 -1.33227e-15
In [15]:
A*K2

3-element Array{Float64,1}: 8.88178e-16 4.44089e-16 1.11022e-15

Again, we see very tiny values that should be zeros, but are slightly off due to rounding and numerical approximations.

### Range

The first two columns of $U$ form an orthonormal basis of the range of $A$ (a.k.a. the column space of $A$):

In [16]:
R1 = U[:,1]

3-element Array{Float64,1}: -0.267261 -0.534522 -0.801784
In [17]:
R2 = U[:,2]

3-element Array{Float64,1}: 0.771517 -0.617213 0.154303

Any vector in the range of $A$ can be written as a linear combination of $R_1$ and $R_2$. In fact, since they form an orthonormal basis, we can find the coefficients of the linear combination simply using a dot product:

In [18]:
Y1 = [2;5;7] # this should be in the range
c1 = Y1⋅R1
c2 = Y1⋅R2
c1*R1 + c2*R2

3-element Array{Float64,1}: 2.0 5.0 7.0

If we try this with a vector that is not in the range, it will not work:

In [19]:
Y2 = [2;5;8] # this is not in the range
c1 = Y2⋅R1
c2 = Y2⋅R2
c1*R1 + c2*R2

3-element Array{Float64,1}: 2.33333 5.33333 7.66667

This time $c_1R_1 + c_2R_2$ is not $Y_2$, it is instead the orthogonal projection of $Y_2$ onto the range of $A$.

The other two spaces are the kernel of $A^t$, spanned by the last $3 - 2 = 1$ column of $U$, and the row space of $A$, a.k.a. the range of $A^t$, spanned by the first two columns of $V$:

In [20]:
Kt1 = U[:,3]

3-element Array{Float64,1}: 0.57735 0.57735 -0.57735
In [21]:
Rt1 = V[:,1]

4-element Array{Float64,1}: -0.413493 -0.844963 -0.287647 -0.179779
In [22]:
Rt2 = V[:,2]

4-element Array{Float64,1}: 0.0741249 -0.0741249 -0.44475 0.889499

## Pseudoinverse

Using the s.v.d. we can find the pseudoinverse of $A$. It will be $A^+ = V\Sigma^+U^t$:

In [23]:
Σplus = zeros(4,3)
Σplus[1:2,1:2] = inv(diagm(S[1:2]))
Σplus

4x3 Array{Float64,2}:
 0.0672673  0.0       0.0
 0.0        0.160128  0.0
 0.0        0.0       0.0
 0.0        0.0       0.0
In [24]:
V*Σplus*U'

4x3 Array{Float64,2}:
  0.0165913   0.00754148  0.0241327
  0.00603318  0.0377074   0.0437406
 -0.0497738   0.0542986   0.00452489
  0.113122   -0.081448    0.0316742

A faster way to get this is to use the pinv function:

In [25]:
Aplus = pinv(A)

4x3 Array{Float64,2}:
  0.0165913   0.00754148  0.0241327
  0.00603318  0.0377074   0.0437406
 -0.0497738   0.0542986   0.00452489
  0.113122   -0.081448    0.0316742

### Solving systems

We will use the pseudoinverse to find "solutions" of some systems. Start with

$$\left\{\begin{aligned} 2x + 3y - z + 5w &= 2\\ 3x + 7y + 4z - 2w &= 5\\ 5x + 10y + 3z + 3w &= 7 \end{aligned}\right.$$

In matrix form this is

$\begin{bmatrix} 2 & 3 & -1 & 5\\ 3 & 7 & 4 & -2\\ 5 & 10 & 3 & 3 \end{bmatrix} \begin{bmatrix} x\\y\\z\\w \end{bmatrix} = \begin{bmatrix} 2\\5\\7 \end{bmatrix}$

We already know that the vector on the right hand side is in the range of $A$ (it is the vector $Y_1$ above), so this system will have a solution. Since the dimension of the kernel of $A$ is 2, this system will have infinitely many solutions, in fact a two-parameter family of solutions.

In [26]:
X1 = Aplus*Y1

4-element Array{Float64,1}: 0.239819 0.506787 0.20362 0.040724

To check that this is a solution of the system, all we need to do is multiply it by $A$:

In [27]:
A*X1

3-element Array{Float64,1}: 2.0 5.0 7.0

Out of all the solutions that the system has, $X_1$ is the one with the smallest norm, and also the unique one that lies in the row space of $A$. To see that it really is in the row space of $A$, we will write it as a linear combination of $Rt_1$ and $Rt_2$, which form an orthonormal basis of the row space. Again, since the basis is orthonormal, we can get the coefficients using dot products:

In [28]:
c1 = X1⋅Rt1
c2 = X1⋅Rt2
c1*Rt1 + c2*Rt2

4-element Array{Float64,1}: 0.239819 0.506787 0.20362 0.040724

We see that $X_1 = c_1Rt_1 + c_2Rt_2$, so $X_1$ is in the row space of $A$.
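The minimum-norm property can also be checked numerically. Below is a sketch in Python/NumPy (not the notebook's Julia): any other solution differs from $X_1$ by a kernel vector orthogonal to the row space, so its norm can only be larger.

```python
import numpy as np

A = np.array([[2, 3, -1, 5],
              [3, 7, 4, -2],
              [5, 10, 3, 3]], dtype=float)
Y1 = np.array([2, 5, 7], dtype=float)

X1 = np.linalg.pinv(A) @ Y1      # the minimum-norm solution

# Kernel basis: the last two rows of Vt, i.e. the last two columns of V.
_, _, Vt = np.linalg.svd(A)
K1, K2 = Vt[2], Vt[3]

X_other = X1 + 2 * K1 + 3 * K2   # another exact solution
print(np.allclose(A @ X_other, Y1))                  # True: still a solution
print(np.linalg.norm(X1) < np.linalg.norm(X_other))  # True: X1 has smaller norm
```

Since $K_1, K_2$ are orthonormal and orthogonal to the row space, $\|X_1 + 2K_1 + 3K_2\|^2 = \|X_1\|^2 + 4 + 9$, which is strictly larger.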

To get other solutions, we can add any vector from the kernel of $A$ to $X_1$. Since the kernel of $A$ is spanned by $K_1$ and $K_2$, we can get any solution of the system as

$X = X_1 + c_1K_1 + c_2K_2$

for any choice of the scalars $c_1$ and $c_2$. For example,

$X = X_1 + 2K_1 + 3K_2$

will also be a solution:

In [29]:
A*(X1 + 2K1 + 3K2)

3-element Array{Float64,1}: 2.0 5.0 7.0

Let's now look at a system that does not have a solution:

$$\left\{\begin{aligned} 2x + 3y - z + 5w &= 2\\ 3x + 7y + 4z - 2w &= 5\\ 5x + 10y + 3z + 3w &= 8 \end{aligned}\right.$$

In matrix form this is

$\begin{bmatrix} 2 & 3 & -1 & 5\\ 3 & 7 & 4 & -2\\ 5 & 10 & 3 & 3 \end{bmatrix} \begin{bmatrix} x\\y\\z\\w \end{bmatrix} = \begin{bmatrix} 2\\5\\8 \end{bmatrix}$

We already know that the vector on the right hand side is not in the range of $A$ (it is the vector $Y_2$ above), so this system will not have a solution. Instead, we will find the "least squares solution", which in some sense is the best approximation to a solution.

In [30]:
X2 = Aplus*Y2

4-element Array{Float64,1}: 0.263952 0.550528 0.208145 0.0723982
In [31]:
A*X2

3-element Array{Float64,1}: 2.33333 5.33333 7.66667

We can see that this is not a solution of the system $AX = Y_2$, since we do not get $Y_2$ but only something close to $Y_2$. In fact, comparing this to the result we obtained above when we first introduced $Y_2$, we can see that $AX_2$ is the orthogonal projection of $Y_2$ onto the range of $A$; in other words, it is the vector in the range of $A$ closest to $Y_2$.
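The projection claim can be verified directly: the residual $Y_2 - AX_2$ should be orthogonal to the range of $A$, i.e. to both $R_1$ and $R_2$. A sketch in Python/NumPy (not the notebook's Julia; the basis vectors may differ in sign from the Julia output, which does not affect the check):

```python
import numpy as np

A = np.array([[2, 3, -1, 5],
              [3, 7, 4, -2],
              [5, 10, 3, 3]], dtype=float)
Y2 = np.array([2, 5, 8], dtype=float)

X2 = np.linalg.pinv(A) @ Y2  # least-squares "solution"
residual = Y2 - A @ X2       # the part of Y2 outside the range

U, s, Vt = np.linalg.svd(A)
R1, R2 = U[:, 0], U[:, 1]    # orthonormal basis of the range of A

# Orthogonality of the residual to the range is exactly what
# characterizes A@X2 as the orthogonal projection of Y2.
print(abs(residual @ R1) < 1e-10, abs(residual @ R2) < 1e-10)  # True True
```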

Again, this is the least squares solution with the smallest norm, i.e. the one that is in the row space of $A$. We can find other least squares solutions of this system by adding linear combinations of $K_1$ and $K_2$.

### Systems with and without solutions

Suppose we want to find other right-hand sides for which the system either has solutions or has none.

If we want the system to have solutions, we need to take the right hand side from the range of $A$. The range is generated by the first two columns of $U$, which we called $R_1$ and $R_2$. So for example with the right hand side

In [32]:
3R1 + 7R2

3-element Array{Float64,1}: 4.59883 -5.92406 -1.32523

the system has infinitely many solutions.

On the other hand, if we make the right hand side a linear combination of the columns of $U$ in which the third column, which we called $Kt_1$, has a non-zero coefficient, the resulting system will have no solutions:

In [33]:
3R1 + 7R2 + 4Kt1

3-element Array{Float64,1}: 6.90823 -3.61466 -3.63463
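Both claims can be confirmed numerically: a system $AX = b$ has a solution exactly when $AA^+b = b$, i.e. when the least-squares residual vanishes. A sketch in Python/NumPy (not the notebook's Julia; signs of the basis vectors may differ from the Julia output):

```python
import numpy as np

A = np.array([[2, 3, -1, 5],
              [3, 7, 4, -2],
              [5, 10, 3, 3]], dtype=float)

U, s, Vt = np.linalg.svd(A)
R1, R2, Kt1 = U[:, 0], U[:, 1], U[:, 2]
Aplus = np.linalg.pinv(A)

b_good = 3 * R1 + 7 * R2           # in the range of A: solvable
b_bad = 3 * R1 + 7 * R2 + 4 * Kt1  # nonzero Kt1 component: unsolvable

# Residual norm of the least-squares "solution" for each right-hand side:
print(np.linalg.norm(A @ (Aplus @ b_good) - b_good))  # ~0: system is solvable
print(np.linalg.norm(A @ (Aplus @ b_bad) - b_bad))    # ~4: the Kt1 part cannot be matched
```

For the unsolvable system the residual is exactly the $4Kt_1$ component, so its norm is 4 (up to rounding).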