Description: Class example of singular value decomposition

# Example of Singular Value Decomposition

This is a similar example to the one we did in class. Unfortunately the example we did in class did not save (the notebooks should save automatically but something went wrong). I do not remember the exact values I used in class, but I will try to make my matrix close to the one from class.

Start with a $3\times 4$ matrix of rank 2. The third row is simply the sum of the first and second:

In [1]:
A = [2 3 -1 5
3 7 4 -2
5 10 3 3]

3x4 Array{Int64,2}:
 2   3  -1   5
 3   7   4  -2
 5  10   3   3
In [2]:
rank(A)

2

Calculate the s.v.d. and save the results for future use:

In [3]:
(U,S,V) = svd(A, thin=false)

(
3x3 Array{Float64,2}:
 -0.267261   0.771517   0.57735
 -0.534522  -0.617213   0.57735
 -0.801784   0.154303  -0.57735,

[14.866068747318506, 6.244997998398398, 3.109246380423892e-16],

4x4 Array{Float64,2}:
 -0.413493   0.0741249  -0.23022   -0.877797
 -0.844963  -0.0741249   0.454084   0.272674
 -0.287647  -0.44475    -0.791292   0.305473
 -0.179779   0.889499   -0.338621   0.248609
)

The vector S contains the singular values:

In [4]:
S

3-element Array{Float64,1}: 14.8661 6.245 3.10925e-16

The third singular value is very close to 0; it is surely a rounding/approximation error, and it should really be 0. We can even fix it:

In [5]:
S[3] = 0; S

3-element Array{Float64,1}: 14.8661 6.245 0.0

To create the $\Sigma$ matrix, we can, for example, do this:

In [6]:
Σ = zeros(size(A))  # creates a zero matrix of the correct size
Σ[1:length(S),1:length(S)] = diagm(S)  # overwrites the upper-left 3 by 3 block with the diagonal matrix with diagonal S
Σ

3x4 Array{Float64,2}:
 14.8661  0.0    0.0  0.0
  0.0     6.245  0.0  0.0
  0.0     0.0    0.0  0.0

Verify that this really worked:

In [7]:
U*Σ*V'

3x4 Array{Float64,2}:
 2.0   3.0  -1.0   5.0
 3.0   7.0   4.0  -2.0
 5.0  10.0   3.0   3.0
In [8]:
A

3x4 Array{Int64,2}:
 2   3  -1   5
 3   7   4  -2
 5  10   3   3

They seem to be the same. Note that they are actually not exactly the same, due to rounding errors and numerical approximations. For example, while the first entry of the $U\Sigma V^t$ matrix shows as 2.0, it is actually a tiny bit off:

In [9]:
(U*Σ*V')[1,1]

1.9999999999999996

Because of that, we cannot actually compare the matrices:

In [10]:
U*Σ*V' == A

false
In [11]:
U*Σ*V' .== A

3x4 BitArray{2}:
 false  false  false   true
  true  false  false  false
 false  false  false  false

We can see that a few of the entries are exactly the same, but most of them are a little bit different.
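Instead of `==`, the comparison should be approximate, up to a small tolerance (in Julia this is what `isapprox` does). As an illustration, here is the same check sketched in Python/NumPy rather than the notebook's Julia:

```python
import numpy as np

# The same rank-2 matrix as above (third row = first row + second row).
A = np.array([[2, 3, -1, 5],
              [3, 7, 4, -2],
              [5, 10, 3, 3]], dtype=float)

# Full (not thin) SVD: U is 3x3, Vt is 4x4, s holds the singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Assemble the 3x4 Sigma with the singular values on its diagonal.
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.array_equal(U @ Sigma @ Vt, A))  # exact comparison: typically False
print(np.allclose(U @ Sigma @ Vt, A))     # comparison up to tolerance: True
```

The reconstruction agrees with $A$ in every entry once we allow for floating-point tolerance.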

## Fundamental spaces

The columns of $U$ and $V$ form bases of the four fundamental spaces of $A$.

From the number of non-zero singular values we see that the rank of $A$ is 2.
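That count can be done in code as well. A sketch in Python/NumPy (not the notebook's own Julia; the tolerance here is the usual heuristic, scaled by the largest singular value):

```python
import numpy as np

A = np.array([[2, 3, -1, 5],
              [3, 7, 4, -2],
              [5, 10, 3, 3]], dtype=float)

s = np.linalg.svd(A, compute_uv=False)  # singular values only, sorted descending

# Treat singular values below a small relative tolerance as zero;
# the numerical rank is then the number of "nonzero" ones.
tol = max(A.shape) * np.finfo(float).eps * s[0]
numerical_rank = int(np.sum(s > tol))
print(numerical_rank)  # 2
```

This is essentially what rank functions do internally: compute the singular values and count those above a tolerance.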

### Kernel

The last $4 - 2 = 2$ columns of $V$ form an orthonormal basis of the kernel of $A$:

In [12]:
K1 = V[:,3] # all rows, third column

4-element Array{Float64,1}: -0.23022 0.454084 -0.791292 -0.338621
In [13]:
K2 = V[:,4] # all rows, fourth column

4-element Array{Float64,1}: -0.877797 0.272674 0.305473 0.248609

Multiplying $A$ by $K_1$ and $K_2$ should give us zero vectors:

In [14]:
A*K1

3-element Array{Float64,1}: -2.22045e-16 -1.77636e-15 -1.33227e-15
In [15]:
A*K2

3-element Array{Float64,1}: 8.88178e-16 4.44089e-16 1.11022e-15

Again, we see very tiny values that should be zeros, but are slightly off due to rounding and numerical approximations.

### Range

The first two columns of $U$ form an orthonormal basis of the range of $A$ (a.k.a. the column space of $A$):

In [16]:
R1 = U[:,1]

3-element Array{Float64,1}: -0.267261 -0.534522 -0.801784
In [17]:
R2 = U[:,2]

3-element Array{Float64,1}: 0.771517 -0.617213 0.154303

Any vector in the range of $A$ can be written as a linear combination of $R_1$ and $R_2$. In fact, since they form an orthonormal basis, we can find the coefficients of the linear combination simply using a dot product:

In [18]:
Y1 = [2;5;7] # this should be in the range
c1 = Y1⋅R1
c2 = Y1⋅R2
c1*R1 + c2*R2

3-element Array{Float64,1}: 2.0 5.0 7.0

If we try this with a vector that is not in the range, it will not work:

In [19]:
Y2 = [2;5;8] # this is not in the range
c1 = Y2⋅R1
c2 = Y2⋅R2
c1*R1 + c2*R2

3-element Array{Float64,1}: 2.33333 5.33333 7.66667

This time $c_1R_1 + c_2R_2$ is not $Y_2$, it is instead the orthogonal projection of $Y_2$ onto the range of $A$.

The other two spaces are the kernel of $A^t$, spanned by the last $3 - 2 = 1$ column of $U$, and the row space of $A$, a.k.a. the range of $A^t$, spanned by the first two columns of $V$:

In [20]:
Kt1 = U[:,3]

3-element Array{Float64,1}: 0.57735 0.57735 -0.57735
In [21]:
Rt1 = V[:,1]

4-element Array{Float64,1}: -0.413493 -0.844963 -0.287647 -0.179779
In [22]:
Rt2 = V[:,2]

4-element Array{Float64,1}: 0.0741249 -0.0741249 -0.44475 0.889499

## Pseudoinverse

Using the s.v.d. we can find the pseudoinverse of $A$. It will be $A^+ = V\Sigma^+U^t$:

In [23]:
Σplus = zeros(4,3)
Σplus[1:2,1:2] = inv(diagm(S[1:2]))
Σplus

4x3 Array{Float64,2}:
 0.0672673  0.0       0.0
 0.0        0.160128  0.0
 0.0        0.0       0.0
 0.0        0.0       0.0
In [24]:
V*Σplus*U'

4x3 Array{Float64,2}:
  0.0165913   0.00754148  0.0241327
  0.00603318  0.0377074   0.0437406
 -0.0497738   0.0542986   0.00452489
  0.113122   -0.081448    0.0316742

A faster way to get this is to use the pinv function:

In [25]:
Aplus = pinv(A)

4x3 Array{Float64,2}:
  0.0165913   0.00754148  0.0241327
  0.00603318  0.0377074   0.0437406
 -0.0497738   0.0542986   0.00452489
  0.113122   -0.081448    0.0316742

### Solving systems

We will use the pseudoinverse to find "solutions" of some systems. Start with

$$\left\{\begin{aligned} 2x + 3y - z + 5w &= 2\\ 3x + 7y + 4z - 2w &= 5\\ 5x + 10y + 3z + 3w &= 7 \end{aligned}\right.$$

In matrix form this is

$\begin{bmatrix} 2 & 3 & -1 & 5\\ 3 & 7 & 4 & -2\\ 5 & 10 & 3 & 3 \end{bmatrix} \begin{bmatrix} x\\y\\z\\w \end{bmatrix} = \begin{bmatrix} 2\\5\\7 \end{bmatrix}$

We already know that the vector on the right hand side is in the range of $A$ (it is the vector $Y_1$ above), so this system will have a solution. Since the dimension of the kernel of $A$ is 2, this system will have infinitely many solutions, in fact a two-parameter family of solutions.

In [26]:
X1 = Aplus*Y1

4-element Array{Float64,1}: 0.239819 0.506787 0.20362 0.040724

To check that this is a solution of the system, all we need to do is multiply it by $A$:

In [27]:
A*X1

3-element Array{Float64,1}: 2.0 5.0 7.0

Out of all the solutions that the system has, $X_1$ is the one with the smallest norm, and also the unique one that lies in the row space of $A$. To see that it really is in the row space of $A$, we will write it as a linear combination of $Rt_1$ and $Rt_2$, which form an orthonormal basis of the row space. Again, since the basis is orthonormal, we can get the coefficients using dot products:

In [28]:
c1 = X1⋅Rt1
c2 = X1⋅Rt2
c1*Rt1 + c2*Rt2

4-element Array{Float64,1}: 0.239819 0.506787 0.20362 0.040724

We see that $X_1 = c_1Rt_1 + c_2Rt_2$, so $X_1$ is in the row space of $A$.
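The minimum-norm property can also be checked numerically. Below is a sketch in Python/NumPy (not the notebook's Julia): any other solution differs from $X_1$ by a kernel vector orthogonal to the row space, so its norm can only be larger.

```python
import numpy as np

A = np.array([[2, 3, -1, 5],
              [3, 7, 4, -2],
              [5, 10, 3, 3]], dtype=float)
Y1 = np.array([2, 5, 7], dtype=float)

X1 = np.linalg.pinv(A) @ Y1      # the minimum-norm solution

# Kernel basis: the last two rows of Vt, i.e. the last two columns of V.
_, _, Vt = np.linalg.svd(A)
K1, K2 = Vt[2], Vt[3]

X_other = X1 + 2 * K1 + 3 * K2   # another exact solution
print(np.allclose(A @ X_other, Y1))                  # True: still a solution
print(np.linalg.norm(X1) < np.linalg.norm(X_other))  # True: X1 has smaller norm
```

Since $K_1, K_2$ are orthonormal and orthogonal to the row space, $\|X_1 + 2K_1 + 3K_2\|^2 = \|X_1\|^2 + 4 + 9$, which is strictly larger.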

To get other solutions, we can add any vector from the kernel of $A$ to $X_1$. Since the kernel of $A$ is spanned by $K_1$ and $K_2$, we can get any solution of the system as

$X = X_1 + c_1K_1 + c_2K_2$

for any choice of the scalars $c_1$ and $c_2$. For example,

$X = X_1 + 2K_1 + 3K_2$

will also be a solution:

In [29]:
A*(X1 + 2K1 + 3K2)

3-element Array{Float64,1}: 2.0 5.0 7.0

Let's now look at a system that does not have a solution:

$$\left\{\begin{aligned} 2x + 3y - z + 5w &= 2\\ 3x + 7y + 4z - 2w &= 5\\ 5x + 10y + 3z + 3w &= 8 \end{aligned}\right.$$

In matrix form this is

$\begin{bmatrix} 2 & 3 & -1 & 5\\ 3 & 7 & 4 & -2\\ 5 & 10 & 3 & 3 \end{bmatrix} \begin{bmatrix} x\\y\\z\\w \end{bmatrix} = \begin{bmatrix} 2\\5\\8 \end{bmatrix}$

We already know that the vector on the right hand side is not in the range of $A$ (it is the vector $Y_2$ above), so this system will not have a solution. Instead, we will find the "least squares solution", which in some sense is the best approximation to a solution.

In [30]:
X2 = Aplus*Y2

4-element Array{Float64,1}: 0.263952 0.550528 0.208145 0.0723982
In [31]:
A*X2

3-element Array{Float64,1}: 2.33333 5.33333 7.66667

We can see that this is not a solution of the system $AX = Y_2$, since we do not get $Y_2$ but only something close to $Y_2$. In fact, comparing this to the result we obtained above when we first introduced $Y_2$, we can see that $AX_2$ is the orthogonal projection of $Y_2$ onto the range of $A$; in other words, it is the vector in the range of $A$ closest to $Y_2$.
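The projection claim can be verified directly: the residual $Y_2 - AX_2$ should be orthogonal to the range of $A$, i.e. to both $R_1$ and $R_2$. A sketch in Python/NumPy (not the notebook's Julia; the basis vectors may differ in sign from the Julia output, which does not affect the check):

```python
import numpy as np

A = np.array([[2, 3, -1, 5],
              [3, 7, 4, -2],
              [5, 10, 3, 3]], dtype=float)
Y2 = np.array([2, 5, 8], dtype=float)

X2 = np.linalg.pinv(A) @ Y2  # least-squares "solution"
residual = Y2 - A @ X2       # the part of Y2 outside the range

U, s, Vt = np.linalg.svd(A)
R1, R2 = U[:, 0], U[:, 1]    # orthonormal basis of the range of A

# Orthogonality of the residual to the range is exactly what
# characterizes A@X2 as the orthogonal projection of Y2.
print(abs(residual @ R1) < 1e-10, abs(residual @ R2) < 1e-10)  # True True
```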

Again, this is the least squares solution with the smallest norm, i.e. the one that is in the row space of $A$. We can find other least squares solutions of this system by adding linear combinations of $K_1$ and $K_2$.

### Systems with and without solutions

Suppose we want to find other right-hand sides for which the system either has solutions or has none.

If we want the system to have solutions, we need to take the right hand side from the range of $A$. The range is generated by the first two columns of $U$, which we called $R_1$ and $R_2$. So for example with the right hand side

In [32]:
3R1 + 7R2

3-element Array{Float64,1}: 4.59883 -5.92406 -1.32523

the system has infinitely many solutions.

On the other hand, if we make the right hand side a linear combination of the columns of $U$ in which the third column, which we called $Kt_1$, has a non-zero coefficient, the resulting system will have no solutions:

In [33]:
3R1 + 7R2 + 4Kt1

3-element Array{Float64,1}: 6.90823 -3.61466 -3.63463
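Both claims can be confirmed numerically: a system $AX = b$ has a solution exactly when $AA^+b = b$, i.e. when the least-squares residual vanishes. A sketch in Python/NumPy (not the notebook's Julia; signs of the basis vectors may differ from the Julia output):

```python
import numpy as np

A = np.array([[2, 3, -1, 5],
              [3, 7, 4, -2],
              [5, 10, 3, 3]], dtype=float)

U, s, Vt = np.linalg.svd(A)
R1, R2, Kt1 = U[:, 0], U[:, 1], U[:, 2]
Aplus = np.linalg.pinv(A)

b_good = 3 * R1 + 7 * R2           # in the range of A: solvable
b_bad = 3 * R1 + 7 * R2 + 4 * Kt1  # nonzero Kt1 component: unsolvable

# Residual norm of the least-squares "solution" for each right-hand side:
print(np.linalg.norm(A @ (Aplus @ b_good) - b_good))  # ~0: system is solvable
print(np.linalg.norm(A @ (Aplus @ b_bad) - b_bad))    # ~4: the Kt1 part cannot be matched
```

For the unsolvable system the residual is exactly the $4Kt_1$ component, so its norm is 4 (up to rounding).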