
This is a similar example to the one we did in class. Unfortunately the example we did in class did not save (the notebooks should save automatically but something went wrong). I do not remember the exact values I used in class, but I will try to make my matrix close to the one from class.


Start with a $3\times 4$ matrix of rank 2. The third row is simply the sum of the first and second:


In [1]:

A = [2 3 -1 5;
     3 7 4 -2;
     5 10 3 3]


3x4 Array{Int64,2}:
2 3 -1 5
3 7 4 -2
5 10 3 3

In [2]:

rank(A)


2

Calculate the s.v.d. and save the results for future use:


In [3]:

(U,S,V) = svd(A, thin=false)


(
3x3 Array{Float64,2}:
-0.267261 0.771517 0.57735
-0.534522 -0.617213 0.57735
-0.801784 0.154303 -0.57735,
[14.866068747318506,6.244997998398398,3.109246380423892e-16],
4x4 Array{Float64,2}:
-0.413493 0.0741249 -0.23022 -0.877797
-0.844963 -0.0741249 0.454084 0.272674
-0.287647 -0.44475 -0.791292 0.305473
-0.179779 0.889499 -0.338621 0.248609)

The vector `S` contains the **singular values**:


In [4]:

S


3-element Array{Float64,1}:
14.8661
6.245
3.10925e-16

The third singular value is very close to 0; it is surely a rounding/approximation error, and it should be 0. We can even fix it:


In [5]:

S[3] = 0; S


3-element Array{Float64,1}:
14.8661
6.245
0.0

To create the $\Sigma$ matrix, we can do for example this:


In [6]:

Σ = zeros(size(A))                      # creates a zero matrix of the correct size
Σ[1:length(S),1:length(S)] = diagm(S)   # rewrites the 3 by 3 left corner by the diagonal matrix with diagonal S
Σ


3x4 Array{Float64,2}:
14.8661 0.0 0.0 0.0
0.0 6.245 0.0 0.0
0.0 0.0 0.0 0.0

Verify that this really worked:


In [7]:

U*Σ*V'


3x4 Array{Float64,2}:
2.0 3.0 -1.0 5.0
3.0 7.0 4.0 -2.0
5.0 10.0 3.0 3.0

In [8]:

A


3x4 Array{Int64,2}:
2 3 -1 5
3 7 4 -2
5 10 3 3

They seem to be the same. Note that they are actually not exactly the same, due to rounding errors and numerical approximations. For example, while the first entry of the $U\Sigma V^t$ matrix shows as 2.0, it is actually a tiny bit off:


In [9]:

(U*Σ*V')[1,1]


1.9999999999999996

Because of that, we cannot actually compare the matrices for exact equality:


In [10]:

U*Σ*V' == A


false

In [11]:

U*Σ*V' .== A


3x4 BitArray{2}:
false false false true
true false false false
false false false false

We can see that a few of the entries are exactly the same, but most of them are a little bit different.
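If we only care about equality up to numerical error, we can compare with a tolerance instead of demanding exact equality. A small sketch using Julia's built-in `isapprox` function, which compares arrays up to a small relative tolerance:

```julia
# Compare the matrices up to a small numerical tolerance
# instead of demanding exact equality of every entry:
isapprox(U*Σ*V', A)   # should return true
```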


The columns of $U$ and $V$ form bases of the four *fundamental spaces* of $A$.

From the number of non-zero singular values we see that the rank of $A$ is 2.
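As a quick sanity check (a sketch, using an arbitrarily chosen tolerance of `1e-10`), we can count the singular values that are not numerically zero:

```julia
# Count the singular values larger than a small tolerance;
# this number equals the rank of A
sum(S .> 1e-10)   # should give 2
```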


The last $4 - 2 = 2$ columns of $V$ form an orthonormal basis of the **kernel** of $A$:


In [12]:

K1 = V[:,3] # all rows, third column


4-element Array{Float64,1}:
-0.23022
0.454084
-0.791292
-0.338621

In [13]:

K2 = V[:,4] # all rows, fourth column


4-element Array{Float64,1}:
-0.877797
0.272674
0.305473
0.248609

Multiplying $A$ by $K_1$ and $K_2$ should give us zero vectors:


In [14]:

A*K1


3-element Array{Float64,1}:
-2.22045e-16
-1.77636e-15
-1.33227e-15

In [15]:

A*K2


3-element Array{Float64,1}:
8.88178e-16
4.44089e-16
1.11022e-15

Again, we see very tiny values that should be zeros, but are slightly off due to rounding and numerical approximations.


The first two columns of $U$ form an orthonormal basis of the **range** of $A$ (a.k.a. the column space of $A$):


In [16]:

R1 = U[:,1]


3-element Array{Float64,1}:
-0.267261
-0.534522
-0.801784

In [17]:

R2 = U[:,2]


3-element Array{Float64,1}:
0.771517
-0.617213
0.154303

Any vector in the range of $A$ can be written as a linear combination of $R_1$ and $R_2$. In fact, since they form an **orthonormal** basis, we can find the *coefficients* of the linear combination simply using a dot product:


In [18]:

Y1 = [2;5;7]   # this should be in the range
c1 = Y1⋅R1
c2 = Y1⋅R2
c1*R1 + c2*R2


3-element Array{Float64,1}:
2.0
5.0
7.0

If we try this with a vector that is *not* in the range, it will not work:


In [19]:

Y2 = [2;5;8]   # this is not in the range
c1 = Y2⋅R1
c2 = Y2⋅R2
c1*R1 + c2*R2


3-element Array{Float64,1}:
2.33333
5.33333
7.66667

This time $c_1R_1 + c_2R_2$ is not $Y_2$, it is instead the *orthogonal projection* of $Y_2$ onto the range of $A$.
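We can check the projection claim directly: the difference between $Y_2$ and its projection should be orthogonal to both basis vectors of the range. A quick sketch, reusing the `Y2`, `R1`, and `R2` from above:

```julia
# The residual of the projection should be orthogonal to the range:
resid = Y2 - (Y2⋅R1)*R1 - (Y2⋅R2)*R2
(resid⋅R1, resid⋅R2)   # both dot products should be numerically zero
```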


The other two spaces are the kernel of $A^t$, spanned by the last $3 - 2 = 1$ column of $U$, and the *row space* of $A$, a.k.a. the range of $A^t$, spanned by the first two columns of $V$:


In [20]:

Kt1 = U[:,3]


3-element Array{Float64,1}:
0.57735
0.57735
-0.57735

In [21]:

Rt1 = V[:,1]


4-element Array{Float64,1}:
-0.413493
-0.844963
-0.287647
-0.179779

In [22]:

Rt2 = V[:,2]


4-element Array{Float64,1}:
0.0741249
-0.0741249
-0.44475
0.889499


Using the s.v.d. we can find the *pseudoinverse* of $A$. It will be $A^+ = V\Sigma^+U^t$:


In [23]:

Σplus = zeros(4,3)
Σplus[1:2,1:2] = inv(diagm(S[1:2]))
Σplus


4x3 Array{Float64,2}:
0.0672673 0.0 0.0
0.0 0.160128 0.0
0.0 0.0 0.0
0.0 0.0 0.0

In [24]:

V*Σplus*U'


4x3 Array{Float64,2}:
0.0165913 0.00754148 0.0241327
0.00603318 0.0377074 0.0437406
-0.0497738 0.0542986 0.00452489
0.113122 -0.081448 0.0316742

A faster way to get this is using the `pinv` function:


In [25]:

Aplus = pinv(A)


4x3 Array{Float64,2}:
0.0165913 0.00754148 0.0241327
0.00603318 0.0377074 0.0437406
-0.0497738 0.0542986 0.00452489
0.113122 -0.081448 0.0316742

We will use the pseudoinverse to find "solutions" of some systems. Start with

$\left\{\begin{aligned} 2x + 3y -z + 5w &= 2\\ 3x + 7y + 4z -2w &= 5\\ 5x + 10y + 3z + 3w &= 7 \end{aligned}\right.$

In matrix form this is

$\begin{bmatrix} 2 & 3 & -1 & 5\\ 3 & 7 & 4 & -2\\ 5 & 10 & 3 & 3 \end{bmatrix} \begin{bmatrix} x\\y\\z\\w \end{bmatrix} = \begin{bmatrix} 2\\5\\7 \end{bmatrix}$

We already know that the vector on the right hand side is in the range of $A$ (it is the vector $Y_1$ above), so this system will have a solution. Since the dimension of the kernel of $A$ is 2, this system will have infinitely many solutions, in fact, a two-parameter family of solutions.


In [26]:

X1 = Aplus*Y1


4-element Array{Float64,1}:
0.239819
0.506787
0.20362
0.040724

To check that this is a solution of the system, all we need to do is multiply it by $A$:


In [27]:

A*X1


3-element Array{Float64,1}:
2.0
5.0
7.0

Out of all the solutions that the system has, $X_1$ is the one with the smallest norm, and also the unique one that is in the row space of $A$. To see that it really is in the row space of $A$, we will try to write it as a linear combination of $Rt_1$ and $Rt_2$, which form an orthonormal basis of the row space. Again, since the basis is orthonormal, we can get the coefficients using dot products:


In [28]:

c1 = X1⋅Rt1
c2 = X1⋅Rt2
c1*Rt1 + c2*Rt2


4-element Array{Float64,1}:
0.239819
0.506787
0.20362
0.040724

We see that $X_1 = c_1Rt_1 + c_2Rt_2$, so $X_1$ is in the row space of $A$.


To get other solutions, we can add any vector from the kernel of $A$ to $X_1$. Since the kernel of $A$ is spanned by $K_1$ and $K_2$, we can get any solution of the system as

$X = X_1 + c_1K_1 + c_2K_2$

for any choice of the scalars $c_1$ and $c_2$. For example,

$X = X_1 + 2K_1 + 3K_2$

will also be a solution:


In [29]:

A*(X1 + 2K1 + 3K2)


3-element Array{Float64,1}:
2.0
5.0
7.0

Let's now look at a system that does *not* have a solution:

$\left\{\begin{aligned} 2x + 3y -z + 5w &= 2\\ 3x + 7y + 4z -2w &= 5\\ 5x + 10y + 3z + 3w &= 8 \end{aligned}\right.$

In matrix form this is

$\begin{bmatrix} 2 & 3 & -1 & 5\\ 3 & 7 & 4 & -2\\ 5 & 10 & 3 & 3 \end{bmatrix} \begin{bmatrix} x\\y\\z\\w \end{bmatrix} = \begin{bmatrix} 2\\5\\8 \end{bmatrix}$

We already know that the vector on the right hand side is *not* in the range of $A$ (it is the vector $Y_2$ above), so this system will not have a solution. Instead, we will find the "least squares solution", which in some sense is the best approximation to a solution.


In [30]:

X2 = Aplus*Y2


4-element Array{Float64,1}:
0.263952
0.550528
0.208145
0.0723982

In [31]:

A*X2


3-element Array{Float64,1}:
2.33333
5.33333
7.66667

We can see that this is not a solution of the system $AX = Y_2$, since we do not get $Y_2$ but only something *close to* $Y_2$. In fact, when comparing this to the result we obtained above when we introduced $Y_2$ in the first place, we can see that $AX_2$ is in fact the *orthogonal projection* of $Y_2$ onto the range of $A$. In other words, the **closest** vector in the range of $A$ to $Y_2$.


Again, this is the least squares solution with the *smallest norm*, or the one that is in the row space of $A$. We can find other least squares solutions of this system by adding linear combinations of $K_1$ and $K_2$.
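We can verify the smallest-norm claim on one example. Since $X_2$ lies in the row space and $K_1$ lies in the kernel, the two are orthogonal, so adding $K_1$ can only increase the norm:

```julia
# X2 is orthogonal to the kernel vectors, so any other least squares
# solution, such as X2 + K1, has a strictly larger norm:
(norm(X2), norm(X2 + K1))   # the first number should be smaller
```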


If we want to find other right hand sides for which the system either has solutions or has no solutions, we can do the following:

If we want the system to have solutions, we need to take the right hand side from the *range* of $A$. The range is generated by the first two columns of $U$, which we called $R_1$ and $R_2$. So for example with the right hand side


In [32]:

3R1 + 7R2


3-element Array{Float64,1}:
4.59883
-5.92406
-1.32523

the system has infinitely many solutions.

On the other hand, if we make the right hand side a linear combination of the columns of $U$ in which the third column, which we called $Kt_1$, has a *non-zero coefficient*, the resulting system will have *no solutions*:


In [33]:

3R1 + 7R2 + 4Kt1


3-element Array{Float64,1}:
6.90823
-3.61466
-3.63463
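One way to confirm that this right hand side really gives an inconsistent system is the standard rank test on the augmented matrix. A sketch (`Y3` is a name introduced here for illustration, not used above):

```julia
# For an inconsistent system, appending the right hand side as an
# extra column increases the rank of the matrix:
Y3 = 3R1 + 7R2 + 4Kt1
rank([A Y3])            # should be 3, while rank(A) is 2
```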