
This repository contains the course materials from Math 157: Intro to Mathematical Software.

Creative Commons BY-SA 4.0 license.


Math 157: Intro to Mathematical Software

UC San Diego, winter 2018

Homework 3: due February 2, 2018

Please enter all answers within this notebook unless otherwise specified. As usual, don't forget to cite sources and collaborators.

Throughout this problem set, use the SageMath 8.1 kernel unless otherwise specified.

Problem 1: Gradient vector fields

Grading criterion: correctness of code.

1a. Compute the gradient of $f(x,y) = 3\sin(x) - 2\cos(2y) - x - y$.

x, y = var('x,y')
f = 3*sin(x) - 2*cos(2*y) - x - y
g = f.gradient()
show(g)
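As a sanity check, the gradient should come out to $(3\cos(x) - 1,\ 4\sin(2y) - 1)$. This can be verified outside Sage with a plain-Python finite-difference sketch (the helper names here are illustrative, not part of the assignment):

```python
import math

def f(x, y):
    return 3*math.sin(x) - 2*math.cos(2*y) - x - y

def grad_exact(x, y):
    # Hand-computed partials: d/dx = 3*cos(x) - 1, d/dy = 4*sin(2*y) - 1
    return (3*math.cos(x) - 1, 4*math.sin(2*y) - 1)

def grad_numeric(x, y, h=1e-6):
    # Central differences approximate each partial to O(h^2).
    return ((f(x + h, y) - f(x - h, y)) / (2*h),
            (f(x, y + h) - f(x, y - h)) / (2*h))

print(grad_exact(0.5, -0.3), grad_numeric(0.5, -0.3))
```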

1b. Plot the 2-dimensional vector field defined by the gradient of $f$ in the rectangle $(-2,-2) \leq (x,y) \leq (2,2)$.

p = plot_vector_field(g, (x, -2, 2), (y, -2, 2))
show(p)
Image in a Jupyter notebook

Problem 2: 3D plotting

Grading criterion: correctness of code.

2a. Draw a 3D plot of a torus.

# Built-in torus plot
from sage.plot.plot3d.shapes import Torus
inner_radius = .3; outer_radius = 1
show(Torus(outer_radius, inner_radius, aspect_ratio=1))
# Torus via a parametric 3D plot
x, y = var('x,y')
a = (cos(x) + 3)*cos(y)
b = (cos(x) + 3)*sin(y)
c = sin(x)
parametric_plot3d([a, b, c], (x, 0, 2*pi), (y, 0, 2*pi), color="green", aspect_ratio=1)

2b. Draw a single 3D plot containing the five regular polytopes in it: tetrahedron, cube, octahedron, dodecahedron, icosahedron. All five must be visible.

a = cube().translate(0, 0, 0)
b = tetrahedron().translate(2, 0, 0)
c = octahedron().translate(4, 0, 0)
d = dodecahedron().translate(6, 0, 0)
e = icosahedron().translate(8, 0, 0)
show(a + b + c + d + e)

2c. Draw a 3D plot of the Mexican hat function. Try to choose the parameter $\sigma$ to get an authentic "sombrero" look.

x, y = var('x,y')
sigma = .4
plot3d(1.0/pi*1.0/sigma^2.0*(1.0-.5*((x^2.0+y^2.0)/sigma^2.0))*exp((x^2.0+y^2.0)/(-2.0*sigma^2.0)), (x, -2, 2), (y, -2, 2))
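The surface being plotted is the 2-D Ricker ("Mexican hat") wavelet. A small numpy sketch of the same formula is handy for checking values; for instance, the peak at the origin is $\frac{1}{\pi\sigma^2}$ (the function name below is illustrative):

```python
import numpy as np

def mexican_hat(x, y, sigma=0.4):
    """(1/(pi*sigma^2)) * (1 - r^2/(2*sigma^2)) * exp(-r^2/(2*sigma^2)), r^2 = x^2 + y^2."""
    r2 = x**2 + y**2
    return (1.0 / (np.pi * sigma**2)) * (1.0 - 0.5 * r2 / sigma**2) * np.exp(-r2 / (2.0 * sigma**2))

print(mexican_hat(0.0, 0.0))  # peak value: 1/(pi*sigma^2)
```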

Problem 3: MATLAB (and Octave) vs. Sage

Grading criterion: correctness of code.

This exercise refers to the Math 18 MATLAB exercise set.

3a. Do MATLAB exercise 4.5 twice: once using Octave, and a second time using Sage. (For this problem, you will need to switch the kernel between Octave and SageMath.)

P = [0.8100 0.0800 0.1600 0.1000; 0.0900 0.8400 0.0500 0.0800; 0.0600 0.0400 0.7400 0.0400; 0.0400 0.0400 0.0500 0.7800]
x0 = [0.5106; 0.4720; 0.0075; 0.0099]
[Q,D] = eig(P)
Q^(-1)*P*Q  # Verifying we get the right answer; note this is D, up to numerical rounding errors.
# To compute the limit of D^n as n goes to infinity, note that D = diag(1, .673, .760, .737), so that D^n = diag(1^n, .673^n, .760^n, .737^n).
# Thus D^n -> diag(1, 0, 0, 0). This can be approximated numerically by taking n somewhat large, as below:
D^30
Dinf = [1 0 0 0; 0 0 0 0; 0 0 0 0; 0 0 0 0]
Pinf = Q*Dinf*Q^(-1)
Pinf*x0
y = [.25; .1; .4; .25]
Pinf*y
# The answer remains unchanged, regardless of y. Note that the rows of Pinf are constant; thus, if y = [a;b;c;d] with a+b+c+d = 1, we have
# Pinf*y = [.354(a+b+c+d); ...] = [.354(1); ...] = [.354; ...]
P =

   0.810000   0.080000   0.160000   0.100000
   0.090000   0.840000   0.050000   0.080000
   0.060000   0.040000   0.740000   0.040000
   0.040000   0.040000   0.050000   0.780000

x0 =

   0.5106000
   0.4720000
   0.0075000
   0.0099000

Q =

   0.665624   0.767579   0.543214  -0.464102
   0.616521  -0.284124  -0.814822  -0.125380
   0.294620  -0.568248   0.181071  -0.250760
   0.300076   0.084793   0.090536   0.840243

D =

Diagonal Matrix

   1.00000         0         0         0
         0   0.67298         0         0
         0         0   0.76000         0
         0         0         0   0.73702

ans =

   1.0000e+00  -2.5476e-16   1.8306e-16  -1.0952e-16
   6.5954e-17   6.7298e-01  -5.0609e-16   3.9416e-16
  -2.2671e-16  -1.1194e-16   7.6000e-01  -4.4565e-16
  -5.2261e-17   5.8661e-17  -1.9013e-16   7.3702e-01

ans =

Diagonal Matrix

   1.0000e+00            0            0            0
            0   6.9207e-06            0            0
            0            0   2.6571e-04            0
            0            0            0   1.0575e-04

Dinf =

   1   0   0   0
   0   0   0   0
   0   0   0   0
   0   0   0   0

Pinf =

   0.35465   0.35465   0.35465   0.35465
   0.32849   0.32849   0.32849   0.32849
   0.15698   0.15698   0.15698   0.15698
   0.15988   0.15988   0.15988   0.15988

ans =

   0.35465
   0.32849
   0.15698
   0.15988

y =

   0.25000
   0.10000
   0.40000
   0.25000

ans =

   0.35465
   0.32849
   0.15698
   0.15988
# Everything is identical to the above, modulo small changes in syntax and function names
P = Matrix(CDF, [[.8100,.0800,.1600,.1000],[.0900,.8400,.0500,.0800],[.0600,.0400,.7400,.0400],[.0400,.0400,.0500,.7800]])
x0 = vector([0.5106, 0.4720, 0.0075, 0.0099])
D = P.eigenmatrix_right()[0]
Q = P.eigenmatrix_right()[1]
print('D=')
print(D)
print('Q=')
print(Q)
L = diagonal_matrix([1.0, 0.0, 0.0, 0.0])
print('L=')
print(L)
Pinf = Q*L*Q^(-1)
print('Pinf=')
print(Pinf)
print('Pinf*x0=')
print(Pinf*x0)
y = vector([.25, .1, .4, .25])
print('Pinf*y=')
print(Pinf*y)
# The explanation for why Pinf*y equals Pinf*x0 is the same as given before.
D=
[1.0000000000000004                0.0                0.0                0.0]
[               0.0 0.6729843788128368                0.0                0.0]
[               0.0                0.0 0.7599999999999995                0.0]
[               0.0                0.0                0.0 0.7370156211871641]
Q=
[ 0.6656239985873295  0.7675789757982624 -0.5432144762551129  -0.4641024066208631]
[ 0.6165205888554787 -0.2841241259938047   0.814821714382666 -0.12538014809320114]
[ 0.2946204583911127  -0.568248251987594 -0.18107149208503445  -0.250760296186409]
[0.30007639280576415 0.08479340218313634 -0.09053574604252085  0.8402428509004729]
L=
[ 1.00000000000000 0.000000000000000 0.000000000000000 0.000000000000000]
[0.000000000000000 0.000000000000000 0.000000000000000 0.000000000000000]
[0.000000000000000 0.000000000000000 0.000000000000000 0.000000000000000]
[0.000000000000000 0.000000000000000 0.000000000000000 0.000000000000000]
Pinf=
[ 0.354651162790697  0.3546511627906978  0.3546511627906968 0.35465116279069725]
[ 0.3284883720930234  0.3284883720930241 0.32848837209302323  0.3284883720930236]
[0.15697674418604607  0.1569767441860464 0.15697674418604599 0.15697674418604618]
[0.15988372093023273 0.15988372093023306 0.15988372093023265 0.15988372093023284]
Pinf*x0=
(0.3546511627906974, 0.32848837209302373, 0.15697674418604624, 0.1598837209302329)
Pinf*y=
(0.3546511627906971, 0.32848837209302345, 0.15697674418604612, 0.1598837209302328)
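The same steady-state computation can also be sketched in plain numpy outside Sage. A hedged sketch: numpy's eig returns eigenvectors as the columns of its second return value, so we locate the column for eigenvalue 1 and normalize it to sum to 1:

```python
import numpy as np

P = np.array([[0.81, 0.08, 0.16, 0.10],
              [0.09, 0.84, 0.05, 0.08],
              [0.06, 0.04, 0.74, 0.04],
              [0.04, 0.04, 0.05, 0.78]])
x0 = np.array([0.5106, 0.4720, 0.0075, 0.0099])

# Find the eigenvector for the eigenvalue closest to 1 and scale it
# to be a probability vector; this is the common column of Pinf.
w, Q = np.linalg.eig(P)
k = np.argmin(abs(w - 1.0))
steady = Q[:, k].real
steady = steady / steady.sum()
print(steady)  # the limit P^n x0 for any probability vector x0
```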

3b. Repeat with MATLAB exercise 5.6, using numpy to obtain the analogue of MATLAB's least squares functionality.

B = [1 75; 1 100; 1 128; 1 159; 1 195]
d = [15; 23; 26; 34; 38]
[Q,R] = qr(B, 0)
x = Q(:, 1)
y = Q(:, 2)
v = dot(x,d)*x + dot(y,d)*y
c = B\v
B*c - v
[cl,a,b] = lscov(B, d, eye(5))
B =

     1    75
     1   100
     1   128
     1   159
     1   195

d =

   15
   23
   26
   34
   38

Q =

  -0.447214  -0.594998
  -0.447214  -0.331258
  -0.447214  -0.035869
  -0.447214   0.291169
  -0.447214   0.670955

R =

    -2.23607  -293.81933
     0.00000    94.79029

x =

  -0.44721
  -0.44721
  -0.44721
  -0.44721
  -0.44721

y =

  -0.594998
  -0.331258
  -0.035869
   0.291169
   0.670955

v =

   16.538
   21.264
   26.557
   32.418
   39.223

c =

   2.35959
   0.18904

ans =

   3.5527e-15
   0.0000e+00
   3.5527e-15
   1.4211e-14
   1.4211e-14

cl =

   2.35959
   0.18904

a =

   2.617713
   0.018959

b = 3.2298
import numpy as np
B = matrix(CDF, [[1, 75], [1, 100], [1, 128], [1, 159], [1, 195]])  # Set up B, d
d = vector([15, 23, 26, 34, 38])
Q, R = B.QR()  # Find an orthonormal basis for the column space of B
x = vector(Q[:, 0])
y = vector(Q[:, 1])
v = x.dot_product(d)*x + y.dot_product(d)*y  # Project d onto the column space of B to get v
vshort = vector([v[0], v[1]])
# Here is the trick: every two-by-two submatrix of B is invertible, since each pair of rows in B is linearly independent.
# Hence to solve Bc = v, it is completely equivalent to take the top 2x2 submatrix of B, call it Bshort, and the top 2
# entries of v, say vshort, and solve Bshort*c = vshort. Convince yourself that the solution c is the same as the c in Bc = v.
Bshort = Matrix(CDF, [[1, 75], [1, 100]])
c = Bshort.solve_right(vshort)
# Because Bshort is a square matrix, we can use Sage's solve_right command to get the required answer. You can verify
# that this agrees with the Octave answer.
print(c)
# Verifying with numpy:
print(np.linalg.lstsq(B, d)[0])
(2.3595913279615033, 0.18904420602769023)
[ 2.35959133+0.j  0.18904421+0.j]
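As a cross-check on both computations above, the least-squares coefficients can also be obtained directly from the normal equations $B^T B c = B^T d$; a minimal numpy sketch:

```python
import numpy as np

B = np.array([[1., 75.], [1., 100.], [1., 128.], [1., 159.], [1., 195.]])
d = np.array([15., 23., 26., 34., 38.])

# Normal equations: B^T B c = B^T d yields the least-squares solution.
c = np.linalg.solve(B.T @ B, B.T @ d)
print(c)  # matches the QR-based answer above
```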

Problem 4: Eigenvectors and eigenvalues

Grading criteria: correctness of code and explanations.

Let $M$ be the following matrix with rational entries: $$M = \left(\begin{array}{rrrr} \frac{1}{2} & 0 & -1 & 0 \\ 1 & \frac{1}{2} & 1 & 1 \\ 0 & 1 & -1 & -1 \\ 2 & 1 & -2 & 1 \end{array}\right)$$

4a. State the Cayley-Hamilton theorem, then use Sage to verify that it holds for $M$.

The Cayley-Hamilton theorem states that $M$ satisfies its own characteristic polynomial; that is, if $f(x) = \det(x \cdot I - M)$, then $f(M) = 0$.

Source:

https://en.wikipedia.org/wiki/Cayley–Hamilton_theorem

# Verification:
M = matrix(CDF, [[1/2, 0, -1, 0], [1, 1/2, 1, 1], [0, 1, -1, -1], [2, 1, -2, 1]])
c = M.charpoly()
c(M)
[0.0 0.0 0.0 0.0] [0.0 0.0 0.0 0.0] [0.0 0.0 0.0 0.0] [0.0 0.0 0.0 0.0]

4b. Compute the eigenvalues and eigenvectors of $M$, and verify (using numerical approximations) that each eigenvector is indeed an eigenvector with the specified eigenvalue.

evals = [i[0] for i in M.right_eigenvectors()]
evects = [i[1][0] for i in M.right_eigenvectors()]
for i in range(4):
    print(str(evals[i]) + ' is an eigenvalue with eigenvector ' + str(evects[i]))
for i in range(4):
    print('Eigenvector ' + str(i+1) + ' verification')
    print(M*evects[i])
    print(evals[i]*evects[i])
2.03354622791 is an eigenvalue with eigenvector (0.06804366661981474, 0.5252410268111855, -0.10434810827783073, 0.8417858370667097)
0.267172875651 is an eigenvalue with eigenvector (-0.6033085821062105, 0.4589286728617739, -0.14046660226706817, 0.636924141189412)
-2.30071910356 is an eigenvalue with eigenvector (0.2538601633286291, -0.49718243018224134, 0.7109910090669525, 0.42761715776942244)
1.0 is an eigenvalue with eigenvector (-0.26490647141300844, 0.7947194142390258, 0.13245323570650383, 0.5298129428260185)
Eigenvector 1 verification
(0.1383699415877381, 1.0681019088142865, -0.21219670197769355, 1.711810413673186)
(0.13836994158773738, 1.0681019088142845, -0.21219670197769308, 1.711810413673183)
Eigenvector 2 verification
(-0.1611876887860371, 0.1226132932470202, -0.037528866060569904, 0.1701688543729012)
(-0.1611876887860372, 0.1226132932470201, -0.0375288660605698, 0.17016885437290075)
Eigenvector 3 verification
(-0.5840609274026379, 1.1438771150738833, -1.6357905970186162, -0.9838269638894657)
(-0.5840609274026379, 1.1438771150738838, -1.6357905970186162, -0.9838269638894657)
Eigenvector 4 verification
(-0.26490647141300805, 0.7947194142390269, 0.13245323570650344, 0.5298129428260197)
(-0.26490647141300844, 0.7947194142390258, 0.13245323570650383, 0.5298129428260185)
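The same verification can be done outside Sage with numpy (a sketch; np.linalg.eig returns the eigenvalues together with a matrix whose columns are the corresponding eigenvectors):

```python
import numpy as np

M = np.array([[0.5, 0.0, -1.0, 0.0],
              [1.0, 0.5,  1.0, 1.0],
              [0.0, 1.0, -1.0, -1.0],
              [2.0, 1.0, -2.0, 1.0]])

w, V = np.linalg.eig(M)
# Check M v = lambda v for each eigenpair (eigenvectors are columns of V).
for lam, v in zip(w, V.T):
    assert np.allclose(M @ v, lam * v)
print(np.sort(w.real))
```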

4c. State the relationship between the characteristic polynomial, the determinant, and the eigenvalues of a matrix; then verify numerically that this holds for $M$.

Let $M$ be an $n\times n$ matrix, $c(x)$ its characteristic polynomial, $\det(M)$ its determinant, and $[\lambda_1,\dots,\lambda_n]$ a list of its eigenvalues (with multiplicity). Then we have:
$c(0) = (-1)^n \cdot \det(M)$
$c(x) = \prod_{i=1}^n (x - \lambda_i)$
$\det(M) = \prod_{i=1}^n \lambda_i$

# Verification:
print('Verifying formula 1:')
print(c(0))
print((-1)^4*M.determinant())
print('Verifying formula 2:')
g = 1
for i in evals:
    g = g*(x - i)
print(expand(g))
print(c)
print('Verifying formula 3:')
d = 1
for i in evals:
    d = d*i
print(d)
print(M.determinant())
Verifying formula 1:
-1.25
-1.25
Verifying formula 2:
x^4 - 0.9999999999999956*x^3 - 4.7499999999999964*x^2 + 5.999999999999989*x - 1.2499999999999973
x^4 - x^3 - 4.75*x^2 + 6.0*x - 1.25
Verifying formula 3:
-1.25
-1.25

Problem 5: Hilbert matrices and numerical stability

Grading criterion: correctness of code and thoroughness of explanation.

5a. Define a Python function f which, on the input of a positive integer $n$, returns the $n \times n$ Sage matrix over the rational numbers with entries $H_{ij} = \frac{1}{i+j-1}$ $(i,j=1,\dots,n)$. See below for a sample output. Hint: remember that Sage, like Python, starts indexing with 0 instead of 1.

def f(n):
    # The try...except construction is a useful way to implement control over the input to your function.
    # Here we raise an error if the input is not in fact a positive integer.
    try:
        a = float(n)
        if a.is_integer() and a > 0:
            b = int(a)
            H = matrix(QQ, b, b, lambda x, y: 1/(x + y + 1))
            return(H)
        else:
            raise ValueError
    except ValueError:
        raise ValueError
print('Example input of n=6:')
print(f(6))
Example input of n=6:
[   1  1/2  1/3  1/4  1/5  1/6]
[ 1/2  1/3  1/4  1/5  1/6  1/7]
[ 1/3  1/4  1/5  1/6  1/7  1/8]
[ 1/4  1/5  1/6  1/7  1/8  1/9]
[ 1/5  1/6  1/7  1/8  1/9 1/10]
[ 1/6  1/7  1/8  1/9 1/10 1/11]
## Sample output of f(3):
[  1 1/2 1/3]
[1/2 1/3 1/4]
[1/3 1/4 1/5]
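The same construction can be sketched in plain Python with exact rational arithmetic from the standard library (an illustrative sketch using fractions.Fraction in place of Sage's QQ; the name hilbert is ours):

```python
from fractions import Fraction

def hilbert(n):
    """n-by-n Hilbert matrix: entry (i, j) is 1/(i+j+1) with 0-based indices."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]

print(hilbert(3)[0])  # [Fraction(1, 1), Fraction(1, 2), Fraction(1, 3)]
```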

5b. Write Sage code to compute the inverse of f(25), print out the top left entry (not the whole matrix), and verify that the whole answer agrees with the formula in Wikipedia.

H25 = f(25)
inverse = H25^(-1)
print('Entry computed with our function:')
print(inverse[0,0])
print('Entry computed with Wikipedia formula:')
# Many people made the mistake of plugging in i=j=0; recall that Sage indexes starting at 0, whereas
# most math sources (including Wikipedia) start at 1. Hence we really want i=j=1 in the Wikipedia formula.
a = (-1)^(1+1)*(1+1-1)*binomial(25+1-1, 25-1)*binomial(25+1-1, 25-1)*binomial(1+1-2, 1-1)^2
print(a)
Entry computed with our function: 625 Entry computed with wikipedia formula: 625

5c. Use the change_ring method to redo the computation of the inverse of f(25) over the ring RR (i.e., using floating-point real numbers with 53-bit double precision), again printing out the top left entry.

real_H25 = f(25).change_ring(RR)
real_inverse = real_H25^(-1)
real_inverse[0,0]
-202.301237192348

5d. In as much detail as you can, explain what you have just observed. You may want to compute the determinant of f(25) and/or read about numerical stability.

print('determinant when the matrix is initially considered over RR:')
print(det(real_H25))
print('')
print('determinant when the matrix is initially considered over rationals:')
print(det(H25))
print('')
print('numerical approximation of determinant when the matrix is initially considered over the rationals:')
print(det(H25).n())
determinant when the matrix is initially considered over RR:
2.91038304567337e-11

determinant when the matrix is initially considered over rationals:
1/746332515470541859307349018107431789872329747913727730760621191984777569063589205517303938509441565299809996599554994981363939681729476785432795097905555120762172958990784084457536372476333251919350316666089029356479086324949498677522696089012012028938712165715166669721997554754957779660912439584392373862400000000000000000000000000000000000000000000000000

numerical approximation of determinant when the matrix is initially considered over the rationals:
1.33988534503220e-357

Sources:

https://math.stackexchange.com/questions/1622610/when-is-inverting-a-matrix-numerically-unstable

https://en.wikipedia.org/wiki/Operator_norm#Table_of_common_operator_norms

http://web.mit.edu/ehliu/Public/Yelp/conditioning_and_precision.pdf

https://www.mathworks.com/help/symbolic/examples/hilbert-matrices-and-their-inverses.html

The Hilbert matrix is what is called "ill-conditioned," meaning that it is very close to being singular; see, for instance, its determinant, which is extremely close to zero. In particular, if we ask Sage to use floating-point arithmetic, we should expect rounding errors in the computation of real_H25^(-1). These rounding errors compound with each operation performed in computing the inverse, so at best we get an approximation of the true answer. Note that if you compute the product of real_H25 with its computed inverse, you do not get the identity. To save space, I'll simply print the first row below (note how far the off-diagonal entries are from zero!). This shows that the computed inverse of real_H25 is a terrible approximation. To get an exact answer, we should compute exactly rather than in floating point.
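The instability can be quantified by the condition number, the ratio of the largest to smallest singular value; roughly, log10 of it counts the decimal digits lost in solving or inverting. A hedged numpy sketch (using a smaller Hilbert matrix, since already well before n = 25 the conditioning exceeds double precision):

```python
import numpy as np

n = 12
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

# cond(H) is astronomically large even at n = 12: double precision carries
# only about 16 significant digits, so inverting H loses nearly all accuracy.
print(np.linalg.cond(H))
```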

for k in range(25):
    print((real_H25*real_inverse)[0,k])
0.996933907270432
0.459764480590820
-16.8348999023438
259.499511718750
-2045.50781250000
8745.09375000000
-18582.1250000000
5901.00000000000
58590.6005859375
-119485.812500000
84335.6250000000
-58311.6250000000
162770.250000000
-105992.500000000
-245995.500000000
322674.250000000
7818.50000000000
-70267.5000000000
-157862.500000000
107697.250000000
170158.250000000
-310291.000000000
256112.750000000
-121821.500000000
25431.6250000000

If instead of working over RR we tell Sage to work over the rational numbers, every computation is carried out exactly, and hence is not subject to any rounding errors. In particular, we may precisely calculate the inverse of H25; note the calculation below:

print(H25*inverse)
[1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]

Problem 6: Linear algebra over Q

Grading criteria: correctness of code and thoroughness of explanation.

6a. Explain in detail what the following code does and how it is being done. Assume that the input M is a matrix with entries in QQ.

def f(M):
    M = copy(M)  # replace M with a copy to avoid clobbering the original
    n = M.dimensions()[0]
    if n != M.dimensions()[1]:
        raise ValueError
    for i in range(n):
        j = i
        while j < n and M[j,i] == 0:
            j += 1
        if j == n:
            return(0)
        if i < j:
            M.swap_rows(i,j)
            M.rescale_row(i, -1)
        for k in range(i+1, n):
            M.add_multiple_of_row(k, i, -M[k,i]/M[i,i])
    return(prod(M[i,i] for i in range(n)))

The function $f$ works in several steps, with the ultimate goal of computing the determinant of $M$. The first step is to verify that the input $M$ is indeed a square matrix:

n = M.dimensions()[0]
if n != M.dimensions()[1]:
    raise ValueError

Next, $f$ iterates through the columns of $M$. The plan is to put $M$ into upper triangular form first, and then compute its determinant by taking the product of the diagonal entries:

for i in range(n):

In the $i$th column, $f$ starts by looking at the $(i,i)$ entry of $M$ and scans downward, looking for a nonzero entry on or below the diagonal. Once we find a nonzero entry $M[j,i]$, or reach the bottom of the column without finding one, the loop quits:

j = i
while j < n and M[j,i] == 0:
    j += 1

If there is no nonzero entry on or below the diagonal in column $i$, the function quits and returns $0$; this saves computation time when the matrix has determinant zero. (Once a matrix is in upper triangular form, its determinant is the product of the diagonal entries; a zero on the diagonal would therefore yield the same result of zero anyway):

if j == n:
    return(0)

If the first nonzero entry in column $i$ that is weakly below the diagonal is in fact strictly below the diagonal, we switch rows $i$ and $j$ in order to put a nonzero entry on the diagonal in column $i$. Note that this has the effect of scaling the determinant by $-1$; to account for this, we rescale row $i$ by $-1$ to keep the determinant invariant. It is crucial to note that the subdiagonal entries in column $k$, for $k<i$, are all $0$, so this process does not "cancel out" the work done in previous steps:

if i < j:
    M.swap_rows(i,j)
    M.rescale_row(i, -1)

Once we have a nonzero entry on the diagonal in column $i$, we zero out the entries strictly below the diagonal in that column. Recall from elementary linear algebra that adding a multiple of one row to another does not affect the determinant:

for k in range(i+1, n):
    M.add_multiple_of_row(k, i, -M[k,i]/M[i,i])

Once we have iterated through all columns of $M$, the matrix is in upper triangular form (unless a zero appeared on the diagonal, in which case we quit early as mentioned before). To finish, the function calculates the determinant of $M$ by simply taking the product of the diagonal entries:

return(prod(M[i,i] for i in range(n)))
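The elimination procedure described above can be sketched in plain Python, using fractions.Fraction for exact rational arithmetic (an illustrative translation of f, not the Sage original):

```python
from fractions import Fraction

def det_by_elimination(M):
    """Determinant via Gaussian elimination, mirroring the steps of f above."""
    M = [row[:] for row in M]                 # work on a copy
    n = len(M)
    assert all(len(row) == n for row in M)    # must be square
    for i in range(n):
        # Find a nonzero pivot at or below the diagonal in column i.
        j = i
        while j < n and M[j][i] == 0:
            j += 1
        if j == n:
            return Fraction(0)
        if i < j:
            # Swapping rows negates the determinant; negate row i to compensate.
            M[i], M[j] = M[j], M[i]
            M[i] = [-x for x in M[i]]
        # Clear the entries below the pivot (determinant unchanged).
        for k in range(i + 1, n):
            factor = M[k][i] / M[i][i]
            M[k] = [a - factor * b for a, b in zip(M[k], M[i])]
    result = Fraction(1)
    for i in range(n):
        result *= M[i][i]
    return result

A = [[Fraction(2), Fraction(1)], [Fraction(5), Fraction(3)]]
print(det_by_elimination(A))  # 1
```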

6b. Define a Python function g that, on input a positive integer n, returns an $n \times n$ matrix in which each entry is a Sage integer chosen uniformly at random from {0, 1}. (Hint: use the randint function, but read the docstring carefully!)

Then check that on the output of g(25), the function f agrees with the native Sage way of doing the same computation.

def g(n):
    try:
        a = float(n)
        if a.is_integer() and a > 0:
            M = matrix(QQ, n, n, lambda x, y: randint(0, 1))
            return(M)
        else:
            raise ValueError
    except ValueError:
        raise ValueError
random25 = g(25)
print('Using the function f:')
print(f(random25))
print('Using the Sage builtin:')
print(random25.det())
Using the function f:
-66050
Using the Sage builtin:
-66050

6c. Define the binary height of a nonzero rational number $\frac{r}{s}$, written in lowest terms, to be the quantity $\left\lfloor\log_2 r \right\rfloor + 1 + \left\lfloor \log_2 s \right\rfloor + 1$. Use the function binary_height defined below to compute this.

Take the code defining f, recopy it below, then modify it so that at the last step, instead of returning the product of the diagonal entries of M, it returns the maximum of the heights of the diagonal entries.

def binary_height(x):
    return(x.numer().nbits() + x.denom().nbits())
def f_modified(M):
    M = copy(M)  # replace M with a copy to avoid clobbering the original
    n = M.dimensions()[0]
    if n != M.dimensions()[1]:
        raise ValueError
    for i in range(n):
        j = i
        while j < n and M[j,i] == 0:
            j += 1
        if j == n:
            return(0)
        if i < j:
            M.swap_rows(i,j)
            M.rescale_row(i, -1)
        for k in range(i+1, n):
            M.add_multiple_of_row(k, i, -M[k,i]/M[i,i])
    return(max([binary_height(M[l,l]) for l in range(n)]))
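For reference, an equivalent plain-Python binary_height can be written with the standard library, since int.bit_length() plays the role of Sage's nbits() (an illustrative sketch):

```python
from fractions import Fraction

def binary_height(x):
    """floor(log2 r) + 1 + floor(log2 s) + 1 for x = r/s in lowest terms."""
    x = Fraction(x)  # Fraction reduces to lowest terms automatically
    return abs(x.numerator).bit_length() + x.denominator.bit_length()

print(binary_height(Fraction(3, 8)))  # 2 + 4 = 6
```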

6d. Explain, in words, the output of the following code. (It should run without errors!)

M = g(25).change_ring(QQ)
print(binary_height(f(M)))
print(f_modified(M))
16
29

If you run the code several times, you will notice that f_modified(M) is usually at least $\approx 1.5$ times as large as binary_height(f(M)). A few heuristics for why this should be: first, the determinant of the random matrix g(25) is an integer, since g(25) is itself a matrix with integer entries. Since integers have denominator 1, their binary height is in general much smaller than that of a rational number of similar size. Second, when reducing the matrix to upper triangular form, large denominators are introduced into the lower-right diagonal entries when we perform the operation

for k in range(i+1, n):
    M.add_multiple_of_row(k, i, -M[k,i]/M[i,i])

and thus the matrix that $f$ uses to calculate the determinant will have rational numbers of large height on its diagonal. It is not true in general that f_modified(M) is always larger than binary_height(f(M)), but in the average case the inequality will hold. See below for a counterexample:

A = diagonal_matrix([2/1, 2/1, 2/1, 2/1, 2/1, 2/1, 2/1, 2/1])
print(binary_height(f(A)))
print(f_modified(A))
10
3

6e. Sage does not use the method illustrated by the definition of f for matrices of integers or rational numbers; instead, it uses an approach based on the Chinese remainder theorem, in which one does the analogous computation modulo $p$ for a few primes $p$. Explain why you think this decision was made.

As we can see from the above examples, $f$ will usually introduce rational numbers of large height into the matrix, which take up extra memory and use more computing time (especially when the dimension of the input matrix is large). When we use the Chinese remainder theorem, the entries of the matrix are reduced mod $p$, and hence have size bounded above by $p$. This presumably decreases memory needs and increases speed.
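The modular strategy can be sketched in plain Python for integer matrices: compute the determinant modulo several primes, where every intermediate value stays small, then reassemble the integer answer with the Chinese remainder theorem. This is a simplified illustration (it assumes the product of the primes exceeds $2\,|\det M|$), not Sage's actual implementation:

```python
from functools import reduce

def det_mod_p(M, p):
    """Determinant of an integer matrix mod a prime p, by elimination over GF(p)."""
    M = [[x % p for x in row] for row in M]
    n, det = len(M), 1
    for i in range(n):
        j = next((r for r in range(i, n) if M[r][i]), None)
        if j is None:
            return 0
        if j != i:
            M[i], M[j] = M[j], M[i]   # row swap flips the sign
            det = -det
        det = det * M[i][i] % p
        inv = pow(M[i][i], -1, p)     # modular inverse (Python 3.8+)
        for k in range(i + 1, n):
            f = M[k][i] * inv % p
            M[k] = [(a - f * b) % p for a, b in zip(M[k], M[i])]
    return det % p

def crt(residues, primes):
    """Combine residues mod pairwise-coprime primes into one residue mod their product."""
    N = reduce(lambda a, b: a * b, primes)
    x = 0
    for r, p in zip(residues, primes):
        m = N // p
        x += r * m * pow(m, -1, p)
    return x % N

M = [[2, 7, 1], [9, 4, 3], [5, 8, 6]]
primes = [101, 103, 107]              # product far exceeds 2*|det M|
N = 101 * 103 * 107
d = crt([det_mod_p(M, p) for p in primes], primes)
if d > N // 2:                        # recover the sign
    d -= N
print(d)  # -221
```

All arithmetic in det_mod_p stays on integers smaller than p squared, which is exactly the memory and speed advantage discussed above.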