Author: John Jeng

# Math 480 - Grading for Homework 3:

Due Friday, April 22, 2016 by 6pm

# HOMEWORK TOTAL: 50 POINTS

• Go through your gradee's homework worksheet and assign a score to each question. State each question's score in a cell below it, along with a brief comment.
• State the total score you've assigned for the homework at the top of the worksheet. Put
TOTAL: X/50
in a cell on its own, where X is the total score for this homework.
• For visibility, I recommend coloring any text you add red. You're welcome to copy any of the red HTML text below and paste it in, suitably edited, if you so desire.

## Problem 1: Standard deviation

a. Write a straightforward Python function (not using anything special from numpy or Sage) to compute the standard deviation of a list of numbers.

b. Compare the speed of your program when given as input v = range(1000), v = range(10000), and v = range(100000) with the speed of standard deviation on that input as implemented in numpy, R, and stats.TimeSeries. Use %timeit.

c. Do a and b again, but for an implementation of mysd using Cython (you do not have to use any special features of Cython).

%html
<font color="red"><h2>20 Points Total</h2>
<h4>Part A (2 points total)</h4>
<p>Award:
<ul>
<li>2 points if their function behaves the same way as the standard deviation function built into Sage.</li>
<li>1 </li>
</ul>
</p>
<h4>Part B (6 points total, 2 points per part)</h4>
<p>Award:
<ul>
<li>3 points</li>
<li>2 points</li>
</ul>
</p>
<h4>Part C (12 points total, 3 points per part)</h4>
<p>Award:
<ul>
<li>3 points if they </li>
<li>2 points if they</li>
<li>1 point if they.</li>
<li>0 points for anything else</li>
</ul>
</p>
</font>



## 20 Points Total

#### Part A (2 points total)

Test that their function behaves the same way as the standard deviation function built into sage.

Award:

• 2 points if they did not use anything special outside of vanilla Python
• 1

#### Part B (6 points total, 2 points per part)

Award:

• 3 points
• 2 points

#### Part C (9 points total, 3 points per part)

Award:

• 3 points if they
• 2 points if they
• 1 point if they.
• 0 points for anything else

def mysd(data):
    # Compute the mean
    average = 0
    for entry in data:
        average += entry
    average = average/len(data)

    # Sum the squared deviations from the mean
    deviations_squared = 0
    for entry in data:
        deviations_squared += (entry - average)**2

    # Population variance (divides by n, not n - 1)
    variance = deviations_squared/len(data)

    # Return the standard deviation
    return sqrt(variance)

mysd([1..10]) == std([1..10])

1/2*sqrt(33) == sqrt(55/6)
std([1..10])

sqrt(55/6)
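The two results above differ because `mysd` computes the population standard deviation (dividing by n), while Sage's `std` defaults to the sample standard deviation (dividing by n-1). A quick plain-Python check of the two conventions, using the stdlib `statistics` module (my addition, not part of the worksheet):

```python
import statistics

data = list(range(1, 11))            # [1, 2, ..., 10]

pop_sd = statistics.pstdev(data)     # divides by n:     sqrt(33)/2  ~ 2.872
samp_sd = statistics.stdev(data)     # divides by n - 1: sqrt(55/6)  ~ 3.028

print(pop_sd, samp_sd)
```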
# Example Solution

print "My Times"
mysd(range(1000))
%timeit mysd(range(1000))
%timeit mysd(range(10000))
%timeit mysd(range(100000))
print ""

print "Timings using stats.TimeSeries"
t = stats.TimeSeries(range(1000))
r = stats.TimeSeries(range(10000))
s = stats.TimeSeries(range(100000))
t.standard_deviation()
%timeit r.standard_deviation()
%timeit t.standard_deviation()
%timeit s.standard_deviation()
print ""

print "Timings using Numpy"
import numpy
numpy.std(range(1000), ddof=1)
%timeit numpy.std(range(1000), ddof=1)
%timeit numpy.std(range(10000), ddof=1)
%timeit numpy.std(range(100000), ddof=1)
print ""

My Times
3/2*sqrt(37037)
125 loops, best of 3: 2.66 ms per loop
25 loops, best of 3: 24.9 ms per loop
5 loops, best of 3: 269 ms per loop

Timings using stats.TimeSeries
288.8194360957494
625 loops, best of 3: 22.3 µs per loop
625 loops, best of 3: 2.48 µs per loop
625 loops, best of 3: 222 µs per loop

Timings using Numpy
288.81943609574938
625 loops, best of 3: 135 µs per loop
625 loops, best of 3: 993 µs per loop
25 loops, best of 3: 10.1 ms per loop
print "times in r"
#time in r
reset("r")
%r
#x <- c(1:1000)
#sd(x, na.rm = FALSE)
start.time <- Sys.time()
sd(1:1000, na.rm = FALSE)
end.time <- Sys.time()
total.time <- end.time - start.time
total.time

start.time <- Sys.time()
sd(1:10000, na.rm = FALSE)
end.time <- Sys.time()
total.time <- end.time - start.time
total.time

start.time <- Sys.time()
sd(1:100000, na.rm = FALSE)
end.time <- Sys.time()
total.time <- end.time - start.time
total.time
#system.time(a <- sd(1:1000))
#system.time(a <- sd(1:10000))
#system.time(a <- sd(1:100000))

times in r
288.8194
Time difference of 0.005102158 secs
2886.896
Time difference of 0.00421977 secs
28867.66
Time difference of 0.007123947 secs
# Test: your code should agree with the std function that is built into Sage.
std([1..10])

sqrt(55/6)
%cython
import math
from sage.all import Integer, sqrt as sqrt0   # type Integer instead of sage.all.Integer
def mysd_cython(v):
    numbs = Integer(len(v))
    total = 0
    for n in v:
        total += n
    avg = total/numbs
    dev = 0
    for n in v:
        res = n - avg
        dev += res**2
    return sqrt0(dev/(numbs - 1))


Defined Integer, math, mysd_cython, sqrt0
Auto-generated code...
mysd_cython([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
%timeit mysd_cython(range(1000))
%timeit mysd_cython(range(10000))
%timeit mysd_cython(range(100000))

sqrt(55/6)
125 loops, best of 3: 2.62 ms per loop
25 loops, best of 3: 21.6 ms per loop
5 loops, best of 3: 180 ms per loop
#In conclusion: pure Python is the slowest, Cython is only slightly faster, R is slightly faster than both, and numpy and stats.TimeSeries are much faster than the rest.

%md
## Problem 2: Which approach

Imagine you're a tenure-track academic researcher.  To do some key research,
you need to implement code to compute a function $f(n)$, and use it to
evaluate $\alpha = f(1) + f(2) + \cdots + f(100)$.

You can implement this code in two ways:

**STRATEGY 1:** Work fulltime for one month implementing and debugging a very fast algorithm
in C, so that actually computing $\alpha$ only takes $1$ hour on your laptop.

**STRATEGY 2:** Spend about 3 hours writing a non-optimized Python program (that
calls some other slow code), then let your code run on a cluster
for 1 month (running the program costs \$2K in grant money). Come up with a creative (but serious) argument in favor of each of the above strategies. (How this will be graded: (a) you may lose points for mistakes in English spelling and grammar, (b) you must have put real thought into the problem.)  ## Problem 2: Which approach Imagine you're an a tenure-track academic researcher. To do some key research, you need to implement code to compute a function $f(n)$, and use it to evaluate $\alpha = f(1) + f(2) + \cdots + f(100)$. You can implement this code in two ways: STRATEGY 1: Work fulltime for one month implementing and debugging a very fast algorithm in C, so that actually computing $\alpha$ only takes $1$ hour on your laptop. STRATEGY 2: Spend about 3 hours writing a non-optimized Python program (that calls some other slow code), then let your code run on a cluster for 1 month (running the program costs$2K in grant money).

Come up with a creative (but serious) argument in favor of each of the above strategies.

(How this will be graded: (a) you may lose points for mistakes in English spelling and grammar, (b) you must have put real thought into the problem.)




%md
## Problem 3:

Give a rough order-of-magnitude estimate for how long each of the following should take:

1. Doing a Google search for math software (just the time that Google reports it having taken).
2. A server in Seattle pinging a server in California (both with optimal network connections).
3. A server in Seattle pinging a server in Japan (both with optimal network connections).
4. Adding together 1 million double precision floating point numbers using pure Python.
5. Adding together 1 million double precision floating point numbers using Cython (or C) with type declarations.
6. Computing the determinant of a 100x100 matrix whose entries are random single-digit integers (using Sage). No rounding errors.
7. 1 million separate writes of random numbers to an SSD disk. By "separate write" we mean that you flush the write to disk every time. You can look up specs for the number of IO operations per second for some common SSDs by searching on Google.
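The ping and SSD items can be sanity-checked with back-of-envelope arithmetic. The distances, fiber speed, and IOPS figure below are rough assumptions of mine, chosen only to land in the right order of magnitude:

```python
# Light in fiber travels at roughly 2/3 the speed of light (~200,000 km/s)
fiber_km_per_s = 200000.0

# Assumed round-trip distances (not measured)
seattle_to_california_rt_km = 1800.0     # ~900 km each way
seattle_to_japan_rt_km = 15500.0         # ~7750 km each way

ping_ca_ms = seattle_to_california_rt_km / fiber_km_per_s * 1000   # ~9 ms
ping_jp_ms = seattle_to_japan_rt_km / fiber_km_per_s * 1000        # ~78 ms

# SSD: 1 million flushed writes at an assumed ~50,000 IOPS
ssd_seconds = 1000000 / 50000.0                                    # ~20 s

print(ping_ca_ms, ping_jp_ms, ssd_seconds)
```

Real pings will be somewhat slower than these speed-of-light floors because of routing and switching overhead.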




%cython

def getTotal(v):
    # Note: Cython 'float' is C single precision; 'double' would match the problem
    cdef float total = 0.0
    for n in v:
        total += n
    return total

Defined getTotal
Auto-generated code...

s = set(range(1, 1000000))
%timeit getTotal(s)

25 loops, best of 3: 50.4 ms per loop
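Two caveats about the cell above (my observation): the input is a `set` rather than a list, and its elements are ints, while the problem asks about double-precision floats. A sketch of the pure-Python version on the data the problem actually specifies:

```python
# Sum 1 million double-precision floats with a pure-Python loop
v = [float(n) for n in range(1000000)]   # 0.0, 1.0, ..., 999999.0

total = 0.0
for x in v:
    total += x

print(total)   # 499999500000.0 (exact: every partial sum fits in 53 bits)
```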

m = matrix(100, 100, 7)
m.parent()
%timeit det(m)

Full MatrixSpace of 100 by 100 dense matrices over Integer Ring
625 loops, best of 3: 238 ns per loop
%md
## Problem 4: Techniques used by projects

Track down projects that use various tools that we discussed this week.

1. Give five examples of existing open source Python projects that use Cython. What do they use Cython for?

2. Give five examples of existing open source Python projects that use Numpy. What is something they use Numpy for?

3. Give three examples of existing open source Python projects that use Numba. How do they use numba?



Question 1

1. Sage - Call useful methods at Cython speed (ex. Fast Hashable Class http://doc.sagemath.org/html/en/reference/misc/sage/misc/fast_methods.html)
2. Crux
3. Plasticity
4. GIPSY
5. PyYAML

These projects use Cython to keep Python's expressiveness while running at speeds closer to C (faster than pure Python).

Question 2

1. SciPy (https://www.scipy.org/)
2. PyTables (http://www.pytables.org/)
3. Pathomx (http://pathomx.org/)
4. NeuralTalk (https://github.com/karpathy/neuraltalk)
5. Veusz (http://freecode.com/projects/veusz)

NumPy is used as an efficient multi-dimensional container of generic data, letting these projects integrate seamlessly with many different data structures (source: http://www.numpy.org/)
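As a minimal illustration of the kind of vectorized work these projects rely on NumPy for (a generic sketch of mine, not taken from any of the projects above):

```python
import numpy as np

# A homogeneous array computed on without any Python-level loops
data = np.arange(1, 11, dtype=np.float64)   # [1.0, 2.0, ..., 10.0]

mean = data.mean()             # 5.5
samp_sd = data.std(ddof=1)     # sample standard deviation, sqrt(55/6)

print(mean, samp_sd)
```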

Question 3

1. Numbas (https://github.com/numbas/Numbas)
2. numbagg (https://github.com/shoyer/numbagg)
3. numbavec (https://github.com/mvsaha/numbavec)

Numba is used to JIT-compile numerical functions so they run faster than in pure Python.

## Problem 5:

1. Write a simple naive function using a for loop to compute $s(n) = 1 + 2 + 3 + \cdots + n$, the sum of the first $n$ integers. Do not use the formula $s(n)=n(n+1)/2$. Use %timeit and/or %time to see how long your function takes when $n=10^5$ and $n=10^6$.
2. Ensure that the numbers are of type Python int in part 1 (rather than Sage's Integers). How much does that improve the timings?
3. Use Numba to try to speed up your Python function and time it (put %python at the top of any cell in which you use numba). Does Numba help?
4. Compile the exact same function using Cython and time it.
5. Add Cython type declarations, so assume that everything is cdef long, and compile it.
6. Do 1-5 above again, but using the formula $s(n)=n(n+1)/2$ instead of a for loop.
def getSum(x):
    total = 0
    for n in range(1, x+1):
        total += n
    return total

%timeit getSum(100000)
%timeit getSum(1000000)

5 loops, best of 3: 43.2 ms per loop
5 loops, best of 3: 469 ms per loop
#make numbers python int
def getSum(x):
    total = int(0)
    for n in range(1, x+1):
        total += int(n)
    return total

%timeit getSum(100000)
%timeit getSum(1000000)

25 loops, best of 3: 16.7 ms per loop
5 loops, best of 3: 183 ms per loop
#roughly more than doubles the speed

#using numba
%python

import numba
@numba.jit
def getSum(x):
    total = int(0)
    for n in range(1, x+1):
        total += int(n)
    return total

%timeit getSum(100000)
%timeit getSum(1000000)

5 loops, best of 3: 27.5 ms per loop
5 loops, best of 3: 329 ms per loop
#numba doesn't make it faster

#using cython
%cython
import math
def getSum(x):
    total = int(0)
    for n in range(1, x+1):
        total += int(n)
    return total


Defined getSum, math
Auto-generated code...
%timeit getSum(100000)
%timeit getSum(1000000)

125 loops, best of 3: 4.51 ms per loop
5 loops, best of 3: 55.3 ms per loop
#adding type declarations
%cython
import math

def getSum(x):
    cdef long total = 0
    for n in range(1, x+1):
        total += int(n)
    return total


Defined getSum, math
Auto-generated code...

%timeit getSum(100000)
%timeit getSum(1000000)

125 loops, best of 3: 4.36 ms per loop
5 loops, best of 3: 56.4 ms per loop
#now not using a loop
def getSum(n):
return  n * (n+1) / 2

%timeit getSum(100000)
%timeit getSum(1000000)

625 loops, best of 3: 875 ns per loop
625 loops, best of 3: 1.02 µs per loop
#make numbers python int
def getSum(n):
    return n * (n+1) / 2

%timeit getSum(int(100000))
%timeit getSum(int(1000000))

625 loops, best of 3: 2.22 µs per loop
625 loops, best of 3: 2.2 µs per loop
#makes it slightly slower

#using numba
%python

import numba
@numba.jit
def getSum(n):
    return n * (n+1) / 2

%timeit getSum(100000)
%timeit getSum(1000000)

625 loops, best of 3: 19.5 µs per loop
625 loops, best of 3: 26.1 µs per loop
#numba made it slightly slower

#using cython
%cython
import math

def getSum(n):
    return n * (n+1) / 2

Defined getSum, math
Auto-generated code...
%timeit getSum(int(100000))
%timeit getSum(int(1000000))

625 loops, best of 3: 638 ns per loop
625 loops, best of 3: 613 ns per loop
#add type declaration
%cython
import math

def getSum(n):
    cdef long total = n * (n+1) / 2
    return total

Defined getSum, math
Auto-generated code...
%timeit getSum(int(100000))
%timeit getSum(int(1000000))

625 loops, best of 3: 301 ns per loop
625 loops, best of 3: 301 ns per loop
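As a closing sanity check (my addition), the loop and the closed form s(n) = n(n+1)/2 agree on all inputs, which is what makes the timing comparisons above fair:

```python
def sum_loop(n):
    # Explicit loop: s(n) = 1 + 2 + ... + n
    total = 0
    for k in range(1, n + 1):
        total += k
    return total

def sum_formula(n):
    # Closed form; // keeps the result an exact int
    return n * (n + 1) // 2

# The two implementations agree on a range of inputs
assert all(sum_loop(n) == sum_formula(n) for n in range(200))
print(sum_formula(1000000))   # 500000500000
```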