Project: MAT 1630

Path: Notes.ipynb

Image: ubuntu2004

Kernel: Python 3 (system-wide)

Notes

This notebook will contain brief notes on topics covered in the course.

Data Types, Operators, Expressions, and Comparisons

Data Types

There are many different classes of objects in Python. We will not cover all of them here. Below is a short list of some of the basic objects (datatypes) we will commonly use:

int: Integers like ..., -2, -1, 0, 1, 2, .... Python integers can be arbitrarily large (limited only by available memory), but you must be careful with division: / always returns a float, while // performs floor division.
float: Floating point numbers have a decimal part like 2.1 and 3.000023. Because the computer uses binary to represent numbers, not every number can be perfectly represented by a float.
bool: Boolean values are either True or False and typically arise as the result of comparing other objects.
list: A list is an ordered collection of objects. Lists are often used when a procedure must be repeated for each element. For example [1,2,3,5] is a list. Note: lists use square brackets and can contain different data types.
string (str): A string is essentially a sequence of characters. For example, 'python' is a string. Note: the int 1 is different from the string '1'.

You can find an object's type by using the type() function.

In [1]:

type(1)

int

In [2]:

type(3.2)

float

In [3]:

type(True)

bool

In [4]:

type([1,2,3])

list

In [5]:

type('3')

str
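
The remarks above about division and about floats not being exactly representable can be checked with a short sketch. The variable names are illustrative only, and math.isclose is one common way to compare floats, not something these notes prescribe.

```python
import math

# / always produces a float; // performs floor division
true_quotient = 7 / 2       # 3.5
floor_quotient = 7 // 2     # 3
negative_floor = -7 // 2    # -4 (floor division rounds toward negative infinity)

# 0.1, 0.2, and 0.3 have no exact binary representation,
# so the computed sum is not exactly equal to 0.3
exact_match = (0.1 + 0.2 == 0.3)            # False
close_match = math.isclose(0.1 + 0.2, 0.3)  # True

print(true_quotient, floor_quotient, negative_floor)
print(exact_match, close_match)
```

When comparing floats, testing for closeness is usually safer than testing for equality.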

Operators and Expressions

Operators allow you to take data types and manipulate their values. The same symbol may do different things depending on the data types used as inputs. Expressions are just how we tell the computer what we would like it to do.

In [6]:

3 + 4

7

In [7]:

3 / 2

1.5

In [8]:

[1,2,3] + [4,5,6]

[1, 2, 3, 4, 5, 6]

In [9]:

'python' + ' ' + 'is great'

'python is great'

In [10]:

3 + 4 * 2

11

Comparisons

Comparisons are expressions that usually evaluate to True or False. We use these when writing code that reacts to a given input.

In [11]:

3 <= 4

True

In [12]:

'Hello' == 'hello'

False

Variables

Variables act as storage or placeholders for different data types, which allows us to write abstract code that can handle unknown input values. Variable names are case sensitive and cannot begin with a number. There are a few other rules regarding variable names. As the scope of your project grows, use longer, descriptive variable names to keep the code readable. Note that = is different from ==. In particular, = assigns the value on the right to the variable on the left, while == returns a boolean value when comparing the objects on the right and left.

In [13]:

x = 2

In [14]:

x + 3

5

In [15]:

x = x + 1

In [16]:

x

3

In [17]:

y = x

In [18]:

y

3

In [19]:

y == x

True

Functions

Functions are helpful in that they allow you to write code that can be easily reused. Instead of writing the same block of code multiple times in a program, use a function. Note that in Python, whitespace is critical. Indented lines are considered to be part of the function, while lines that line up with the function definition line are not.

The basic structure is:

def function_name(input_variable1, input_variable2,...):
    code
    return output

In [20]:

def square(n):
    return n*n

In [21]:

square(3)

9

In [22]:

square(5.1)

26.009999999999998

Control Flow: If/Elif/Else and Loops

If/Elif/Else

If/Elif/Else statements are useful whenever we need a program that reacts to some input. Note the importance of whitespace here as well. For example, the absolute value function is defined as $$|x|=\begin{cases}\phantom{-}x & \text{if } x\geq 0\\ -x & \text{otherwise.}\end{cases}$$

The basic structure is:

if comparison:
    code1
else:
    code2

If the comparison is true, code1 executes. Otherwise, code2 executes. If more than two options are needed, use elif with additional comparisons.

if comparison1:
    code1
elif comparison2:
    code2
else:
    code3

In [23]:

x = -3

if x >= 0:
    print(x)
else:
    print(-x)

3
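
The three-branch structure with elif can be sketched with a sign function. The function name sign is chosen here for illustration and does not appear elsewhere in these notes.

```python
def sign(x):
    # Returns 1 for positive inputs, -1 for negative inputs, and 0 for zero
    if x > 0:
        return 1
    elif x < 0:
        return -1
    else:
        return 0

print(sign(-3), sign(0), sign(2.5))
```

Only the first branch whose comparison is true runs; the else branch runs when none of the comparisons hold.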

More on Lists and Strings

Lists and strings are very similar and share many behaviors and operators. Once a list or string is assigned to a variable, we can extract portions of the list or string using indexing. Remember that Python is zero indexed. Negative numbers may be used to select items counting from right to left rather than the usual left to right.

In [24]:

my_list = [1,5,7,3,9]
my_string = 'quick'

In [25]:

my_list[0]

1

In [26]:

my_string[-1]

'k'

Lists and strings allow for slicing, which produces a copy of the list or string containing the subset of elements listed.

The usual notation is:

variable_name[start:stop:step_size]

The slice begins at index start and ends just before index stop, so the element at index stop is not included. Blanks may be used for the start and stop portions to indicate that you want to start from the beginning or go to the very end. A negative step size reverses the direction of the list or string.

In [27]:

my_list[2:4]

[7, 3]

In [28]:

my_string[:3]

'qui'

In [29]:

my_list[::2]

[1, 7, 9]

In [30]:

my_list[::-1]

[9, 3, 7, 5, 1]
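
A short sketch of two points about slicing: the slice is a copy, so changing it leaves the original list alone, and the stop index is excluded even when the step is negative. The names original and piece are illustrative.

```python
original = [1, 5, 7, 3, 9]

piece = original[1:4]   # elements at indices 1, 2, 3
piece[0] = 100          # modifies only the copy

backwards = original[3:0:-1]  # indices 3, 2, 1 (index 0 is excluded)

print(original)   # [1, 5, 7, 3, 9]
print(piece)      # [100, 7, 3]
print(backwards)  # [3, 7, 5]
```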

While Loops

While loops are useful when you need to repeat some procedure but are not sure how many iterations will be needed. In many situations, while and for loops are interchangeable.

The basic structure is:

while comparison:
    code

The code inside the while loop will continue to be executed until the comparison is false. That means it is up to you to update the variables associated with the comparison unless you want a loop that either never runs or runs forever.

In [31]:

n = 1
while n <= 10:
    print(n)
    n += 1
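
As a sketch of a case where the number of iterations is not known ahead of time, the hypothetical function below counts how many times a value must be halved before it drops below a threshold. It assumes the threshold is positive; otherwise the loop would never end.

```python
def count_halvings(value, threshold):
    # Count how many halvings are needed before value < threshold.
    # Assumes threshold > 0 so that the loop eventually stops.
    count = 0
    while value >= threshold:
        value = value / 2   # update the variable used in the comparison
        count += 1
    return count

print(count_halvings(100, 1))  # 7
```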

For Loops

A for loop is useful when you would like to compute something repeatedly based on each element in a list, string, or other iterable.

The basic structure is:

for variable in iterable:
    code

If your iterable is a list of numbers or a string, then we can perform a common operation based on each individual element in the list or string.

In [32]:

for x in [1,2,3,4,5]:
    print(-x)

-1
-2
-3
-4
-5

In [33]:

for c in 'Hello there!':
    print(c)

H
e
l
l
o
 
t
h
e
r
e
!

The following example combines the length function with string indexing. It accomplishes the same thing as the previous loop.

In [34]:

s = 'Hello there!'
for i in range(len(s)):
    print(s[i])

H
e
l
l
o
 
t
h
e
r
e
!

In [35]:

for x in my_list:
    print(1/x)

1.0
0.2
0.14285714285714285
0.3333333333333333
0.1111111111111111

In [36]:

for y in my_string:
    print(y.upper())

Q
U
I
C
K

Compound Interest

The basic formula for compound interest:

$$P'=P\cdot\left(1+\frac{r}{n}\right)+c$$

$P =$ principal

$r =$ annual interest rate

$n =$ number of compoundings per year

$c =$ contribution for principal per compounding period

$P' =$ new principal after one compounding period
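
The update rule can be applied repeatedly with a loop. The sketch below is one possible implementation; the function name compound and the extra argument periods (the number of compounding periods to simulate) are assumptions made for illustration.

```python
def compound(P, r, n, c, periods):
    # Apply P' = P*(1 + r/n) + c once for each compounding period
    for _ in range(periods):
        P = P * (1 + r / n) + c
    return P

# $1000 at 5% annual interest, compounded monthly for one year, no contributions
balance = compound(1000, 0.05, 12, 0, 12)
print(round(balance, 2))  # about 1051.16
```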

Recursion

Recursion involves objects whose definitions refer to themselves.

Recursion is useful when solving a problem whose solution involves first solving a “smaller” version of the same problem.
Recursive definitions involve two key ingredients:
1. A base case – this is an initial case or cases used to compute an answer.
2. An inductive case – this case tells you how to reduce the current case to previous cases.

Below we give a recursive definition of the function that computes the minimum number of moves needed to solve the Tower of Hanoi problem with $n$ discs.

In [37]:

def hanoi(n):
    if n == 1:
        return 1
    else:
        return 2*hanoi(n-1)+1

hanoi(7)

127
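
Unrolling the recurrence suggests the closed form $2^n - 1$, which is the standard result for the Tower of Hanoi. The sketch below repeats the function so that it runs on its own and compares it with the closed form for small $n$.

```python
def hanoi(n):
    # Base case: a single disc takes one move
    if n == 1:
        return 1
    # Inductive case: move n-1 discs, move the largest disc, move n-1 discs again
    return 2 * hanoi(n - 1) + 1

first_values = [hanoi(n) for n in range(1, 6)]
matches_closed_form = all(hanoi(n) == 2**n - 1 for n in range(1, 11))

print(first_values)         # [1, 3, 7, 15, 31]
print(matches_closed_form)  # True
```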

Basic Plotting

We will use the Matplotlib library for basic plotting. Matplotlib is not available by default in Python and must be imported. The general idea is that you issue various commands that modify a plot until you are satisfied with the results. There are a wide variety of options. The recommendation is that you use the Matplotlib website for help. Do not attempt to memorize all of Matplotlib.

In [38]:

import matplotlib.pyplot as plt

x = [1,2,3]
y1 = [3,5,2]
y2 = [-3,2,5]

plt.scatter(x,y1,color='red',label='y_1')
plt.scatter(x,y2,label='y_2')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.title('y vs. x')

Text(0.5, 1.0, 'y vs. x')

List Comprehensions

A list comprehension is a shortcut that allows you to create lists of elements using notation similar to set-builder notation.

The basic structure is:

[expression(var) for var in iterable if condition]

This is equivalent to:

L = []
for var in iterable:
    if condition:
        L.append(expression(var))

The list comprehension below contains the set of odd cubes from 0 to 1000.

In [39]:

[x**3 for x in range(10) if x%2==1]

[1, 27, 125, 343, 729]
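
To check the equivalence between the comprehension and the explicit loop, the sketch below builds the same list both ways. The variable names are illustrative.

```python
# Explicit loop version
odd_cubes_loop = []
for x in range(10):
    if x % 2 == 1:
        odd_cubes_loop.append(x**3)

# List comprehension version
odd_cubes_comp = [x**3 for x in range(10) if x % 2 == 1]

print(odd_cubes_loop == odd_cubes_comp)  # True
```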

Bisection Method

We use the Bisection Method to estimate the roots of a continuous function $f(x)$. This is done in the following way:

1. Select a tolerance level for the error in our approximation.
2. Find real numbers $a < b$ so that $f(a)f(b) < 0$.
3. Compute $c=\frac{a+b}{2}$.
4. If $c-a$ is smaller than the tolerance level for the error, stop and return $c$ as an approximation of a root of $f$.
5. If $f(c)=0$, we have found a root and we stop.
6. If $f(a)f(c) > 0$, set $a = c$. Otherwise set $b=c$. Return to step 3.

In [40]:

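def bisection(a,b,f,err):
    # Assumes a < b and f(a)*f(b) < 0
    c = (a+b)/2
    while c-a > err:
        if f(c) == 0:
            # c is an exact root
            return c
        elif f(c)*f(a) > 0:
            # f has the same sign at a and c, so the root lies in [c, b]
            a = c
        else:
            # the sign changes between a and c, so the root lies in [a, c]
            b = c
        c = (a+b)/2
    return c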

def root2(n):
    return n*n-2

bisection(0,2,root2,0.0001)

1.41424560546875

Newton's Method

Newton's Method is an alternative method for finding roots of differentiable functions. While it may converge faster than the Bisection Method, it is not guaranteed to converge. Newton's Method uses the following recurrence relation to approximate roots of $f(x)$: $$x_{n+1} = x_n-\dfrac{f(x_n)}{f'(x_n)}.$$

In [41]:

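def newton(f,df,x,n,err):
    # f: function, df: its derivative, x: initial guess,
    # n: iteration limit, err: tolerance on |f(x)|
    i = 0
    while abs(f(x)) > err and i <= n:
        x = x - f(x)/df(x)
        i += 1
    # Return False only if the tolerance was still not met when the loop ended
    if abs(f(x)) > err:
        return False
    else:
        return x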
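
A usage sketch for approximating $\sqrt{2}$ follows. It uses a lightly adapted copy of the function so that the block runs on its own. The helpers f and df, the starting guess 1.0, the limit of 20 iterations, and the tolerance are all choices made for illustration.

```python
def newton(f, df, x, n, err):
    # Repeat the Newton update until |f(x)| <= err or the iteration limit is hit
    i = 0
    while abs(f(x)) > err and i <= n:
        x = x - f(x) / df(x)
        i += 1
    # Signal failure if the tolerance was not reached
    if abs(f(x)) > err:
        return False
    return x

def f(x):
    return x * x - 2

def df(x):
    # Derivative of f
    return 2 * x

approx = newton(f, df, 1.0, 20, 1e-10)
print(approx)
```

Note that the update divides by df(x), so a starting guess where the derivative is zero (here x = 0) would raise a ZeroDivisionError.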

Random Number Generation

To generate random numbers, use np.random. The documentation covers most basic needs. Be sure to import the numpy module.

We give a few examples below.

In [42]:

import numpy as np

Generating 10 random coin flips.

In [43]:

np.random.randint(0,2,size=10)

array([0, 0, 1, 1, 0, 1, 1, 0, 1, 0])

Selecting (with replacement) from a desired list.

In [44]:

myList = [1,2,3,5]
np.random.choice(myList,size=20)

array([5, 3, 1, 1, 3, 2, 3, 1, 3, 3, 2, 3, 5, 3, 2, 1, 2, 5, 5, 3])

Shuffling a list.

In [45]:

myList2 = [1,2,3,4,5]
np.random.shuffle(myList2)
myList2

[5, 4, 1, 3, 2]
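
As a sketch of how these tools combine, the block below simulates 1000 coin flips and estimates the fraction of heads. The fixed seed is an assumption added only so that the run is reproducible; the encoding 1 = heads is also a choice made here.

```python
import numpy as np

np.random.seed(0)  # fixed seed, chosen only for reproducibility

flips = np.random.randint(0, 2, size=1000)  # 0 = tails, 1 = heads
fraction_heads = flips.mean()               # proportion of entries equal to 1

print(fraction_heads)
```

With a fair coin the fraction should be close to 0.5, and it typically gets closer as the number of flips grows.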

Image Manipulation

Images are just matrices with several entries encoding the color/transparency of each individual pixel. Below we illustrate some examples of how to import, manipulate, and display images.

In [46]:

img = plt.imread('galaxy.jpg')

img

array([[[10, 15, 19],
        [14, 19, 23],
        [10, 15, 21],
        ...,
        [13, 22, 27],
        [ 8, 18, 20],
        [ 9, 19, 21]],

       [[ 8, 13, 17],
        [10, 18, 21],
        [ 6, 13, 19],
        ...,
        [13, 22, 27],
        [ 8, 17, 22],
        [10, 19, 24]],

       [[ 6, 14, 17],
        [ 7, 15, 18],
        [10, 18, 21],
        ...,
        [ 7, 16, 23],
        [17, 26, 33],
        [14, 23, 30]],

       ...,

       [[ 0,  0,  0],
        [ 0,  0,  0],
        [ 0,  0,  0],
        ...,
        [ 5,  5,  5],
        [ 8,  8,  8],
        [14, 14, 14]],

       [[ 0,  0,  0],
        [ 0,  0,  0],
        [ 0,  0,  0],
        ...,
        [ 2,  2,  2],
        [ 3,  3,  3],
        [ 9,  9,  9]],

       [[ 1,  1,  1],
        [ 1,  1,  1],
        [ 1,  1,  1],
        ...,
        [ 1,  1,  1],
        [ 1,  1,  1],
        [ 3,  3,  3]]], dtype=uint8)

Each entry represents the intensity of the colors Red, Green, and Blue (RGB) in that order with a value from 0 to 255. Beware that other formats and Python modules might give different results.

Images here are numpy arrays and can be modified as such. The example below sets the R values to zero, removing the color red from the image.

In [47]:

imgNoRed = img.copy()
imgNoRed[:,:,0] = 0

In [48]:

plt.imshow(imgNoRed)

<matplotlib.image.AxesImage at 0x7f159551c0d0>

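We can also change the orientation or obtain a mirror image of the image by reversing the order of the rows or columns of the original matrix. Below we reverse the columns, which flips the image left to right.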

In [49]:

imgRev = img[:,::-1,:].copy()
plt.imshow(imgRev)

<matplotlib.image.AxesImage at 0x7f15954f3c10>

We can also change individual values of the matrix to edit the image. Below we add a vertical white line to columns 400 to 499.

In [50]:

imgXLine = img.copy()
imgXLine[:,400:500,:] = [255,255,255]
plt.imshow(imgXLine)

<matplotlib.image.AxesImage at 0x7f159544e910>

Numpy arrays allow for the creation of a mask. A mask is an array of True/False values. When modifying a matrix, the changes will only occur to the entries where the mask is True. In the example below, we create a mask that is True outside a circle of radius 400 centered at the center of the image. So the change only applies to matrix entries more than 400 pixels from the center of the image.

In [51]:

lx, ly, c = img.shape

mask = [[(j-ly/2)**2 + (i-lx/2)**2 > 400**2 for j in range(ly)] for i in range(lx)]

imgCirc = img.copy()
imgCirc[mask] = 0
plt.imshow(imgCirc)

<matplotlib.image.AxesImage at 0x7f15953bdc70>
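
The same kind of mask can be built without nested Python loops by letting numpy broadcast row and column indices. The sketch below is one way to do this. It uses a small synthetic white image in place of the photo, and the names img_demo, img_circ, and radius are assumptions made for illustration.

```python
import numpy as np

# Synthetic 100 x 120 RGB image (all white) standing in for the photo
img_demo = np.full((100, 120, 3), 255, dtype=np.uint8)
lx, ly, c = img_demo.shape

# Open grids of row indices (lx x 1) and column indices (1 x ly);
# arithmetic on them broadcasts to an lx x ly array
i, j = np.ogrid[:lx, :ly]
radius = 30  # hypothetical radius chosen to fit the small demo image
mask = (i - lx/2)**2 + (j - ly/2)**2 > radius**2

img_circ = img_demo.copy()
img_circ[mask] = 0  # black out every pixel outside the circle

print(mask.shape)
print(img_circ[50, 60], img_circ[0, 0])
```

For large images this is usually much faster than a list comprehension, since the comparison is carried out on whole arrays at once.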

Pandas Basics

Pandas is a module that allows us to read Excel and csv files. Pandas allows us to manipulate dataframes (you can think of dataframes as augmented matrices with a lot of available functions and methods) using Python.

In [52]:

import pandas as pd

The file format determines the read command needed. Use pd.read_excel for Excel files and pd.read_csv for csv files.

In [53]:

df = pd.read_excel('SampleData.xlsx')

The shape attribute lets you know the dataframe size. You can obtain the indices (rows) using .index and the column labels using .columns.

In [54]:

df.shape

(43, 7)

The head command gives you the first 5 rows and is usually a good way to check the format and that the information has been read correctly.

In [55]:

df.head()

	OrderDate	Region	Rep	Item	Units	Unit Cost	Total
0	2016-01-06	East	Jones	Pencil	95	1.99	189.05
1	2016-01-23	Central	Kivell	Binder	50	19.99	999.50
2	2016-02-09	Central	Jardine	Pencil	36	4.99	179.64
3	2016-02-26	Central	Gill	Pen	27	19.99	539.73
4	2016-03-15	West	Sorvino	Pencil	56	2.99	167.44

Much like the mask example we saw earlier, we can obtain whichever rows satisfy some condition using comparisons inside of square brackets. For multiple comparisons requiring the use of 'and' or 'or', use & and | rather than the keywords 'and' and 'or', and wrap each comparison in parentheses.

In [56]:

df[df['Units'] > 60]

	OrderDate	Region	Rep	Item	Units	Unit Cost	Total
0	2016-01-06	East	Jones	Pencil	95	1.99	189.05
6	2016-04-18	Central	Andrews	Pencil	75	1.99	149.25
7	2016-05-05	Central	Jardine	Pencil	90	4.99	449.10
10	2016-06-25	Central	Morgan	Pencil	90	4.99	449.10
12	2016-07-29	East	Parent	Binder	81	19.99	1619.19
17	2016-10-22	East	Jones	Pen	64	8.99	575.36
19	2016-11-25	Central	Kivell	Pen Set	96	4.99	479.04
20	2016-12-12	Central	Smith	Pencil	67	1.29	86.43
21	2016-12-29	East	Parent	Pen Set	74	15.99	1183.26
23	2017-02-01	Central	Smith	Binder	87	15.00	1305.00
27	2017-04-10	Central	Andrews	Pencil	66	1.99	131.34
28	2017-04-27	East	Howard	Pen	96	4.99	479.04
30	2017-05-31	Central	Gill	Binder	80	8.99	719.20
32	2017-07-04	East	Jones	Pen Set	62	4.99	309.38
37	2017-09-27	West	Sorvino	Pen	76	1.99	151.24
41	2017-12-04	Central	Jardine	Binder	94	19.99	1879.06

In [57]:

df[(df['Units'] > 60) & (df['Region']=='Central')]

	OrderDate	Region	Rep	Item	Units	Unit Cost	Total
6	2016-04-18	Central	Andrews	Pencil	75	1.99	149.25
7	2016-05-05	Central	Jardine	Pencil	90	4.99	449.10
10	2016-06-25	Central	Morgan	Pencil	90	4.99	449.10
19	2016-11-25	Central	Kivell	Pen Set	96	4.99	479.04
20	2016-12-12	Central	Smith	Pencil	67	1.29	86.43
23	2017-02-01	Central	Smith	Binder	87	15.00	1305.00
27	2017-04-10	Central	Andrews	Pencil	66	1.99	131.34
30	2017-05-31	Central	Gill	Binder	80	8.99	719.20
41	2017-12-04	Central	Jardine	Binder	94	19.99	1879.06
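
The sketch below shows & and | side by side on a small made-up dataframe. The data is hypothetical and is not taken from SampleData.xlsx; only the column names mirror the ones used above.

```python
import pandas as pd

# Hypothetical data for illustration only
demo = pd.DataFrame({
    'Region': ['East', 'Central', 'West', 'Central'],
    'Units': [95, 50, 56, 90],
})

# Rows in the West region OR with more than 80 units
either = demo[(demo['Region'] == 'West') | (demo['Units'] > 80)]

# Rows in the Central region AND with more than 80 units
both = demo[(demo['Region'] == 'Central') & (demo['Units'] > 80)]

print(either)
print(both)
```

The parentheses around each comparison matter because & and | bind more tightly than comparison operators such as > and ==.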

There are several useful methods for dataframes; see the documentation for more details. Here, value_counts() lists the unique entries in a column together with the number of occurrences of each.

In [58]:

df['Rep'].value_counts()

Jones       8
Gill        5
Jardine     5
Kivell      4
Andrews     4
Sorvino     4
Parent      3
Smith       3
Morgan      3
Howard      2
Thompson    2
Name: Rep, dtype: int64

The .loc indexer allows us to obtain slices of the dataframe by row and column labels. This is similar to slicing numpy arrays, except that label-based slices include both endpoints.

In [59]:

df.loc[38:,['Item','Units']]

	Item	Units
38	Binder	57
39	Pencil	14
40	Binder	11
41	Binder	94
42	Binder	28

We can create new columns by simply defining them in terms of some formula.

In [60]:

df['Tax'] = df.Total*0.08

After adding a column, we do a quick check to see if things have worked.

In [61]:

df.head()

	OrderDate	Region	Rep	Item	Units	Unit Cost	Total	Tax
0	2016-01-06	East	Jones	Pencil	95	1.99	189.05	15.1240
1	2016-01-23	Central	Kivell	Binder	50	19.99	999.50	79.9600
2	2016-02-09	Central	Jardine	Pencil	36	4.99	179.64	14.3712
3	2016-02-26	Central	Gill	Pen	27	19.99	539.73	43.1784
4	2016-03-15	West	Sorvino	Pencil	56	2.99	167.44	13.3952