Project: MAT 1630
Kernel: Python 3 (system-wide)

Notes

This notebook will contain brief notes on topics covered in the course.

Data Types, Operators, Expressions, and Comparisons

Data Types

There are many different classes of objects in Python. We will not cover all of them here. Below is a short list of some of the basic objects (datatypes) we will commonly use:

  • int: Integers like ..., -2, -1, 0, 1, 2, .... In Python 3 there is no fixed limit on how large these can be, but you must be careful with division: / always returns a float, while // returns the floor of the quotient.

  • float: Floating point numbers have a decimal part like 2.1 and 3.000023. Because the computer uses binary to represent numbers, not every number can be perfectly represented by a float.

  • bool: Boolean variables are either True or False. They typically arise as the result of comparing other values.

  • list: A list is an ordered collection of objects. Lists are typically used when you need to repeat a procedure for each of several values. For example, [1,2,3,5] is a list. Note: lists use square brackets and can contain different data types.

  • str: A string is an ordered sequence of characters, much like a list. For example, 'python' is a string. Note: the int 1 is different from the string '1'.

You can figure out an object's type by using the type() function.

type(1)
int
type(3.2)
float
type(True)
bool
type([1,2,3])
list
type('3')
str
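
Some properties of these data types can be checked directly. The following is a minimal sketch assuming Python 3; the specific values are chosen only for illustration.

```python
import math

# Python 3 integers grow as needed, so large powers are computed exactly.
big = 2**100
print(big)  # 1267650600228229401496703205376

# / always returns a float, while // rounds down to an integer.
print(7 / 2)    # 3.5
print(7 // 2)   # 3
print(-7 // 2)  # -4 (rounds toward negative infinity)

# 0.1 and 0.2 have no exact binary representation, so the sum is slightly off.
total = 0.1 + 0.2
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True

# The integer 1 and the one-character string '1' are different objects.
print(1 == '1')       # False
print(int('1') == 1)  # True
```

When comparing floats, a tolerance-based check such as math.isclose is usually safer than ==.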

Operators and Expressions

Operators allow you to take data types and manipulate their values. The same symbol may do different things depending on the data types used as inputs. Expressions are just how we tell the computer what we would like it to do.

3 + 4
7
3 / 2
1.5
[1,2,3] + [4,5,6]
[1, 2, 3, 4, 5, 6]
'python' + ' ' + 'is great'
'python is great'
3 + 4 * 2
11

Comparisons

Comparisons are expressions that usually evaluate to True or False. We use these when writing code that reacts to a given input.

3 <= 4
True
'Hello' == 'hello'
False

Variables

Variables act as storage or placeholders for values of different data types, which allows us to write abstract code that can handle unknown input values. Variable names are case sensitive and cannot begin with a number. There are a few other rules regarding variable names. As the scope of your project grows, make sure to use long, descriptive variable names to keep the code readable. Note that = is different from ==. In particular, = assigns the value on the right to the variable on the left, while == returns a boolean value when comparing the objects on the right and left.

x = 2
x + 3
5
x = x + 1
x
3
y = x
y
3
y == x
True

Functions

Functions are helpful in that they allow you to write code that can be easily reused. Instead of writing the same block of code multiple times in a program, use a function. Note that in Python, whitespace is critical. Indented lines are considered to be part of the function, while lines that line up with the function definition line are not.

The basic structure is:

def function_name(input_variable1, input_variable2,...):
    code
    return output

def square(n):
    return n*n
square(3)
9
square(5.1)
26.009999999999998

Control Flow: If/Elif/Else and Loops

If/Elif/Else

If/Elif/Else statements are useful whenever we need a program that reacts to some input. Note the importance of whitespace here as well. For example, the absolute value function is defined as $$|x|=\begin{cases}\phantom{-}x & \text{ if $x\geq 0$}\\ -x & \text{ otherwise.}\end{cases}$$

The basic structure is:

if comparison:
    code1
else:
    code2

If the comparison is true, code1 executes. Otherwise, code2 executes. If more than two options are needed, use elif with additional comparisons.

if comparison1:
    code1
elif comparison2:
    code2
else:
    code3

x = -3
if x >= 0:
    print(x)
else:
    print(-x)
3
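
The elif structure can be illustrated with a three-way decision. The sketch below uses a hypothetical helper named sign, which is not part of the course code, to show one branch per case.

```python
def sign(x):
    """Return 1, -1, or 0 according to the sign of x (illustrative helper)."""
    if x > 0:
        return 1
    elif x < 0:
        return -1
    else:
        # Reached only when neither comparison above is True, i.e. x == 0.
        return 0

print(sign(-3))   # -1
print(sign(0))    # 0
print(sign(2.5))  # 1
```

Only the first branch whose comparison is True runs; the remaining branches are skipped.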

More on Lists and Strings

Lists and strings are very similar and share many of the same behaviors and operators. After assigning a list or string to a variable, we can extract portions of the list or string using indexing. Remember that Python is zero indexed. Negative numbers may be used to select items counting from right to left rather than the usual left to right.

my_list = [1,5,7,3,9]
my_string = 'quick'
my_list[0]
1
my_string[-1]
'k'

Lists and strings allow for slicing, which produces a copy of the list or string containing the subset of elements listed.

The usual notation is:

variable_name[start:stop:step_size]

The element at index stop is not included, so to slice through a given index, use that index plus one as stop. Blanks may be used for the start and stop portion to indicate that you want to start from the beginning or go to the very end. A negative step size traverses the list or string from right to left.

my_list[2:4]
[7, 3]
my_string[:3]
'qui'
my_list[::2]
[1, 7, 9]
my_list[::-1]
[9, 3, 7, 5, 1]
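
A few more slices, using the same my_list and my_string as above, may help make the role of the stop index concrete. This is only a sketch; the variable name copy is introduced here for illustration.

```python
my_list = [1, 5, 7, 3, 9]
my_string = 'quick'

# The stop index is excluded: indices 1, 2, and 3 are selected.
print(my_list[1:4])     # [5, 7, 3]

# Negative indices count from the right; this drops the last character.
print(my_string[:-1])   # quic

# With a negative step, start should lie to the right of stop.
print(my_list[3:0:-1])  # [3, 7, 5]

# A slice is a new object, so changing it leaves the original alone.
copy = my_list[:]
copy[0] = 100
print(my_list[0])       # 1
```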

While Loops

While loops are useful when you need to repeat some procedure but are not sure how many iterations will be needed. In many situations, while and for loops are interchangeable.

The basic structure is:

while comparison:
    code

The code inside the while loop will continue to be executed until the comparison is false. That means it is up to you to update the variables associated with the comparison unless you want a loop that either never runs or runs forever.

n = 1
while n <= 10:
    print(n)
    n += 1
1 2 3 4 5 6 7 8 9 10
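
A while loop is a natural fit when the number of iterations is not known in advance. The sketch below finds the smallest power of 2 that exceeds a target; the names target, power, and exponent are illustrative and not part of the course code.

```python
# Find the smallest power of 2 that exceeds a target value.
target = 1000  # illustrative value
power = 1
exponent = 0
while power <= target:
    power *= 2      # update the variable used in the comparison
    exponent += 1

print(exponent, power)  # 10 1024
```

Without the update to power inside the loop, the comparison would never become False and the loop would run forever.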

For Loops

A for loop is useful when you would like to compute something repeatedly based on each element in a list, string, or other iterable.

The basic structure is:

for variable in iterable:
    code

If your iterable is a list of numbers or a string, then we can perform a common operation based on each individual element in the list or string.

for x in [1,2,3,4,5]:
    print(-x)
-1 -2 -3 -4 -5
for c in 'Hello there!':
    print(c)
H e l l o t h e r e !

The following example combines the length function with string indexing. It accomplishes the same thing as the previous loop.

s = 'Hello there!'
for i in range(len(s)):
    print(s[i])
H e l l o t h e r e !
for x in my_list:
    print(1/x)
1.0 0.2 0.14285714285714285 0.3333333333333333 0.1111111111111111
for y in my_string:
    print(y.upper())
Q U I C K

Compound Interest

The basic formula for compound interest:

$$P'=P\cdot\left(1+\frac{r}{n}\right)+c$$

$P$ = principal

$r$ = annual interest rate

$n$ = number of compoundings per year

$c$ = contribution for principal per compounding period

$P'$ = new principal after one compounding period
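
The formula describes a single compounding period, so applying it repeatedly gives the balance after several years. Below is a minimal sketch; the function name compound and the parameter years are assumptions introduced for illustration.

```python
def compound(P, r, n, c, years):
    """Apply P' = P*(1 + r/n) + c once per compounding period.

    years is a hypothetical parameter: the number of whole years to simulate.
    """
    for _ in range(n * years):
        P = P * (1 + r / n) + c
    return P

# Illustrative values: 1000 at 5% compounded annually, no contributions.
print(round(compound(1000, 0.05, 1, 0, 2), 2))  # 1102.5

# With no interest, twelve monthly contributions of 100 simply add up.
print(compound(0, 0, 12, 100, 1))  # 1200.0
```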

Recursion

Recursion involves objects whose definitions refer to themselves.

  • Recursion is useful when solving a problem whose solution involves first solving a “smaller” version of the same problem.

  • Recursive definitions involve two key ingredients:

    1. A base case – this is an initial case or cases used to compute an answer.

    2. An inductive case – this case tells you how to reduce the current case to previous cases.

Below we give a recursive definition of the function that computes the minimum number of moves needed to solve the Tower of Hanoi problem with $n$ discs.

def hanoi(n):
    if n == 1:
        return 1
    else:
        return 2*hanoi(n-1)+1

hanoi(7)
127
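
The recursive definition can be compared against the well-known closed form $2^n-1$ for the minimum number of moves. The check below is a sketch that restates hanoi so the block runs on its own; the range of n tested is arbitrary.

```python
def hanoi(n):
    if n == 1:                   # base case
        return 1
    return 2 * hanoi(n - 1) + 1  # inductive case: reduce to n-1 discs

# Compare the recursion with the closed form 2**n - 1 for small n.
for n in range(1, 11):
    assert hanoi(n) == 2**n - 1

print(hanoi(10))  # 1023
```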

Basic Plotting

We will use the Matplotlib library for basic plotting. Matplotlib is not part of the Python standard library, so it must be installed and then imported before use. The general idea is that you issue various commands that modify a plot until you are satisfied with the results. There are a wide variety of options. The recommendation is that you use the Matplotlib website for help. Do not attempt to memorize all of Matplotlib.

import matplotlib.pyplot as plt

x = [1,2,3]
y1 = [3,5,2]
y2 = [-3,2,5]
plt.scatter(x,y1,color='red',label='y_1')
plt.scatter(x,y2,label='y_2')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.title('y vs. x')
Text(0.5, 1.0, 'y vs. x')
Image in a Jupyter notebook

List Comprehensions

A list comprehension is a shortcut that allows you to create lists of elements using notation similar to set-builder notation.

The basic structure is:

[expression(var) for var in iterable if condition]

This is equivalent to:

L = []
for var in iterable:
    if condition:
        L.append(expression(var))

The list comprehension below produces the list of odd perfect cubes between 0 and 1000.

[x**3 for x in range(10) if x%2==1]
[1, 27, 125, 343, 729]
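
The equivalence between the comprehension and the explicit loop can be checked directly. In this sketch the names cubes_loop, cubes_comp, and squares are illustrative.

```python
# Loop version: build the list one element at a time.
cubes_loop = []
for x in range(10):
    if x % 2 == 1:
        cubes_loop.append(x**3)

# Comprehension version: same expression, iterable, and condition.
cubes_comp = [x**3 for x in range(10) if x % 2 == 1]
print(cubes_loop == cubes_comp)  # True

# The condition is optional; without it every element is used.
squares = [x**2 for x in range(5)]
print(squares)  # [0, 1, 4, 9, 16]
```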

Bisection Method

We use the Bisection Method to estimate the roots of a continuous function $f(x)$. This is done in the following way:

  1. Select a tolerance level for the error in our approximation.

  2. Find real numbers $a < b$ so that $f(a)f(b) < 0$.

  3. Compute $c=\frac{a+b}{2}$.

  4. If $c-a$ is smaller than the tolerance level for the error, stop and return $c$ as an approximation of a root of $f$.

  5. If $f(c)=0$, we have found a root and we stop.

  6. If $f(a)f(c) > 0$, set $a = c$. Otherwise set $b=c$. Then return to step 3.

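def bisection(a,b,f,err):
    c = (a+b)/2
    while c-a > err:
        if f(c) == 0:
            return c  # c is an exact root
        elif f(c)*f(a) > 0:
            a = c  # f(a) and f(c) share a sign, so the root lies in [c, b]
        else:
            b = c  # the sign change, and hence the root, lies in [a, c]
        c = (a+b)/2
    return c

def root2(n):
    return n*n-2

bisection(0,2,root2,0.0001)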
1.41424560546875

Newton's Method

Newton's Method is an alternative method for finding roots of differentiable functions. While it may converge faster than the Bisection Method, it is not guaranteed to converge. Newton's Method uses the following recurrence relation to approximate roots of $f(x)$: $$x_{n+1} = x_n-\dfrac{f(x_n)}{f'(x_n)}.$$

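def newton(f,df,x,n,err):
    i = 0
    # iterate until |f(x)| is within err or the iteration limit is exceeded
    while abs(f(x)) > err and i <= n:
        x = x - f(x)/df(x)
        i += 1
    if i > n:
        return False  # the iteration limit was exceeded
    else:
        return x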
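
As a usage sketch, the function can be applied to $f(x)=x^2-2$, whose positive root is $\sqrt{2}$. The block restates newton so that it runs on its own; the helper names f and df, the starting point 1.0, and the tolerances are illustrative choices.

```python
def newton(f, df, x, n, err):
    """Newton's Method as defined above: stop once |f(x)| <= err or the step limit is exceeded."""
    i = 0
    while abs(f(x)) > err and i <= n:
        x = x - f(x) / df(x)
        i += 1
    if i > n:
        return False
    else:
        return x

def f(x):
    return x * x - 2

def df(x):
    return 2 * x  # derivative of f

root = newton(f, df, 1.0, 20, 1e-10)
print(root)  # approximately 1.41421356

# With too few iterations allowed, the function reports failure instead.
print(newton(f, df, 1.0, 1, 1e-10))  # False
```

Note that the update divides by df(x), so a starting point where the derivative is zero would raise a ZeroDivisionError.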

Random Number Generation

To generate random numbers, use np.random. The documentation covers most basic needs. Be sure to import the numpy module.

We give a few examples below.

import numpy as np

Generating 10 random coin flips.

np.random.randint(0,2,size=10)
array([0, 0, 1, 1, 0, 1, 1, 0, 1, 0])

Selecting (with replacement) from a desired list.

myList = [1,2,3,5]
np.random.choice(myList,size=20)
array([5, 3, 1, 1, 3, 2, 3, 1, 3, 3, 2, 3, 5, 3, 2, 1, 2, 5, 5, 3])

Shuffling a list.

myList2 = [1,2,3,4,5]
np.random.shuffle(myList2)
myList2
[5, 4, 1, 3, 2]

Image Manipulation

Images are just matrices in which each pixel has several entries encoding its color/transparency. Below we illustrate some examples of how to import, manipulate, and display images.

img = plt.imread('galaxy.jpg')
img
array([[[10, 15, 19], [14, 19, 23], [10, 15, 21], ..., [13, 22, 27], [ 8, 18, 20], [ 9, 19, 21]], [[ 8, 13, 17], [10, 18, 21], [ 6, 13, 19], ..., [13, 22, 27], [ 8, 17, 22], [10, 19, 24]], [[ 6, 14, 17], [ 7, 15, 18], [10, 18, 21], ..., [ 7, 16, 23], [17, 26, 33], [14, 23, 30]], ..., [[ 0, 0, 0], [ 0, 0, 0], [ 0, 0, 0], ..., [ 5, 5, 5], [ 8, 8, 8], [14, 14, 14]], [[ 0, 0, 0], [ 0, 0, 0], [ 0, 0, 0], ..., [ 2, 2, 2], [ 3, 3, 3], [ 9, 9, 9]], [[ 1, 1, 1], [ 1, 1, 1], [ 1, 1, 1], ..., [ 1, 1, 1], [ 1, 1, 1], [ 3, 3, 3]]], dtype=uint8)

Each entry represents the intensity of the colors Red, Green, and Blue (RGB) in that order with a value from 0 to 255. Beware that other formats and Python modules might give different results.

Images here are numpy arrays and can be modified as such. The example below sets the R values to zero, removing the color red from the image.

imgNoRed = img.copy()
imgNoRed[:,:,0] = 0
plt.imshow(imgNoRed)
<matplotlib.image.AxesImage at 0x7f159551c0d0>
Image in a Jupyter notebook

We can also change the orientation or obtain a mirror image of the image by reversing the order of the rows or columns of the original matrix.

imgRev = img[:,::-1,:]  # reverse the column order to mirror the image left to right
plt.imshow(imgRev)
<matplotlib.image.AxesImage at 0x7f15954f3c10>
Image in a Jupyter notebook

We can also change individual values of the matrix to edit the image. Below we add a vertical white band covering columns 400 to 499.

imgXLine = img.copy()
imgXLine[:,400:500,:] = [255,255,255]
plt.imshow(imgXLine)
<matplotlib.image.AxesImage at 0x7f159544e910>
Image in a Jupyter notebook

Numpy arrays allow for the creation of a mask. A mask is an array of True/False values. When modifying a matrix, the changes will only occur to the entries where the mask is True. In the example below, we create a mask that is True outside a circle of radius 400 centered at the center of the image. So the changes only apply to matrix entries more than 400 pixels from the center of the image.

lx, ly, c = img.shape
mask = [[(j-ly/2)**2 + (i-lx/2)**2 > 400**2 for j in range(ly)] for i in range(lx)]
imgCirc = img.copy()
imgCirc[mask] = 0
plt.imshow(imgCirc)
<matplotlib.image.AxesImage at 0x7f15953bdc70>
Image in a Jupyter notebook
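
The circular mask can also be built without nested loops. The sketch below is an alternative, not the method used above: it uses np.ogrid to produce the row and column indices and checks that the result agrees with the nested-comprehension mask. The small all-white array and the radius of 3 stand in for the galaxy image and are purely illustrative.

```python
import numpy as np

# A small all-white stand-in image (9x9 pixels, 3 color channels); illustrative only.
img = np.full((9, 9, 3), 255, dtype=np.uint8)
lx, ly, c = img.shape
radius = 3

# Nested-comprehension mask, built the same way as in the example above.
mask_list = [[(j - ly/2)**2 + (i - lx/2)**2 > radius**2 for j in range(ly)]
             for i in range(lx)]

# Vectorized alternative: np.ogrid gives a column of row indices and a row of column indices.
i, j = np.ogrid[:lx, :ly]
mask = (j - ly/2)**2 + (i - lx/2)**2 > radius**2

print(np.array_equal(mask, np.array(mask_list)))  # True

imgCirc = img.copy()
imgCirc[mask] = 0  # pixels outside the circle become black
print(imgCirc[0, 0], imgCirc[4, 4])  # [0 0 0] [255 255 255]
```

For large images the vectorized version is typically much faster, since the comparison runs inside numpy rather than in a Python loop.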

Pandas Basics

Pandas is a module that allows us to read Excel and csv files. Pandas allows us to manipulate dataframes (you can think of dataframes as augmented matrices with a lot of available functions and methods) using Python.

import pandas as pd

The file format determines the read command needed. Use pd.read_excel for Excel files and pd.read_csv for csv files.

df = pd.read_excel('SampleData.xlsx')

The shape attribute lets you know the dataframe size. You can obtain the indices (rows) using .index and the column labels using .columns.

df.shape
(43, 7)

The head command gives you the first 5 rows and is usually a good way to check the format and that the information has been read correctly.

df.head()
    OrderDate   Region   Rep       Item     Units  Unit Cost    Total
 0  2016-01-06  East     Jones     Pencil      95       1.99   189.05
 1  2016-01-23  Central  Kivell    Binder      50      19.99   999.50
 2  2016-02-09  Central  Jardine   Pencil      36       4.99   179.64
 3  2016-02-26  Central  Gill      Pen         27      19.99   539.73
 4  2016-03-15  West     Sorvino   Pencil      56       2.99   167.44

Much like the mask example we saw earlier, we can obtain whatever rows satisfy some condition using comparisons inside of square brackets. For multiple comparisons requiring the use of 'and' or 'or', use & and | rather than the keywords 'and' and 'or', and wrap each comparison in parentheses.

df[df['Units'] > 60]
    OrderDate   Region   Rep       Item     Units  Unit Cost    Total
 0  2016-01-06  East     Jones     Pencil      95       1.99   189.05
 6  2016-04-18  Central  Andrews   Pencil      75       1.99   149.25
 7  2016-05-05  Central  Jardine   Pencil      90       4.99   449.10
10  2016-06-25  Central  Morgan    Pencil      90       4.99   449.10
12  2016-07-29  East     Parent    Binder      81      19.99  1619.19
17  2016-10-22  East     Jones     Pen         64       8.99   575.36
19  2016-11-25  Central  Kivell    Pen Set     96       4.99   479.04
20  2016-12-12  Central  Smith     Pencil      67       1.29    86.43
21  2016-12-29  East     Parent    Pen Set     74      15.99  1183.26
23  2017-02-01  Central  Smith     Binder      87      15.00  1305.00
27  2017-04-10  Central  Andrews   Pencil      66       1.99   131.34
28  2017-04-27  East     Howard    Pen         96       4.99   479.04
30  2017-05-31  Central  Gill      Binder      80       8.99   719.20
32  2017-07-04  East     Jones     Pen Set     62       4.99   309.38
37  2017-09-27  West     Sorvino   Pen         76       1.99   151.24
41  2017-12-04  Central  Jardine   Binder      94      19.99  1879.06
df[(df['Units'] > 60) & (df['Region']=='Central')]
    OrderDate   Region   Rep       Item     Units  Unit Cost    Total
 6  2016-04-18  Central  Andrews   Pencil      75       1.99   149.25
 7  2016-05-05  Central  Jardine   Pencil      90       4.99   449.10
10  2016-06-25  Central  Morgan    Pencil      90       4.99   449.10
19  2016-11-25  Central  Kivell    Pen Set     96       4.99   479.04
20  2016-12-12  Central  Smith     Pencil      67       1.29    86.43
23  2017-02-01  Central  Smith     Binder      87      15.00  1305.00
27  2017-04-10  Central  Andrews   Pencil      66       1.99   131.34
30  2017-05-31  Central  Gill      Binder      80       8.99   719.20
41  2017-12-04  Central  Jardine   Binder      94      19.99  1879.06

There are several useful methods for dataframes; see the documentation for more details. Here, value_counts() lists the unique entries in a column together with the number of instances of each.

df['Rep'].value_counts()
Jones       8
Gill        5
Jardine     5
Kivell      4
Andrews     4
Sorvino     4
Parent      3
Smith       3
Morgan      3
Howard      2
Thompson    2
Name: Rep, dtype: int64

The .loc indexer allows us to obtain slices of the dataframe by label. This is similar to slicing numpy arrays, except that the endpoint of a .loc slice is included.

df.loc[38:,['Item','Units']]
    Item     Units
38  Binder      57
39  Pencil      14
40  Binder      11
41  Binder      94
42  Binder      28
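
The row-selection ideas in this section, comparisons combined with & and | and label-based slicing with .loc, can be tried on a small dataframe. The sketch below builds a tiny hypothetical dataframe named small; it is not the SampleData.xlsx file, and its values are made up for illustration.

```python
import pandas as pd

# A tiny hypothetical dataframe (not the SampleData.xlsx file used above).
small = pd.DataFrame({
    'Region': ['East', 'Central', 'Central', 'West'],
    'Units': [95, 50, 75, 56],
})

# Each comparison needs its own parentheses because & and | bind tighter than > and ==.
both = small[(small['Units'] > 60) & (small['Region'] == 'Central')]
either = small[(small['Units'] > 60) | (small['Region'] == 'West')]
print(len(both), len(either))  # 1 3

# .loc slices by label and includes the endpoint, unlike list slicing.
print(len(small.loc[1:2]))     # 2
```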

We can create new columns by simply defining them in terms of some formula.

df['Tax'] = df.Total*0.08

After adding a column, we do a quick check to see if things have worked.

df.head()
    OrderDate   Region   Rep       Item     Units  Unit Cost    Total      Tax
 0  2016-01-06  East     Jones     Pencil      95       1.99   189.05  15.1240
 1  2016-01-23  Central  Kivell    Binder      50      19.99   999.50  79.9600
 2  2016-02-09  Central  Jardine   Pencil      36       4.99   179.64  14.3712
 3  2016-02-26  Central  Gill      Pen         27      19.99   539.73  43.1784
 4  2016-03-15  West     Sorvino   Pencil      56       2.99   167.44  13.3952