During this lesson, you will learn the following:
A list is a mutable sequence of values.
Flexible arrays, not linked lists
Lists use bracket notation: []
Lists are ordered, indexed by integer offset from start
Lists can have mixed types
Indexing, slicing, concatenation (+), repetition (*), and len() all work the same as for strings
But lists are objects and have their own methods...
Documentation for all list methods here: https://docs.python.org/3/tutorial/datastructures.html
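A minimal sketch of the sequence operations listed above, using a hypothetical list named letters (it is not part of the lesson's own examples):

```python
# Hypothetical example list; the names are illustrative only
letters = ['a', 'b', 'c', 'd']

first = letters[0]            # indexing counts from the start
last = letters[-1]            # negative indices count from the end
middle = letters[1:3]         # slicing returns a new list
joined = letters + ['e']      # concatenation builds a new list
doubled = ['x', 'y'] * 2      # repetition repeats the elements in order
count = len(letters)          # number of elements

print(first, last, middle, joined, doubled, count)
```

None of these operations modifies letters; each one returns a new value.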
a=1
b=2
x=a,b
x
(1, 2)
my_list = ['alpha', 1, 'bravo', 2.0, [1, 2, 3]] # note the list within a list at index 4
my_list
['alpha', 1, 'bravo', 2.0, [1, 2, 3]]
type(my_list)
list
my_list[0:2]
['alpha', 1]
my_list[4][1]
2
Another example
lunch_list = ['cheese', 'apples', 'crackers']
lunch_list
['cheese', 'apples', 'crackers']
lunch_list[1] = 'oranges' # lists are mutable
lunch_list
['cheese', 'oranges', 'crackers']
lunch_list[3] = 'pears' # you cannot assign to the end of a list
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-10-a137d2b1be82> in <module>
----> 1 lunch_list[3] = 'pears' # you cannot assign to the end of a list

IndexError: list assignment index out of range
lunch_list.append('pears') # append() adds to the end
lunch_list
['cheese', 'oranges', 'crackers', 'pears']
lunch_list.append('soda')
lunch_list
['cheese', 'oranges', 'crackers', 'pears', 'soda']
removed_item = lunch_list.pop() # pop() removes last item and returns it
print(removed_item)
print(lunch_list)
soda
['cheese', 'oranges', 'crackers', 'pears']
lunch_list.sort() # sorts the items "in place" (i.e., modifies the list)
lunch_list
['cheese', 'crackers', 'oranges', 'pears']
lunch_list.reverse() # reverses the items in place (i.e., modifies the list)
lunch_list
['pears', 'oranges', 'crackers', 'cheese']
lunch_list.remove('crackers')
lunch_list
['pears', 'oranges', 'cheese']
The range() function
Python has a built-in function called range() that is often used to quickly create a sequence of values over which to iterate.
range() accepts only integer arguments and therefore CANNOT be subdivided decimally. e.g. range(0,10,0.01) -> error
In Python 3, range() returns a range object (a lazy, immutable sequence, not a list) that produces each element on demand as you iterate over it.
There are three ways of creating a range:
range(stop) -- a sequence of values that starts at zero and goes up to (but not including) stop
range(start,stop) -- a sequence of values that starts at start and goes up to (but not including) stop
range(start,stop,step) -- a sequence of values that starts at start and goes up to (but not including) stop, incrementing values according to step
As the Python documentation notes, the advantage of the range type over a regular tuple (or list) is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores the start, stop and step values, calculating individual items and subranges as needed).
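The memory claim can be checked with a short sketch. It assumes CPython, where sys.getsizeof() reports the size of the object itself; the variable names are illustrative only.

```python
import sys

small = range(10)
large = range(10**9)          # a billion values, but nothing is materialised

# Both objects store only start, stop and step, so their sizes match (CPython)
print(sys.getsizeof(small), sys.getsizeof(large))

# A list holding the values grows with its length
print(sys.getsizeof(list(range(10))), sys.getsizeof(list(range(1000))))

# Items and sub-ranges are computed on demand
print(large[123456789])       # indexing works without building the sequence
print(large[10:20])           # slicing a range gives another range
```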
list(range(5))
[0, 1, 2, 3, 4]
range(5)
range(0, 5)
for i in range(5): # this is an indexed for loop, not very Pythonic
    print(i)
for i in range(3,7):
    print(i)
for i in range(3,18,4):
    print(i)
for i in range(0,10,0.1): # What will happen here and why?
    print(i)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-21-0fbd75308984> in <module>
----> 1 for i in range(0,10,0.1): # What will happen here and why?
      2     print(i)

TypeError: 'float' object cannot be interpreted as an integer
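One possible workaround for the float step, sketched here as an assumption rather than as the lesson's own solution, is to iterate over integers and scale each value afterwards. The name tenths is hypothetical.

```python
# Build 0.0, 0.1, ..., 9.9 by iterating over integers and scaling each one
tenths = [i / 10 for i in range(100)]

print(tenths[:5])    # first few values
print(len(tenths))   # number of values produced
```

Dividing an integer counter avoids the rounding drift that comes from adding 0.1 repeatedly.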
The range() function is often combined with the len() function to iterate over the elements of an ordered sequence, with the following pattern...
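A minimal sketch of that range(len(...)) pattern, using a hypothetical list named colors. The built-in enumerate() is shown alongside it as a common alternative that yields the same index/value pairs.

```python
# Hypothetical sequence used only for illustration
colors = ['red', 'green', 'blue']

# Index-based pattern: range(len(colors)) yields 0, 1, 2
for i in range(len(colors)):
    print(i, colors[i])

# enumerate() gives the same index/value pairs without manual indexing
for i, color in enumerate(colors):
    print(i, color)
```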
# imagine you want to create a new list of all the "squares" of these values
L = [-12, -5, -2, 0, 3, 4, 7, 12, 15]
# We begin coding this the BAD "Matlab" way (sorry Gary and Jessie :)). The good Matlab way is squares = L.^2:
# (1) create an empty list
# (2) append the squares value
# Note: there is no index, we are iterating over L directly, which is very Pythonic.
# However, this solution for creating a list lacks elegance.
squares = []
for x in L:
    squares.append(x**2)
squares
[144, 25, 4, 0, 9, 16, 49, 144, 225]
This is a very Matlab way of doing things, but there is another way which is more Pythonic: the list comprehension.
[ f(x) for x in <list> ]
Note that x does not have to be an index (e.g. i being used as a typical loop counter); rather, <list> is a list object (or any other iterable) that the for loop steps through, and x takes the value of each element in turn. There is no index. This is very Pythonic and we'll see many examples.
Official documentation available from: https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions
# Let's begin with something very simple
easy = [x for x in range(10)]
easy
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
easy = [ x**2 for x in range(50) if(x>2 and x<5) ]
easy
[9, 16]
names = [ 'gary', 'bill', 'jesse']
[ len(n ) for n in names]
[4, 4, 5]
# Let's go up one step...
# [expression(variable) for variable in <list>]
tmp = [float(x) for x in range(10)]
tmp
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
L
[-12, -5, -2, 0, 3, 4, 7, 12, 15]
# We can use lists as our iterators too
squares = [ x**2 for x in L ] # the expression multiplies each element by itself;
# x is used as a variable name for the elements
squares
[144, 25, 4, 0, 9, 16, 49, 144, 225]
[x for x in range(41) if(x %3 ==0)]
[0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39]
Fibonacci Sequence as an equation: $\mathrm{int} \left( \left[ \left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n \right]/\sqrt{5} \right)$
where n takes integer values.
Hint: writing the function as a string and then passing it to eval('string') will make this easier.
eqn = "int( ( ((1+5**.5)/2)**n - ((1-5**.5)/2)**n)/5**.5 )"
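How eval() picks up a value for n can be sketched with a simpler, hypothetical expression (deliberately not the Fibonacci formula). Passing the variable in an explicit dictionary is one option that behaves the same inside and outside a comprehension.

```python
# Hypothetical expression, unrelated to the Fibonacci formula
expr = "n**2 + 1"

n = 3
print(eval(expr))   # evaluates the string using the current value of n

# Inside a comprehension, the loop value can be passed in explicitly
values = [eval(expr, {"n": k}) for k in range(5)]
print(values)
```

Only pass strings you wrote yourself to eval(), since it executes whatever expression it is given.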
List comprehension can also accept conditional statements that allow us to create more complicated lists.
Syntax for a simple conditional
[f(x) for x in <list> if <condition>]
# A simple if statement
tmp = [x**2 for x in range(10) if x<5]
tmp
evens = [x for x in L if x % 2 == 0] # we use a boolean expression as the condition to get all even numbers
evens
negatives = [x for x in L if (x < 0)] # we use a boolean expression as the condition to get all negative numbers
negatives
[ x**2 + 4*x + 16 for x in range (10) if(x % 2 ==0)]
[16, 28, 48, 76, 112]
[ str(y) for y in [ x**2 for x in range(11) ] if y%2 ==0]
['0', '4', '16', '36', '64', '100']
List comprehension also accepts if/else statements, but the syntax changes.
Syntax for a complex conditional is a little different:
[f(x) if <condition> else g(x) for x in <list>]
# An if/else statement
[x*2 if x <= 10 else x/2 for x in range(15) ]
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 5.5, 6.0, 6.5, 7.0]
[ x if x%2 ==1 else -x for x in range(20)]
[0, 1, -2, 3, -4, 5, -6, 7, -8, 9, -10, 11, -12, 13, -14, 15, -16, 17, -18, 19]
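The difference between the two conditional forms can be sketched side by side. The list values and the names kept and clamped are hypothetical.

```python
values = [-2, -1, 0, 1, 2]   # hypothetical input

# Trailing "if" filters: elements that fail the test are dropped
kept = [x for x in values if x >= 0]

# Leading "if/else" maps: every element produces exactly one result
clamped = [x if x >= 0 else 0 for x in values]

print(kept)      # shorter than the input
print(clamped)   # same length as the input
```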
Lists are a "reference" data type, and because they are mutable they behave differently from the basic immutable types like float, int, and str.
A list variable name is a reference that "points to" an address in memory that holds the list items/elements
a = [3, 7, 1, 2, -1, 9]
a
[3, 7, 1, 2, -1, 9]
id(a) # returns the object's identity (in CPython, its memory address)
140332748575680
x = [1, 2, 3]
print(x)
id(x)
[1, 2, 3]
140332748616512
y = [4, 5]
print(y)
id(y)
[4, 5]
140332749396800
y = x
print(y)
id(y)
[1, 2, 3]
140332748616512
# now variables x and y refer to the same list!
y.append(9)
print(y)
[1, 2, 3, 9]
print(x)
[1, 2, 3, 9]
Many Matlab users will find this behavior to be very annoying, but this is how many other programming languages work.
Python "thinks" of variables like stickers on boxes in a warehouse. If you slap an "x" sticker on a box and then put a "y" sticker on the same box, then changing the contents through "x" also changes what "y" sees, because both stickers are on the same box.
So how do you copy a variable? There are two ways:
a = b[:] # have we talked about slicing yet?
or
import copy # We'll talk about this one later
a = copy.copy(b)
a = [1,2,3]
print(id(a))
b=a[:] # tells Python to copy the elements of 'a' into a new list 'b'
print(id(b))
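A short sketch contrasting an alias with the two copying approaches. It also shows one caveat worth knowing: both a[:] and copy.copy() make shallow copies, so nested lists are still shared, while copy.deepcopy() copies them too. The variable names are hypothetical.

```python
import copy

original = [1, 2, [10, 20]]       # hypothetical list with a nested list

alias = original                  # second "sticker" on the same box
shallow = original[:]             # new outer list, same inner objects
deep = copy.deepcopy(original)    # new outer list and new inner lists

print(alias is original)          # same object
print(shallow is original)        # a different outer list

original[2].append(30)            # mutate the nested list in place
print(shallow[2])                 # the shallow copy shares the nested list
print(deep[2])                    # the deep copy is unaffected
```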
A two-dimensional (nested) list is a list of references to other lists:
list_name[row][column]
data = [ [1.5, 2.5, 3.5, 3.1],
[5.1, 2.2, 2.1, 4.4],
[4.3, 0.5, 7.2, 3.9] ]
data # to see how it prints
[[1.5, 2.5, 3.5, 3.1], [5.1, 2.2, 2.1, 4.4], [4.3, 0.5, 7.2, 3.9]]
for row in data: # will print one "row" at a time
    print(row)
[1.5, 2.5, 3.5, 3.1]
[5.1, 2.2, 2.1, 4.4]
[4.3, 0.5, 7.2, 3.9]
data[1]
[5.1, 2.2, 2.1, 4.4]
len(data[1])
4
data[1][0] # access second row, first element - remember index begins at 0
5.1
data[0][2] # access first row, third element - remember index begins at 0
3.5
data[2][3]
3.9
cube = [ [ [1, 2], [3, 4] ],
[ [5, 6], [7, 8] ] ]
for i in range(len(cube)):
    for j in range(len(cube[i])):
        for k in range(len(cube[i][j])):
            print('(%d,%d,%d) = %d' % (i,j,k,cube[i][j][k]))
(0,0,0) = 1
(0,0,1) = 2
(0,1,0) = 3
(0,1,1) = 4
(1,0,0) = 5
(1,0,1) = 6
(1,1,0) = 7
(1,1,1) = 8
The syntax is fairly straightforward: just nest one list comprehension inside of another.
[[f(x) for x in <list>] for y in <list>]
[[x for x in range(3)] for y in range(4)]
[[0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2]]
[[outer for inner in range(3)] for outer in range(4) ]
[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
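To see which loop runs where, the second example above can be unrolled into explicit loops. This is only a sketch; the names grid, grid_loops, and row are hypothetical.

```python
# Comprehension form, as in the example above
grid = [[outer for inner in range(3)] for outer in range(4)]

# Equivalent explicit loops: the right-most "for" is the outer loop
grid_loops = []
for outer in range(4):
    row = []                      # one inner list per outer value
    for inner in range(3):
        row.append(outer)
    grid_loops.append(row)

print(grid == grid_loops)
```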
[[0,1,2...],
[1,2,3...],
[2,3,4...],
...
]
i.e.
[[1,2,3...],[11,12,13...]]
Hint: I used the step size in range to do this.
[[],[0],[0,1],[0,1,2],...]
I chose this example to show you that lists do not have to be square like matrices.
Using a for loop and list comprehension, write Python code to complete the triangle
pascal = [[1],[1,1]] # Start with this and build the rest of pascal's triangle
# I am helping you out with this example, you just need to add a little bit of code
for row in range(5):
    l = pascal[-1] # take the last line of pascal
    next_row = [1] + ...Enter your code here... + [1]
    pascal.append(next_row)
Hint: I used an if/else statement
We recall that the syntax to create a nested list was:
[[<function> for inner in <list>] for outer in <list>]
To parse a nested list, the loops stay in the same order; only the expression changes, because it now indexes into the existing list.
[[<lst[outer][inner]> for inner in <list>] for outer in <list>]
Let's try to make some sense of this by simply copying the output of a matrix.
lst = [[0,1],[2,3],[4,5],[6,7],[8,9]]
[[lst[y][x] for x in range(len(lst[0]))] for y in range(len(lst)) ]
Let's continue using the non-Pythonic syntax we used in the nested list example above (i.e. lst[y][x]) and let's add the parsing conditionals from before.
We start with the same syntax as before for parsing a nested list:
[
[<lst[outer][inner]> for inner in <list>]
for outer in <list>
]
But now we add a simple conditional statement.
[
[<lst[outer][inner]> for inner in <list> if <condition>]
for outer in <list>
]
# So let's try to parse this with a simple if statement
# this is same code as above but with an if statement, notice the position of the statement
[[lst[y][x] for x in range(len(lst[0])) if lst[y][x] < 3] for y in range(len(lst)) ]
[[<lst[outer][inner]> if <condition> else <function> for inner in <list>] for outer in <list>]
# So let's try to parse this with a simple if/else statement
# this is same code as above but with an if/else statement, notice the position of the statement
[[lst[y][x] if lst[y][x] < 3 else 'error' for x in range(len(lst[0]))] for y in range(len(lst)) ]
i.e.
[[1,2,3...],[6,7,8...]]
Then parse the matrix and return a 1 if the value is even
i.e.
[[1,2,3],[4,5,6],[7,8,9]]
Square each value of the matrix.
Then parse the matrix and return "even" if the value is even and "odd" if it is odd
Suppose we have a large multi-dimensional matrix and we want to return a list of values that matches our specific criteria.
Traditional programming to do this would look like this:
lst = [[[...],[...]],[[...],[...]],[[...],[...]],[[...],[...]]]
new_lst = []
for one_dim in range(len(lst)):
    for two_dim in range(len(lst[one_dim])):
        for three_dim in range(len(lst[one_dim][two_dim])):
            if lst[one_dim][two_dim][three_dim] <condition>:
                <function>
            else:
                <function>
lst = [[[1,2,3],[1,2,3]],[[2,3,4],[2,3,4]],[[4,5,6],[4,5,6]]]
new_lst = []
for one_dim in range(len(lst)):
    for two_dim in range(len(lst[1])):
        for three_dim in range(len(lst[1][1])):
            if lst[one_dim][two_dim][three_dim] < 4:
                new_lst.append(1)
            else:
                new_lst.append(0)
new_lst
In Python, we can simply chain for loops one after another inside a single list comprehension to iterate through higher dimensions. You can put the new for loops on new lines to make the code easier to read.
new_lst = [
<function>
for one_dim in <list>
for two_dim in <list>
for three_dim in <list>
if <condition>
]
new_lst = [
<function> if <condition> else <function>
for one_dim in <list>
for two_dim in <list>
for three_dim in <list>
...
]
# Example of if conditional parsing to a single dimensional list
new_lst = [
1
for one_dim in range(len(lst)) # I am hard coding each dimension because this is square and to make this easier
for two_dim in range(len(lst[0]))
for three_dim in range(len(lst[0][0]))
if lst[one_dim][two_dim][three_dim] < 4
]
new_lst
# Example of if/else conditional parsing to a single dimensional list
new_lst = [
1 if lst[one_dim][two_dim][three_dim] < 4 else 0
for one_dim in range(len(lst)) # I am hard coding each dimension because this is square and to make this easier
for two_dim in range(len(lst[0]))
for three_dim in range(len(lst[0][0]))
]
new_lst
Let's make this a little more robust to handle lists that might not be square. The way to do this is to reference the iterator of the previous dimension rather than hard coding in a number. This changes the way the code looks a little.
new_lst = [
<function>
for one_dim in lst # one_dim is each first-level list in lst
for two_dim in one_dim # two_dim is each second-level list in each first-level list
for three_dim in two_dim # three_dim is each third-level list/element in each second-level list
...
if <condition>
]
Not only will this code handle non-square matrices, it also doesn't rely on writing "fortran" code (e.g. range(len(lst))), which is very long and usually clutters your code with lots of brackets.
lst = [
[[1,2,3],[1,1,2,3]],
[[2,4],[1,1,1,1,2,3,4]],
[[4,5,6],[4,5,6,8]]
] # not square
new_lst = [
1
for one_dim in lst
for two_dim in one_dim
for three_dim in two_dim
if three_dim < 4 # three_dim is an element now, I don't have to index like before
]
new_lst
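As a cross-check, the chained comprehension can be written out as ordinary nested loops. The for clauses nest left to right, outermost first. This is a sketch on the same ragged data; the names ones and ones_comp are hypothetical.

```python
lst = [
    [[1, 2, 3], [1, 1, 2, 3]],
    [[2, 4], [1, 1, 1, 1, 2, 3, 4]],
    [[4, 5, 6], [4, 5, 6, 8]],
]  # same ragged data as above

# Explicit loops: one nesting level per "for" clause, in the same order
ones = []
for one_dim in lst:
    for two_dim in one_dim:
        for three_dim in two_dim:
            if three_dim < 4:
                ones.append(1)

# Comprehension form for comparison
ones_comp = [1 for one_dim in lst for two_dim in one_dim
             for three_dim in two_dim if three_dim < 4]

print(ones == ones_comp, len(ones))
```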
A tuple is a sequence of values, separated by commas
Supports mixed types (e.g., int, float, str, other tuples)
Ordered (indexed)
Immutable
Tuples are typically used by programmers to hold "stable" values that should not change, such as global constants.
t = 'alpha', 'bravo', 'delta', 'charlie'
t
type(t)
It is common (but not required) to enclose tuples in parentheses (makes it easier to ID tuples in action)
t = ('alpha', 'bravo', 'charlie', 'delta')
t
t1 = ('alpha') # not a tuple
print(t1)
type(t1)
t2 = ('alpha',) # this is a tuple - note comma
print(t2)
type(t2)
Indexing, slicing, len(), and for-loops work as with Strings
t
('alpha', 'bravo', 'charlie', 'delta')
t[0]
'alpha'
t[-1]
t[1:3] # slice the tuple
t[:3]
t[1:]
len(t) # find length
4
for item in t:
    print(item)
t[2] = 'echo'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-69-14bdd1b30066> in <module>
----> 1 t[2] = 'echo'

TypeError: 'tuple' object does not support item assignment
Tuples can store mixed data types
You can store a tuple as an element within a tuple.
You can iterate over the elements of a tuple using a for loop.
m = ('alpha', 123, 3.7, False, (1,2,3)) # note there is a tuple within a tuple
m
('alpha', 123, 3.7, False, (1, 2, 3))
for item in m:
    print(item, type(item))
m[4]
m[4][1] # to access elements of 'inner' tuple
You can assign multiple values with tuples...
x, y, z = 1, 2, 3
x, y, z # display these values
(1, 2, 3)
...which is very useful, especially for swapping values:
x, y, z = y, z, x
x, y, z # display these values
(2, 3, 1)
The following operation is known as "packing" a tuple (we "pack" the values in a tuple container):
x = (1, 2, 3)
x
type(x)
The opposite operation is called "unpacking" a tuple:
a, b, c = x
a, b, c # display these values
type(a), type(b), type(c)
Sometimes we want to unpack some values together. The * symbol serves as a special operator for tuple unpacking. The * identifies where to put "everything else." Observe the following examples:
my_vals = (1, 2, 3, 4, 5)
my_vals
a, b, *c = my_vals
a, b, c # display these values
type(my_vals)
tuple
a, *b, c = my_vals
a, b, c # display these values
(1, [2, 3, 4], 5)
*a, b, c = my_vals
a, b, c # display these values
([1, 2, 3], 4, 5)
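One detail of starred unpacking is easy to miss: the starred name always receives a list, even when the source is a tuple, and that list may be empty. A small sketch with hypothetical data:

```python
record = ('alpha', 'bravo', 1, 2, 3)   # hypothetical data

first, second, *rest = record
print(first, second)     # the first two values are bound individually
print(rest)              # the starred name collects the remainder
print(type(rest))        # a list, even though record is a tuple

# A starred name may also collect nothing at all
head, *tail = (42,)
print(head, tail)
```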
+ operator concatenates tuples
* operator repeats tuples
in operator returns a boolean True if a specified element is present
Tuple comparison uses lexicographical order:
Compare the first elements in each tuple, and if they differ this determines the outcome.
If they are equal, then compare the next two elements and so on, until either sequence is exhausted.
If two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively.
If all items of two sequences compare equal, the sequences are considered equal.
If one sequence is an initial sub-sequence of the other, the shorter sequence is the smaller (lesser) one.
(1, 2) + (3, 4)
(1, 2, 3, 4)
(1, 2) * 3
t = tuple('computation') # tuple() "constructor" converts to a tuple
t
('c', 'o', 'm', 'p', 'u', 't', 'a', 't', 'i', 'o', 'n')
'a' in t
'z' in t
'tat' in t
< > == operators work element by element
(1,2,3) < (1,2,4)
(1,2,3,4) < (1,2,4)
(1,2) < (1,2,-1)
(1, 2, 3) == (1.0, 2.0, 3.0) # comparison is element-by-element, considering values
True
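Lexicographical comparison is what makes a list of tuples sort by its first field and then by its second. A minimal sketch, assuming hypothetical (last name, first name, age) records:

```python
# Hypothetical (last name, first name, age) records
people = [('Smith', 'Zoe', 30), ('Jones', 'Amy', 25), ('Smith', 'Al', 41)]

# list.sort() compares the tuples element by element:
# last names first, then first names to break ties
people.sort()
print(people)

# max() applies the same rule
print(max((2, 0), (1, 99)))   # first elements differ, so they decide
```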
The sorted() function
Python has a built-in function called sorted() that sorts a sequence of values.
A new sequence is returned; the original sequence is unchanged
t # original sequence
('c', 'o', 'm', 'p', 'u', 't', 'a', 't', 'i', 'o', 'n')
sorted(t)
['a', 'c', 'i', 'm', 'n', 'o', 'o', 'p', 't', 't', 'u']
t # original sequence unchanged
('c', 'o', 'm', 'p', 'u', 't', 'a', 't', 'i', 'o', 'n')
The sorted() function has arguments that you can use to control its behavior: https://docs.python.org/3/library/functions.html#sorted
There is also an official HOW TO tutorial on sorting: https://docs.python.org/3/howto/sorting.html#sortinghowto
sorted(t, reverse=True) # sort in reverse order
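Besides reverse, sorted() takes a key argument: a function applied to each element before comparison. A brief sketch with a hypothetical list of words:

```python
words = ['delta', 'Bravo', 'alpha', 'Charlie', 'echo']   # hypothetical input

print(sorted(words))                  # default: uppercase letters sort first
print(sorted(words, key=str.lower))   # key= normalises case before comparing
print(sorted(words, key=len))         # sort by length; ties keep input order
print(sorted(words, key=len, reverse=True))
```

Because the sort is stable, elements with equal keys keep their original relative order, which is why the three five-letter words stay in input order.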
sorted?
t = ('alpha', 'bravo', 'charlie', 'delta')
t
for i in range(len(t)): # same as: for i in (0, 1, 2, 3)
    print(t[i])
A dictionary is created with curly braces {}. You can also use the dict() function.
Complete documentation on dictionaries is available from: https://docs.python.org/3/tutorial/datastructures.html?highlight=dictionary#dictionaries
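The dict() function accepts several input forms. The sketch below builds the same hypothetical dictionary four ways; the names and values are illustrative only.

```python
# Literal form with curly braces
ages_literal = {'ann': 31, 'ben': 27}

# dict() with keyword arguments (keys must be valid identifiers)
ages_kwargs = dict(ann=31, ben=27)

# dict() from a sequence of (key, value) pairs
ages_pairs = dict([('ann', 31), ('ben', 27)])

# dict() from two parallel sequences joined with zip()
ages_zipped = dict(zip(['ann', 'ben'], [31, 27]))

print(ages_literal == ages_kwargs == ages_pairs == ages_zipped)
```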
my_phonebook = {'Bob': '434-2456',
'Jeff': '566-8795',
'Claire': '789-2435'}
my_phonebook
{'Bob': '434-2456', 'Jeff': '566-8795', 'Claire': '789-2435'}
type(my_phonebook)
dict
# the len() function works here too!
len(my_phonebook)
my_phonebook.get('Jeff') # using the get() method of the dictionary object
my_phonebook.get('Dave') # returns None if no entry, therefore it's "safe"
my_phonebook['Jeff'] # using direct access
'566-8795'
my_phonebook['Dave'] # not safe, error if entry does not exist
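Three common ways to cope with a key that may be missing, sketched on a hypothetical copy of the phonebook so the lesson's own dictionary is left untouched:

```python
phonebook = {'Bob': '434-2456', 'Jeff': '566-8795'}   # hypothetical copy

# get() accepts a fallback value to return instead of None
fallback = phonebook.get('Dave', 'unknown')
print(fallback)

# Test membership before using direct access
if 'Dave' in phonebook:
    print(phonebook['Dave'])

# Or catch the KeyError that direct access raises
try:
    number = phonebook['Dave']
except KeyError:
    number = None
print(number)
```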
my_phonebook['Cindy'] = '467-9845' # simply assign a value to a new key
my_phonebook
# you can get an iterable sequence of the keys with the .keys() method
my_phonebook.keys() # returns a view of the keys (iterable, but not a list)
# you can get an iterable sequence of the values with the .values() method
my_phonebook.values() # returns a view of the values (iterable, but not a list)
# you can get an iterable sequence of the key-value pairs with the .items() method
my_phonebook.items()
A for loop can be used to iterate over dictionary keys, values, or items...
# you can explicitly iterate over the keys...
for key in my_phonebook.keys():
    print(key)
# when iterating over the keys,
# you can always get the associated values
for key in my_phonebook.keys():
    print(key, my_phonebook.get(key))
# the default for iterating over a dictionary gives you the keys
# note: we actually didn't need the .keys() method to do this
for key in my_phonebook:
    print(key, my_phonebook.get(key))
# with the items() method you can iterate over keys and values
for key, value in my_phonebook.items():
    print('The phone number for %8s is %s.' % (key, value))
# a dictionary keeps insertion order (Python 3.7+), not sorted order,
# so we often sort when iterating
for key, value in sorted(my_phonebook.items()):
    print('The phone number for %8s is %s.' % (key, value))
'Claire' in my_phonebook # works as expected
'Dave' in my_phonebook # works as expected
'789-2435' in my_phonebook # the 'in' operator works only on keys
my_phonebook['Bob'] = '555-1212' # new phone number for Bob
my_phonebook
del my_phonebook['Cindy']
my_phonebook
popped = my_phonebook.pop('Jeff')
popped
'566-8795'
my_phonebook
{'Bob': '555-1212', 'Claire': '789-2435'}
Dictionaries are objects, and therefore have a host of methods available for use. For documentation on dictionaries: https://docs.python.org/3/tutorial/datastructures.html#dictionaries
a = [x % 2 for x in range(10)]
b = [x % 3 for x in range(10)]
a_set = set(a)
b_set = set(b)
print(a,b)
print(a_set,b_set)
type(a_set)
a_set & b_set # & intersection
a_set | b_set # | union
a_set - b_set # - difference (elements in a_set that are not in b_set)
b_set - a_set
a_set ^ b_set # ^ symmetric difference (xor): elements in exactly one of the sets
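Each set operator has an equivalent named method, which some readers find easier to scan. A sketch using literal sets holding the same values that a_set and b_set have at this point:

```python
a_set = {0, 1}        # same values as set(x % 2 for x in range(10))
b_set = {0, 1, 2}     # same values as set(x % 3 for x in range(10))

# Each operator and its named method give the same result
print(a_set & b_set == a_set.intersection(b_set))
print(a_set | b_set == a_set.union(b_set))
print(b_set - a_set == b_set.difference(a_set))
print(a_set ^ b_set == a_set.symmetric_difference(b_set))

print(b_set - a_set)   # only 2 is in b_set but not in a_set
print(a_set <= b_set)  # a_set is a subset of b_set
```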
Let's play around with adding and removing set items
a_set.add(3)
a_set
Look what happens when we add 2 to a_set
a_set.add(2)
a_set
Note: you cannot index or slice sets the way you can tuples and lists, because sets are unordered.
a_set[1]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-88-6e586b3c6acb> in <module>
----> 1 a_set[1]

TypeError: 'set' object is not subscriptable
Sets can also hold strings.
urban_animals = {'pigeon', 'dog', 'squirrel','cat'} # Set A for examples
urban_animals
urban_animals.add('rat')
urban_animals
urban_animals.remove('squirrel')
urban_animals
'rat' in urban_animals
'alley_cat' in urban_animals
We can also use sets on individual characters of a string
vader = set('dum-dum-dum-dum-ta-dum')
vader
Only check for numbers >50.
primes = 1,2,3,5,7,11,13,17,19,23,29,31,37,41,43,47