"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"# More Python basics\n",
"**Camilo A. Garcia Trillos - 2020**\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"## In this notebook\n",
"\n",
"- we look at how to define and test *functions*, and how to think in terms of error management.\n",
"- we discuss Python packages and start looking at some notable Python packages or libraries."
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"\n",
"## Function definition\n",
"\n",
"Python code is better structured by defining functions.\n",
"\n",
"The basic syntaxis is *def name_function: --- return(---)*\n",
"\n",
"Let us look at a first example: we will create a function that establishes if two integers are relative primes (i.e., that their maximum common divisor is 1). We will create several iterations of the function.\n",
"\n",
"Let us start with a first simple implementation:\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 1,
"metadata": {
},
"output_type": "execute_result"
}
],
"source": [
"def rel_primes(x,y):\n",
" # We can include a description of the function using a string immediately below the function\n",
" ''' Receives two numbers and returns True if both are relative \n",
" primes or False otherwise\n",
" \n",
" '''\n",
" \n",
" z = min(x,y) # Assign to z the minimum from a and b\n",
" for i in range(2,z) : # run over all numbers between 2 and z (!)\n",
" if (x%i)==0 and (y%i==0): # if a number divides both a, b (i.e. the residual in both cases is zero) \n",
" # they are not relative primes ...\n",
" return False # ... in this case, return is False. This gets the flow out of the function\n",
" return True # finally, if the program gets to this point, a,b are relative primes \n",
"\n",
"# Note that the following line of code is outside the scope of the function\n",
"rel_primes(18,123) # This should be false, as 3 divides both\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Check with other cases that the function works as it should. \n",
"\n",
"In Python, when the code within a function is executed, a new 'environment' is created. Every object/function that is defined within a function only lives within the function. So for example, tha variables x,y,z above are not accessible in the following line"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"print(x,y,z) # this generates an error"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"However, functions *can* access variables and functions defined outside themselves. This is useful (as will be seen further below), but is sometimes a source of confusion (particualrly regarding variables). "
]
},
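{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"The following is a minimal sketch of this behaviour (the names *rate*, *discount* and *discount_shadow* are hypothetical and only used for illustration). A function can read a variable defined outside it, but assigning to a name inside the function creates a new local variable and leaves the outer one unchanged."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"rate = 0.05 # variable defined outside any function\n",
"\n",
"def discount(amount):\n",
"    # 'rate' is not defined inside the function, so Python looks it up outside\n",
"    return amount/(1+rate)\n",
"\n",
"def discount_shadow(amount):\n",
"    rate = 0.10 # this creates a new local variable; the outer 'rate' is untouched\n",
"    return amount/(1+rate)\n",
"\n",
"print(discount(105.0))\n",
"print(discount_shadow(110.0))\n",
"print(rate) # still the value defined outside the functions"
]
},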
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"In the above definition of 'rel_prime, apart from the funcional part, we included a help description. At any point this information can be retrieved with *?* after the name of a function."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"rel_primes?"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"What happens when we test the function with values that are not integers?"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"rel_primes('a',1) # this gives an error"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"ename": "TypeError",
"evalue": "'float' object cannot be interpreted as an integer",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mrel_primes\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1.2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1.4\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m\u001b[0m in \u001b[0;36mrel_primes\u001b[0;34m(x, y)\u001b[0m\n\u001b[1;32m 7\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 8\u001b[0m \u001b[0mz\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# Assign to z the minimum from a and b\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 9\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mz\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m:\u001b[0m \u001b[0;31m# run over all numbers between 2 and z (!)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 10\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m%\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m==\u001b[0m\u001b[0;36m0\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m%\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m==\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;31m# if a number divides both a, b (i.e. the residual in both cases is zero)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 11\u001b[0m \u001b[0;31m# they are not relative primes ...\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mTypeError\u001b[0m: 'float' object cannot be interpreted as an integer"
]
}
],
"source": [
"rel_primes(1.2,1.4)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 3,
"metadata": {
},
"output_type": "execute_result"
}
],
"source": [
"rel_primes(3,4.2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"\n",
"### Error management"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Note that sometimes we get an error and sometimes we do not. Moreover, the error is not very informative of what happened. We can create our own error messages. In what follows we create a function that 'wraps' the previous one, while providing some error management."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"This one works: 18 and 123 are both integers (and not relative primes)\n"
]
},
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 4,
"metadata": {
},
"output_type": "execute_result"
}
],
"source": [
"def rel_primes2(a,b):\n",
" \n",
" # First we check if the inputs have the right type. Recall the type function we look at on the first notebook.\n",
" \n",
" if type(a)!=int:\n",
" raise TypeError(\"Both numbers must be integers\")\n",
"\n",
" if type(b)!=int:\n",
" raise TypeError(\"Both numbers must be integers\")\n",
" \n",
" # If no error is raised up to here, we call the original function.\n",
" \n",
" return rel_primes(a,b) #Note that we can call functions we have defined before\n",
"\n",
"print('This one works: 18 and 123 are both integers (and not relative primes)')\n",
"rel_primes2(18,123)\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"ename": "TypeError",
"evalue": "Both numbers must be integers",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mrel_primes2\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'a'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m123\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# Here we raise an error as we defined it\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m\u001b[0m in \u001b[0;36mrel_primes2\u001b[0;34m(a, b)\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m!=\u001b[0m\u001b[0mint\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mTypeError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Both numbers must be integers\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 7\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 8\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mb\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m!=\u001b[0m\u001b[0mint\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mTypeError\u001b[0m: Both numbers must be integers"
]
}
],
"source": [
"rel_primes2('a',123) # Here we raise an error as we defined it"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"You can check that errors are raised in other cases (for example if you provide complex numbers or floats).\n",
"\n",
"The other very common type of error to be raised is ValueError (i.e. raise ValueError(...)). This means that a given input has a value outside the accepted domain.\n",
"\n",
"It is possible also to check if a call to a function produces an error using the keword 'try', so taht any error can be managed by the programmer. This can be useful when one wants to allow "
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"There would be an error because 1.2 is not integer.\n",
"But with try the error is caught here. We can then assign the value we want to p, for example -1\n"
]
},
{
"data": {
"text/plain": [
"-1"
]
},
"execution_count": 1,
"metadata": {
},
"output_type": "execute_result"
}
],
"source": [
"a=1.2\n",
"try:\n",
" p=rel_primes2(a,1)\n",
"except:\n",
" print('There would be an error because ',a,' is not integer.')\n",
" print('But with try the error is caught here. We can then assign the value we want to p, for example -1')\n",
" p=-1\n",
" \n",
"p"
]
},
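{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"As a complement, here is a hedged sketch that combines both error types. The function *rel_primes3* is a hypothetical extension, not part of the functions defined above: it assumes that only strictly positive integers are acceptable, and it uses *math.gcd* from the standard library instead of an explicit loop. Listing one *except* clause per error type lets us react differently to each error."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"import math\n",
"\n",
"def rel_primes3(a,b):\n",
"    # Hypothetical variant: checks the type first and then the domain of the inputs\n",
"    if type(a)!=int or type(b)!=int:\n",
"        raise TypeError('Both numbers must be integers')\n",
"    if a<=0 or b<=0:\n",
"        raise ValueError('Both numbers must be strictly positive')\n",
"    return math.gcd(a,b)==1\n",
"\n",
"for pair in [(15,28), (1.2,3), (-4,6)]:\n",
"    try:\n",
"        print(pair, rel_primes3(*pair))\n",
"    except TypeError as err:\n",
"        print(pair, 'wrong type:', err)\n",
"    except ValueError as err:\n",
"        print(pair, 'wrong value:', err)"
]
},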
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"\n",
"### Testing functions\n",
"\n",
"An important part of coding is testing. It entails designing a sequence of checks to evaluate the behaviour of a function.\n",
"\n",
"The statement *assert* might be very useful for this purpose. It raises an error if a result is False.\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"#Some basic testing\n",
"\n",
"assert rel_primes2(15,28), 'Failed with two large relative primes'\n",
"assert rel_primes2(2,3), 'Failed with two small relative primes'\n",
"assert not rel_primes2(15,25), 'Failed with two small numbers that are not relative primes'\n",
"assert not rel_primes2(2,4), 'Failed with two small numbers that are not relative primes'"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Note that we performed 4 tests. The last one fails and illustrates that the function defined above is not working properly. When the test fails, the associated fail message is displayed.\n",
"\n",
"We proceed to fix the error (which is located on the rel_prime function). Run the code below and then run the tests again."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"def rel_primes(a,b):\n",
" # This is a corrected version\n",
" ''' Receives two numbers and returns True if both are relative \n",
" primes or False otherwise\n",
" \n",
" '''\n",
" \n",
" for i in range(2,min(a,b)+1) : # The error was here\n",
" if (a%i)==0 and (b%i==0): \n",
" return False \n",
" return True "
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"What happened? We have fixed the error on the rel_primes function (we were not including the las element in the cycle). Since the function rel_primes2 calls rel_prime, this one gets fixed as well. This helps it pass all the tests.\n",
"\n",
"Note we have learnt something very important in addition: in jupyter, code depends on the order of *execution*, not the order in which the code is written.\n",
"\n",
"\n",
"**Remark:** In more professional settings, the preferred form of testing is via unittesting. If you want to learn more about it, read the [Python documentation on unittest](https://docs.python.org/3/library/unittest.html)\n"
]
},
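{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"As an illustration only, tests like the ones above could be organised with *unittest* roughly as follows. The class name *TestCoprime* and the helper *coprime* (written here with *math.gcd* so that the cell runs on its own) are hypothetical. The arguments passed to *unittest.main* are a common way to run tests from within a notebook without stopping the kernel."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"import math\n",
"import unittest\n",
"\n",
"def coprime(a,b):\n",
"    # Stand-alone helper for this sketch: two integers are relative primes if their gcd is 1\n",
"    return math.gcd(a,b)==1\n",
"\n",
"class TestCoprime(unittest.TestCase):\n",
"\n",
"    def test_relative_primes(self):\n",
"        self.assertTrue(coprime(15,28))\n",
"        self.assertTrue(coprime(2,3))\n",
"\n",
"    def test_not_relative_primes(self):\n",
"        self.assertFalse(coprime(15,25))\n",
"        self.assertFalse(coprime(2,4))\n",
"\n",
"# argv avoids reading the notebook's own command line arguments; exit=False keeps the kernel alive\n",
"unittest.main(argv=['ignored'], exit=False)"
]
},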
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"\n",
"### Lambda functions\n",
"\n",
"We had seen that to define a function in Python, we use the command def followed by the name of the function, arguments and colon. There is an alternative in the form of lambda functions, that is useful to define inline functions.\n",
"\n",
"The syntax is *name_function = lambda (vars): operations *\n",
"\n",
"\n",
"Here is an example where we implement the same function twice.\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"def square(x):\n",
" return x*x\n",
"\n",
"square2 = lambda y: y*y"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"assert square(100)==square2(100)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Lambda functions are very convenient for short tasks. In particular it is an easy way to encapsulate one line instructions in a function. Note, though that it is very hard "
]
},
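{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"A typical use, sketched below with made-up data, is to pass a short function as an argument to another function, for example as the *key* of *sorted* or as the first argument of *map*."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"# Hypothetical list of (name, price) pairs, used only for illustration\n",
"prices = [('bond', 101.5), ('stock', 57.2), ('option', 3.1)]\n",
"\n",
"# Sort the pairs by price: the lambda function extracts the second entry of each pair\n",
"by_price = sorted(prices, key=lambda pair: pair[1])\n",
"print(by_price)\n",
"\n",
"# Apply a lambda function to every element of a list\n",
"squares = list(map(lambda y: y*y, [1,2,3]))\n",
"print(squares)"
]
},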
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Some final observations: \n",
"\n",
"- Functions are not forced to return a value (sometimes these are called *procedures*)\n",
"- More testing tools are provided on the [package](## 2. Packages) *unitest* \n"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"\n",
"## Packages"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Python comes with many functions already defined. Some of them come as part of the standard language (we have encountered some of them). However, the real power of Python comes from sets of functions put together in *packages*. \n",
"\n",
"Here are some of the ones we will use in this course (and are very useful in finance):\n",
"\n",
"- **math** : some mathematical functions \n",
"- **numpy** : vector and matrix capabilities and operations\n",
"- **scipy** : numerical scientific computing including integration, fixed points, solving ODEs, optimisation, …)\n",
"- **matplotlib** : plots\n",
"- **pandas** : database access and manipulation, and more plots routines\n",
"- **statutils**: Some statistical tools including test of hypothesis\n",
"\n",
"Some packages we will not use but sare very useful in finance include:\n",
"- **keras**: Keras is a high-level neural networks library\n",
"- **sklearn**: A library with tools for data mining and data analysis\n",
"\n",
"\n",
"Here and in the nest notebook, we look at *numpy* and *math*. We will learn about the following packages while making applications ion finance.\n",
"\n",
"Packages must be imported into the kernel we are excuting. By convention all imports should be done at the start of the program, and in the Jupyter case at the start of the notebook."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"# By convention this should be placed at the top of the file. But it can be used anywhere \n",
"import math # import the math package\n",
"import numpy as np # import the numpy package and create an alias for it 'np'\n",
"from math import sin, exp # import only the functions sin and exp that are located on the math package"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"After running the above code, we can use **all** functions on the math package. Here are some examples:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-2.4492935982947064e-16 \n",
" -5.0\n"
]
}
],
"source": [
"x = sin(2*math.pi) # Note we can use the function sin, but the constant pi has to be called from the math package as was not explicitly imported\n",
"y = math.log(exp(-5))\n",
"print(x,'\\n',y)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Some examples with numpy\n",
"\n",
"Let us now look at numpy. Numpy is a scientific library taht has been optimised to perform vector and matrix operations.\n",
"\n",
"We start by looking at how to create numpy objects. We can either transform another structure (for example a list) using the function *array*, or wwe can use one of the functions producing directly an array. Here are some examples:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a: [3 4 5] b: [3 4 5] c: [3. 4. 5.]\n",
"\n"
]
},
{
"data": {
"text/plain": [
"array([3, 4, 5])"
]
},
"execution_count": 13,
"metadata": {
},
"output_type": "execute_result"
}
],
"source": [
"a = np.array([3,4,5]) # an array with the numbers 3,4,5 \n",
"b = np.arange(3,6) # an arary with the numbers 3,4,5\n",
"c = np.linspace(3,5,3) # an array with the numbers 3.,4.,5.\n",
"\n",
"print('a:',a,' b:',b, 'c:',c)\n",
"print(type(a))\n",
"a"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Note that the above objects are of class *array*, which was defined in the package numpy. Observe also how the result is printed if no print function is invoked. \n",
"\n",
"Let us look at some simple operations before arrays:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1. a==b: [ True True True]\n",
"2. a+b: [ 6 8 10]\n",
"3. a*b: [ 9 16 25]\n",
"4. a/b: [1. 1. 1.]\n",
"5. a^b: [ 27 256 3125]\n",
"6. a==c: [ True True True]\n"
]
}
],
"source": [
"#Most operations are done piecewisely:\n",
"print('1. a==b:', a==b)\n",
"print('2. a+b:', a+b)\n",
"print('3. a*b:', a*b)\n",
"print('4. a/b:', a/b)\n",
"print('5. a^b:', a**b)\n",
"print('6. a==c:', a==c)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Note the difference with respect to lists: the operator *+* means vector addition. Note also that most operations like '*' and '/' are defined pointwise.\n",
"\n",
"Since operations are mostly pointwise, the sizes of vectors need to coincide or an error will be raised."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0 1 2 3 4 5 6 7 8 9] [3 4 5]\n"
]
},
{
"ename": "ValueError",
"evalue": "operands could not be broadcast together with shapes (10,) (3,) ",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0ma2\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0marange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mb\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0ma2\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0mb\u001b[0m \u001b[0;31m# This raises an error\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mValueError\u001b[0m: operands could not be broadcast together with shapes (10,) (3,) "
]
}
],
"source": [
"a2 = np.arange(10)\n",
"print(a2,b)\n",
"a2*b # This raises an error"
]
},
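{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"The error message mentions *broadcasting*: numpy can combine arrays of different shapes when the shapes are compatible, for instance an array and a scalar, or a column of shape (3,1) and a vector of shape (3,). Here is a minimal sketch (the variable names *v*, *col* and *table* are only illustrative):"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"import numpy as np\n",
"\n",
"v = np.array([3,4,5])\n",
"\n",
"print(v*2) # the scalar is applied to every entry of v\n",
"print(v+10)\n",
"\n",
"col = np.array([[1],[2],[3]]) # a column with shape (3,1)\n",
"table = col*v # shapes (3,1) and (3,) are compatible: the result has shape (3,3)\n",
"print(table.shape)\n",
"print(table)"
]
},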
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"\n",
"We also have some vector and matrix operations. We can, for example, find the dot product of two vectors of the same size with the operator *@*"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"50 50\n"
]
}
],
"source": [
"# Dot product of a and b\n",
"a_dot_b = a@b\n",
"# An alternative way of calculating it\n",
"a_dot_b2 = (a*b).sum() # Sum is a method available for arrays\n",
"print(a_dot_b, a_dot_b2)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"We will frequently make use of the possibility to generate (pseudo) random numbers following some given distributions. Numpy allows for this through its sub-module *random*. We can generate (pseudo-)random arrays and matrices of a given size. Here are some examples using the current prefered way of calling the generator function."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Uniform 2x2\n",
"[[0.87315032 0.66478065]\n",
" [0.20504409 0.30908408]]\n",
"(2, 2)\n",
"4\n",
"====\n",
"Gaussian 5x3\n",
"[[-1.24004184 1.51037019 0.58394874]\n",
" [ 0.00217341 -0.94616658 -0.29493058]\n",
" [-0.82462691 -0.57902036 0.28046327]\n",
" [ 0.22246655 -0.61206991 -1.19467205]\n",
" [-1.52000388 0.00257337 -0.77046294]]\n",
"(5, 3)\n",
"15\n"
]
}
],
"source": [
"#Random numbers\n",
"\n",
"#Initialise generator\n",
"rng = np.random.default_rng()\n",
"\n",
"#Uniform random numbers\n",
"print('Uniform 2x2')\n",
"c = rng.random((2,2)) # This creates a matrix of 2 x 2 of independent U[0,1] random numbers\n",
"print(c)\n",
"print (c.shape)\n",
"print(c.size)\n",
"\n",
"print(\"====\")\n",
"\n",
"print('Gaussian 5x3')\n",
"c = rng.standard_normal((5,3)) # This creates a matrix of 5 x 3 of independent standard Gaussian random numbers\n",
"print(c)\n",
"print (c.shape)\n",
"print(c.size)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"This produces arrays in dimension 2 or matrices. We can also see the second use of operator '@': matrix (and matrix-vector) multiplication:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 5.24109897, -5.25279897, -3.38764582, -7.75424022, -8.40203282])"
]
},
"execution_count": 19,
"metadata": {
},
"output_type": "execute_result"
}
],
"source": [
"c@(a.T) # This is the result of multiplying the matrix c (Gaussian matrix) and the vector a"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"The random generator can produce samples from different distributions: look at the help for rng.normal, rng.lognormal, rng.exponential...."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[32.95070156, 4.18834104, 37.53819265, 42.65346581, 8.19730704],\n",
" [38.69587919, 16.06854485, 30.7728049 , 2.88131377, -6.0297293 ]])"
]
},
"execution_count": 22,
"metadata": {
},
"output_type": "execute_result"
}
],
"source": [
"rng.normal(5,25,[2,5])"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"rng.exponential?"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"**Remark:** Once imported, we can make use of numpy functions within our function definitions. We can also import a module *within* a function definition, however in that case, the imported modules are available only within that function."
]
},
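{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"A small sketch of this remark (the function names are hypothetical): *gaussian_sample* relies on the alias *np* imported at notebook level, while *circle_area* imports *math* locally under the alias *m*, which is visible only inside the function. The sketch assumes that no variable called *m* has been defined elsewhere in the notebook."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"import numpy as np\n",
"\n",
"def gaussian_sample(n, seed=None):\n",
"    # Uses numpy through the alias 'np' defined outside the function\n",
"    rng_local = np.random.default_rng(seed)\n",
"    return rng_local.standard_normal(n)\n",
"\n",
"def circle_area(radius):\n",
"    import math as m # 'm' only exists while the function is running\n",
"    return m.pi*radius**2\n",
"\n",
"print(gaussian_sample(3, seed=2020)) # fixing the seed makes the sample reproducible\n",
"print(circle_area(1.0))\n",
"\n",
"try:\n",
"    m.pi # the alias 'm' is not defined outside circle_area\n",
"except NameError as err:\n",
"    print('Not available outside the function:', err)"
]
},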
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"\n",
"## Exercises\n",
"\n",
"1. Create a function that receives a positive integer $n$ and a probability $p \\in (0,1)$, and returns the mean and standard deviation of a binomial distribution with these parameters.\n",
"Your function must raise errors whenever the probability is outside the given range and if that the number n is not and integer greater than 0.\n",
"Use the statement *assert* to test your function on some known cases.\n"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"from math import sqrt\n",
"def binominal(n,p):\n",
" if p<0 or p>1:\n",
" raise ValueError(\"p is a probability and must be within (0,1)\") \n",
" if type(n)!= int:\n",
" raise TypeError(\"n must be integer\")\n",
" if n<=0:\n",
" raise ValueError(\"n must be positive\")\n",
" \n",
" return n*p,sqrt(n*p*(1-p))\n",
" \n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(1.0, 0.7071067811865476)"
]
},
"execution_count": 37,
"metadata": {
},
"output_type": "execute_result"
}
],
"source": [
"# assert binomial(2,0.5)==(1,sqrt(0.5))\n",
"binominal(2,0.5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"2. Using np.rand, create a function that receives a positive integer $n$ and a probability $p \\in (0,1)$, and returns an array with $n$ bernoulli trials with parameter $p$. \n",
"Your function must raise errors whenever the probability is outside the given range and if that the number n is not and integer greater than 0.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([1., 1., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 1.,\n",
" 0., 1., 1., 1., 0., 1., 0., 1., 1., 0., 1., 0., 0., 0., 1., 1., 1.,\n",
" 0., 0., 1., 0., 1., 1., 0., 0., 1., 0., 0., 0., 1., 1., 0., 1., 1.,\n",
" 0., 0., 0., 1., 1., 1., 0., 1., 0., 1., 1., 1., 0., 0., 0., 1., 1.,\n",
" 1., 0., 1., 0., 0., 1., 1., 0., 0., 1., 1., 0., 0., 0., 1., 1., 1.,\n",
" 1., 0., 0., 1., 1., 0., 1., 1., 1., 0., 0., 1., 0., 1., 1.])"
]
},
"execution_count": 50,
"metadata": {
},
"output_type": "execute_result"
}
],
"source": [
"def bernoulli_trials (n,p):\n",
" if p<0 or p>1:\n",
" raise ValueError(\"p is a probability and must be within (0,1)\") \n",
" if type(n)!= int:\n",
" raise TypeError(\"n must be integer\")\n",
" if n<=0:\n",
" raise ValueError(\"n must be positive\")\n",
" \n",
" mygen = np.random.default_rng() \n",
" return 1.*(mygen.random(n)<=p)\n",
" \n",
"bernoulli_trials(100,0.5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"3. Using the previous function, estimate the empirical mean and variance for n=[10,100,1000,10000]. Comment."
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Population values\n",
" \t\t Mean \t\t Var\n",
" \t\t 0.25 \t\t 0.1875\n",
"Enmpirical values\n",
"n \t\t Mean \t\t Var\n",
"1 \t\t 0.0 \t\t 0.0\n",
"10 \t\t 0.7 \t\t 0.21000000000000002\n",
"100 \t\t 0.37 \t\t 0.2331\n",
"1000 \t\t 0.265 \t\t 0.194775\n",
"10000 \t\t 0.2531 \t\t 0.18904039\n"
]
}
],
"source": [
"# This is a crude Monte Carlo. As the number of samples grows, we should approximate the population mean and variance.\n",
"\n",
"#Set p\n",
"p=0.25\n",
"\n",
"\n",
"print('Population values')\n",
"print(' \\t\\t Mean \\t\\t Var')\n",
"print(' \\t\\t', p, '\\t\\t', p*(1-p))\n",
"\n",
"print('Enmpirical values')\n",
"print('n \\t\\t Mean \\t\\t Var')\n",
"\n",
"for i in range(5):\n",
" n = 10**i\n",
" aux = bernoulli_trials(n,p)\n",
" emp_mean = aux.sum()/n\n",
" emp_var = emp_mean*(1 - emp_mean)\n",
" print(n,'\\t\\t',emp_mean,'\\t\\t',emp_var)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (system-wide)",
"language": "python",
"metadata": {
"cocalc": {
"description": "Python 3 programming language",
"priority": 100,
"url": "https://www.python.org/"
}
},
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
},
"toc": {
"base_numbering": 1,
"nav_menu": {
},
"number_sections": true,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": true,
"toc_position": {
},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}