CoCalc Public FilesPhys205 Computational Physics / ComputerClassesStudent / Phys205-Week01 / Phys205-Week01-Student.ipynbOpen with one click!

1

Computational Physics - Week 1: >>

-Table of contents week 1: >>

-Introduction to week 1: >>

-Installing Python and Jupyter software: >>

-Jupyter Notebooks revisited - cells: >>

-Markdown cells: >>

--Entering text and tables: >>

--Including links: >>

--Figures: >>

--Entering formulae: >>

-Pandas: >>

--Reading data into Pandas: >>

--Week 1 exercise 1: >>

-Data analysis: >>

--Plotting with matplotlib: >>

--Fitting with least_squares: >>

--Week 1 exercise 2: >>

--Week 1 exercise 3: >>

--Week 1 exercise 4: >>

2

The Computational Physics module (Phys205) builds on the knowledge you acquired in the Introduction to Computational Physics module (Phys105) you took in your first year. The first few weeks follow a format similar to that used in Phys105: there will be a lecture that introduces an aspect of Python and then a computer lab in which you will work through a Jupyter Notebook that contains some exercises. Demonstrators will be available to help with any difficulties you have, and you should get one of them to mark your work when you have finished it.

The second part of the course, which will start later this semester, consists of a set of short projects that you will tackle in groups of about three. Example projects could include determining Feigenbaum's constant by investigating the behaviour of the logistic map and writing a Monte Carlo to simulate the determination of $\pi$ using raindrops. Again, demonstrators will be available to help you with any technical problems you have. The projects will be assessed through short individual reports, which should be written as Jupyter Notebooks. If you have any ideas for projects you would like to try, discuss them with me and we will decide if they are suitable!

The final section of the course, which will run in semester 2, consists of longer projects. You will work on these in groups of six to eight. Possible topics include modelling traffic flow in the Birkenhead tunnel, and investigating the relationship between the availability of prey and the size and number of predators in a range of habitats. You will be asked to produce an initial project plan, describing how you plan to tackle the problem you have been given, and a group report at the end of the project. You will also give a presentation explaining your project to the other Phys205 groups. Again, if you have any ideas for topics you would like to tackle, please discuss them with me!

Today, we will start with a reminder on how to install the tools we will be using in the course (the Python programming language and Jupyter Notebooks) and then look at how data can be imported into a Python program and analysed using the Pandas package. (We will look at Pandas in more detail next week.) We will finish by revisiting the problem of fitting data using the `least_squares`

routine from the SciPy library, which you will be using extensively in Practical Physics II (Phys206).

Today, and in all our Phys205 sessions, please read through the relevant sections of the Notebook before you try the problems. (Of course, you can miss out revision topics, like how to use Markdown, if you are familiar with them!). If you don't understand anything in the Notebook, or have difficulties with the problems, ask one of the demonstrators for help. You can also consult the recommended textbooks (A student's guide to Python for physical modelling, or Learning scientific programming with Python), both of which are available in the libary. Alternatively, ask Google - there is lots of information on Python and Jupyter Notebooks out there!

3

Python and Jupyter Notebooks are available on the CoCalc system will be using in the Phys205 computing sessions (https://cocalc.com).

If you want to use Python and Jupyter Notebooks on your own computer, you can install the software required for Windows, Mac or Linux PCs as follows:

Go to the Anaconda web site https://www.anaconda.com/products/individual. Select Windows (64 or 32 bit), macOS or Linux, depending on the operating system running on your computer. Download the version of Anaconda labelled Python 3.8. Follow the installation instructions on the web site.

Once Anaconda is installed, open it and launch Jupyter Notebook to get started:

- On Windows, click the "Anaconda3" icon in the start menu and then "Jupyter Notebook".
- On Linux, open a terminal window , type in the command "jupyter notebook &" and presss
*Enter*. - On a Mac, click on "Anaconda-Navigator" in the LaunchPad and then "Jupyter Notebook" or open a terminal window, type in the command "jupyter notebook &" and press
*Enter*.

Jupyter Notebooks consist of cells in which nicely formatted documents can be written and cells which contain computer code. The language used to create the document cells is called Markdown, and we will use the Python in the code cells.

When you start a new Notebook, its first cell will be a code cell and any new cell you create will be a code cell by default. You can create a new cell below the current cell by clicking the "+" symbol in the Notebook menu bar or by using the *Insert* menu. (Click on *Insert*, then on *Insert Cell Above*, or *Insert Cell Below*, as appropriate.) To change a cell's type to Markdown, select the cell (by clicking in it) and then click *Cell, Cell Type* and *Markdown*. Alternatively, select the cell, press *Esc*, then *m* (for Markdown). You will have to click inside the cell (or press *Enter*) before it is selected and you are able to type in it!

A Markdown cell can be changed to a code cell using the *Cell* menu, or by typing *Esc, y*.

Cells can be deleted by selecting them and then using the *Edit* menu, or by pressing *Esc, d, d*.

Markdown is a way of writing nicely formatted text, allowing the inclusion of pictures, web links, videos and other features in your Notebooks.

To see how this cell was written using Markdown, double click on it.

To "run" or "compile" the cell, so the text is formatted nicely, select the cell (click in it) and press *Ctrl + Enter* (i.e. press the *Ctrl* and *Enter* keys at the same time, *Command + Enter* on a Mac). Alternatively, you can click on the *Run* button on the menu bar. If you press *Shift + Enter*, you will run the current cell and move to the next cell (or create a new cell below the current one if there isn't one already there).

Flipping between looking at the Markdown (double click) and the compiled cell (*Ctrl + Enter*) will allow you to see how Markdown can be used.

If you want to write some normal text, just type it into the cell. If you want to emphasize the text, you can *write in italics* (using asterisks before and after the section that you want to be in italics), or **make it bold** (using double asterisks). (There are *alternative ways of getting italics* and of **entering bold text**, using single and double underscores.) You can also ~~cross out text~~ (using two ~ symbols before and after the text to be deleted).

If you want a new paragraph, press *Enter* twice, so you produce an empty line.

If you want to start a new line without having a new paragraph, leave two spaces at the end of the line, like this:

This is a new line.

So is this.

If you want a numbered list, do this:

- This is the first entry.
- This is the second.
- And this the third.

Bulleted lists are also easy to produce:

- This is the first bullet.
- And this the second.

If you want to present information in a table, use the following syntax (double click to see the Markdown!):

Number | Angle (degrees) | Cosine of angle |
---|---|---|

0 | 0 | 1 |

1 | 30 | 0.866 |

2 | 45 | 0.707 |

3 | 60 | 0.5 |

4 | 90 | 0.0 |

**Table 1** *The value of the cosine of several angles.*

(You don't have to line up the Markdown columns; when you run the cell that will be done automatically, but it does make reading and editing the Markdown easier!) Notice that you have to enter the caption "by hand" below the table; table and figure numbers are not entered automatically.

If you want a title or heading, use the hash symbol. (One hash is a title, two a heading, three a subheading, etc.) For example, here is a subheading...

Links to pages and videos on the web can be included as is done above (and here) for the introduction to Markdown. Alternatively, the link can be direct, e.g. https://www.liverpool.ac.uk.

Links can also be "reference style". This is where you can find the UK google home page. You can put the link at the end of the paragraph or cell. This makes the Markdown a little easier to read. There is no visible difference in the text that results when you run the cell.

Links can be made to sections or figures in the Notebook. For example, back to the section Entering text and tables. Double clicking this cell will allow you to see how the section is labelled (using a "#" followed by the section title with the spaces replaced by hyphens, and how a link to that label is created: the label is not allowed to contain any spaces!).

Images stored on your computer can be added using the syntax used below (double click the cell to see it!). The first example uses the path relative to the folder/directory in which the Notebook is saved:

**Figure 1** *Cosine as a function of angle*

The text between the quotation marks ("Plot of cosine") is what is shown if you hover over the image. The absolute path can also be used in Markdown as is shown below...though you can't go back further than the Jupyter start folder/directory. These images are not automatically incorporated if you convert your markdown to pdf, so the following section is "commented out". (Anything between the "<", "!" and "--" and the "--" and ">" symbols is treated as a comment, i.e. an explanatory remark, and is not displayed when the Markdown is compiled.)

Figures from the web can be added in the markdown using the syntax shown below.

**Figure 1.1** *This figure shows the GitHub logo*

(Again, this has been commented out, as there is no automatic location and conversion of web images to pdf, so the above lines would have spoiled the pdf version of this Notebook.)

Note, if you want to create a pdf file of this Notebook including the above image, you will have to upload the image to CoCalc and use a local link!

Mathematical formulae are entered using LaTeX. How this can be done is best illustrated with a few examples. A formula can be entered "inline" like this: $E = mc^2$. (Notice how the dollar signs "bracket" the mathematical formula.)

Formulae can also be placed on their own line in the centre of the page using two dollar symbols:

$\sin^2 x + \cos^2 x = 1.$

Notice how functions like $\sin$ and $\cos$ (and $\log$, $\exp$ etc.) are entered, and how superscripts ($x^2$) and subscripts ($p_0$) can be obtained. If you need more than one character in a superscript (or subscript), enclose the relevant section in curly brackets, e.g. $x_{max}$.

Note, it is important in Markdown to ensure there are no spaces between the "$" and the characters in inline formulae. Although spaces are allowed by the LaTeX standard, they can cause problems when you try and run LaTex on Notebooks, e.g. when using *File, Export Notebook as..., LaTeX*, or *File, Export Notebook as..., PDF*.

Fractions are obtained like this $\frac {1}{2}$, and a vast array of symbols is available, for example: $\sqrt {2}$, $2 \approx 2.5$, $\int_0^1 x^2 dx$, $\sum_{n = 0}^{15} x_n$. (See here for more.) Brackets which expand to the required height can be entered using $\left( \right)$ or $\left[ \right]$. Greek letters are written $\alpha$, $\beta$ etc. An example combining some of these features is the formula for the Fermi function:

$F(\epsilon ) = \frac{1}{\exp \left [ \frac{\epsilon - \mu}{kT} \right ] + 1}.$

Finally, "inline" segments of computer code can be indicated using reverse quotes, for example: `print("This is a Python 3 print statement")`

. Three reverse quotes, with a tag indicating the relevant language, can be used to produce blocks of code. We shall be interested in Python, so we will be seeing things like:

```
import numpy as np
import matplotlib.pyplot as plt
#
print("This illustrates how Markdown can be used to format a block of code")
#
for n in range(0, nMax + 1):
print("n =",n)
```

And that's about all we need to know about Markdown!

Of course, the power of Jupyter Notebooks is that they allow documents and figures in Markdown cells like this one to be combined with computer code. We will now exploit this by taking a first look at the Pandas package.

4

Pandas is a Python library that provides high-performance, easy-to-use data structures and data analysis tools. It can be thought of as a powerful version of Excel, with a lot more features! It is available in CoCalc (and should also be available on your computer as part of the Anaconda3 package if you have installed that), so in order to use it we just need to import it. We will also import `numpy`

and `matplotlib.pyplot`

, as we will need these, and use the `%matplotlib inline`

"magic" command to ensure our plots appear in the Notebook:

5

In [15]:

# <!-- Student --> import pandas as pd import numpy as np import matplotlib.pyplot as plt %matplotlib inline

6

The most useful Pandas data structures are known as DataFrames. Data can be read into them from a wide range of formats including from comma-separated-variable (csv) files, Excel spreadsheets (xls or xlsx) and the web (html). Here, we will read an Excel (xlsx) file into a DataFrame called `df`

in our Notebook.

7

In [16]:

#<!-- Student --> df = pd.read_excel("Spectrum01.xlsx") df

8

Unnamed: 0 | xData | xError | yData | yError | |
---|---|---|---|---|---|

0 | 0 | 0.000000 | 0.048113 | 3.632305 | 1.905861 |

1 | 1 | 0.172414 | 0.048113 | 1.791350 | 1.338413 |

2 | 2 | 0.344828 | 0.048113 | 3.368522 | 1.835353 |

3 | 3 | 0.517241 | 0.048113 | 4.225319 | 2.055558 |

4 | 4 | 0.689655 | 0.048113 | 2.573355 | 1.604168 |

5 | 5 | 0.862069 | 0.048113 | 8.720353 | 2.953024 |

6 | 6 | 1.034483 | 0.048113 | 7.855142 | 2.802703 |

7 | 7 | 1.206897 | 0.048113 | 4.901981 | 2.214042 |

8 | 8 | 1.379310 | 0.048113 | 4.629510 | 2.151630 |

9 | 9 | 1.551724 | 0.048113 | 8.291520 | 2.879500 |

10 | 10 | 1.724138 | 0.048113 | 16.539529 | 4.066882 |

11 | 11 | 1.896552 | 0.048113 | 44.409078 | 6.664014 |

12 | 12 | 2.068966 | 0.048113 | 132.682155 | 11.518774 |

13 | 13 | 2.241379 | 0.048113 | 151.545870 | 12.310397 |

14 | 14 | 2.413793 | 0.048113 | 86.623757 | 9.307188 |

15 | 15 | 2.586207 | 0.048113 | 25.881347 | 5.087371 |

16 | 16 | 2.758621 | 0.048113 | 14.013225 | 3.743424 |

17 | 17 | 2.931034 | 0.048113 | 11.686240 | 3.418514 |

18 | 18 | 3.103448 | 0.048113 | 10.783761 | 3.283864 |

19 | 19 | 3.275862 | 0.048113 | 13.621517 | 3.690734 |

20 | 20 | 3.448276 | 0.048113 | 11.890779 | 3.448301 |

21 | 21 | 3.620690 | 0.048113 | 18.899116 | 4.347311 |

22 | 22 | 3.793103 | 0.048113 | 13.685586 | 3.699403 |

23 | 23 | 3.965517 | 0.048113 | 18.520957 | 4.303598 |

24 | 24 | 4.137931 | 0.048113 | 19.972319 | 4.469040 |

25 | 25 | 4.310345 | 0.048113 | 29.222153 | 5.405752 |

26 | 26 | 4.482759 | 0.048113 | 17.229880 | 4.150889 |

27 | 27 | 4.655172 | 0.048113 | 26.286534 | 5.127039 |

28 | 28 | 4.827586 | 0.048113 | 25.967242 | 5.095806 |

29 | 29 | 5.000000 | 0.048113 | 30.105888 | 5.486883 |

We have the data we want, but also an extra column full of row indices, and a column header **Unnamed:0** that we don't want. We will see next week how we can delete (and add) columns and rows in DataFrames. For now, we modify the way we read in the data so we don't get the unwanted material in the first place!

9

In [17]:

#<!-- Student --> df = pd.read_excel("Spectrum01.xlsx", index_col = 0) df

10

xData | xError | yData | yError | |
---|---|---|---|---|

0 | 0.000000 | 0.048113 | 3.632305 | 1.905861 |

1 | 0.172414 | 0.048113 | 1.791350 | 1.338413 |

2 | 0.344828 | 0.048113 | 3.368522 | 1.835353 |

3 | 0.517241 | 0.048113 | 4.225319 | 2.055558 |

4 | 0.689655 | 0.048113 | 2.573355 | 1.604168 |

5 | 0.862069 | 0.048113 | 8.720353 | 2.953024 |

6 | 1.034483 | 0.048113 | 7.855142 | 2.802703 |

7 | 1.206897 | 0.048113 | 4.901981 | 2.214042 |

8 | 1.379310 | 0.048113 | 4.629510 | 2.151630 |

9 | 1.551724 | 0.048113 | 8.291520 | 2.879500 |

10 | 1.724138 | 0.048113 | 16.539529 | 4.066882 |

11 | 1.896552 | 0.048113 | 44.409078 | 6.664014 |

12 | 2.068966 | 0.048113 | 132.682155 | 11.518774 |

13 | 2.241379 | 0.048113 | 151.545870 | 12.310397 |

14 | 2.413793 | 0.048113 | 86.623757 | 9.307188 |

15 | 2.586207 | 0.048113 | 25.881347 | 5.087371 |

16 | 2.758621 | 0.048113 | 14.013225 | 3.743424 |

17 | 2.931034 | 0.048113 | 11.686240 | 3.418514 |

18 | 3.103448 | 0.048113 | 10.783761 | 3.283864 |

19 | 3.275862 | 0.048113 | 13.621517 | 3.690734 |

20 | 3.448276 | 0.048113 | 11.890779 | 3.448301 |

21 | 3.620690 | 0.048113 | 18.899116 | 4.347311 |

22 | 3.793103 | 0.048113 | 13.685586 | 3.699403 |

23 | 3.965517 | 0.048113 | 18.520957 | 4.303598 |

24 | 4.137931 | 0.048113 | 19.972319 | 4.469040 |

25 | 4.310345 | 0.048113 | 29.222153 | 5.405752 |

26 | 4.482759 | 0.048113 | 17.229880 | 4.150889 |

27 | 4.655172 | 0.048113 | 26.286534 | 5.127039 |

28 | 4.827586 | 0.048113 | 25.967242 | 5.095806 |

29 | 5.000000 | 0.048113 | 30.105888 | 5.486883 |

That looks better!

11

Use Excel (on your own computer!) to create a spreadsheet called TimeTable.xlsx with columns labelled with the days of the week and two rows, one labelled "morning" and the other "afternoon". Enter a one in each of the cells in which you have a lecture, problems class or laboratory and a zero where you are free. Read the spreadsheet into a DataFrame called dfTimeTable in the cell below this one and then display it.

12

In [18]:

print("dfTimeTable:") dfTimeTable = pd.read_excel("TimeTable.xlsx", index_col = 0) dfTimeTable

13

dfTimeTable:

Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday | |
---|---|---|---|---|---|---|---|

Morning | 1 | 1 | 0 | 1 | 0 | 0 | 0 |

Afternoon | 1 | 0 | 0 | 1 | 1 | 0 | 0 |

Last year, we used Python to visualise data in various types of plots using `matplotlib.pyplot`

and to fit data using the `least_squares`

routine from the SciPy library. We will be doing this again in the year two laboratories (Phys206), so we will do some quick revision on plotting and fitting here.

Let's first plot the data from Spectrum01.xslx we have read into our DataFrame `df`

. We will see next week how we can plot directly from Pandas. Here, we will extract the data from the DataFrame so we can use the plotting procedures we learnt about last year. Let's look at one of the columns from the DataFrame:

14

In [19]:

#<!-- Student --> df.xData

15

0 0.000000
1 0.172414
2 0.344828
3 0.517241
4 0.689655
5 0.862069
6 1.034483
7 1.206897
8 1.379310
9 1.551724
10 1.724138
11 1.896552
12 2.068966
13 2.241379
14 2.413793
15 2.586207
16 2.758621
17 2.931034
18 3.103448
19 3.275862
20 3.448276
21 3.620690
22 3.793103
23 3.965517
24 4.137931
25 4.310345
26 4.482759
27 4.655172
28 4.827586
29 5.000000
Name: xData, dtype: float64

We see that we get the data we want, but also the indices which we don't. If we turn the column into a numpy array, we get the format we need for making a plot:

16

In [20]:

#<!-- Student --> xData = np.array(df.xData) xError = np.array(df.xError) yData = np.array(df.yData) yError = np.array(df.yError) print(yData)

17

[ 3.63230537 1.79135007 3.36852232 4.22531884 2.57335463
8.72035345 7.85514181 4.90198074 4.62951031 8.29152015
16.53952863 44.4090777 132.68215463 151.54587047 86.62375697
25.88134687 14.01322529 11.68624003 10.78376086 13.62151677
11.8907788 18.89911571 13.68558625 18.52095703 19.97231885
29.22215296 17.22987975 26.28653374 25.96724159 30.10588789]

Now we can create a figure, define the axes we need and make a plot of the data.

18

In [21]:

#<!-- Student --> # plt.figure(figsize = (8, 6)) plt.errorbar(xData, yData, xerr = xError, yerr = yError, color = 'r', marker = 'o', linestyle = '') plt.title('Data from DataFrame') plt.xlabel('x') plt.ylabel('y') plt.grid(color = 'g') plt.savefig('Spect01.png') plt.show()

19

The next step is to fit this data. From the plot, we can see that it looks like a peak on top of some background. The data could perhaps be the measurements of the photon energy spectrum produced when a radiaoctive material decays. We will assume the peak has a gaussian shape and that the background can be described by a second order polynomial. The formula for a gaussian function is:

$G(x) = \frac{N}{\sqrt{2\pi}\sigma}\exp\left[ -\frac{(x - \mu)^2}{2 \sigma^2}\right].$

Here, $N$ describes the number of events in the gauussian peak, $\mu$ is the central (mean) value of the peak and $\sigma$ its standard deviation.

The second order (background) polynomial can be written:

$P(x) = a + bx + cx^2.$

We want to find the values of the parameters $N$, $\mu$, $\sigma$, $a$, $b$ and $c$ which best describe the data. That is, we want to choose these values so that the function $F(x) = G(x) + P(x)$ is as close to all of the measured points as possible. We could do this by finding the parameters which minimise the distance between the measured points $\left(x_i \pm \varepsilon_{x,i}, y_i \pm \varepsilon_{y,i} \right)$ and the value of the function at those points $\left(x_i, F(x_i) \right)$. That is, choose $N$, $\mu$, $\sigma$, $a$, $b$ and $c$ so that $\sum \left| y_i - F(x_i) \right|$ is as small as possible.

20

Explain why we have to minimise $\sum \left| y_i - F(x_i) \right|$ rather than $\sum y_i - F(x_i)$.

21

Choosing to minimise the sum of differences may not only include positive numbers. If data points go over and under the chosen function (giving negative and positive values from the difference) the sum would be closer to zero giving a false impression that the function closely matches the data points. Using the modulus keeps any differences between the data points and the function positive so the sum will give an accurate representation of how closely the function matches the data points given.

22

In which of the Python functions below:

- Is $\frac{\partial F}{\partial x}$ calculated?
- Is $\chi = \frac{y - F(x)}{e}$ calculated?

23

In [1]:

# <!-- Student --> from scipy.optimize import least_squares # def fitFunc(p, x): ''' Gaussian plus polynomial (order 2) ''' f = p[0]/(np.sqrt(2*np.pi)*p[2])*np.exp(-(x - p[1])**2/(2*p[2]**2)) + p[3] + p[4]*x + p[5]*x**2 return f # 1)Calculates derivative with respect to x ----------------------------------------------------- def fitFuncDiff(p, x): ''' Differential of fit function ''' df = p[0]/(np.sqrt(2*np.pi)*p[2])*(x - p[1])/p[2]**2*np.exp(-(x - p[1])**2/(2*p[2]**2)) + p[4] + 2*p[5]*x return df # 2)Calculates Chi ----------------------------------------------------------------------------- def fitChi(p, x, y, xerr, yerr): ''' Calculation of chi for fit function and data ''' chi = (y - fitFunc(p, x))/(np.sqrt(yerr**2 + fitFuncDiff(p, x)**2*xerr**2)) return chi

24

We provide initial values of the `nParams`

parameters in `p`

then run the fit. The better our initial guesses are, the more likely the fit is to find the correct values. We can use the plot we have made to help us pick sensible values. E.g. the mean of the gaussion `mu`

can be seen to be about 2.5. We can also provide upper and lower bounds for the values of `p`

usung the arrays `lBounds`

and `uBounds`

. The values used here ($\pm\infty$) clearly have no effect! We do not have to give bounds; if they are not needed, just omit the argument `bounds`

from the call to `least_squares`

.

25

In [23]:

# <!-- Student --> # # Set initial values of fit parameters nParams = 6 pInit = [50.0, 2.0, 0.2, 1.0, 1.0, 1.0] #lBounds = [-np.inf, -np.inf, -np.inf, -np.inf, -np.inf, -np.inf] #uBounds = [np.inf, np.inf, np.inf, np.inf, np.inf, np.inf] out = least_squares(fitChi, pInit, args = (xData, yData, xError, yError)) #

26

The values returned by `least_squares`

are all contained in the object `out`

. The first one we look at we label `fitOK`

. This is a boolean variable that tells us if the fit has run. We also transfer the fitted parameters into the array `pFinal`

. This contains the fitted values of the parameters $N$, $\mu$, $\sigma$, $a$, $b$ and $c$ in the same order that we used to `p`

.

We also calculate and print out the values of the statistics $\chi^2$ and $\chi^2$/NDF which describe how well the fitted function matches the data. We calculate the errors on the fitted parameters by inverting the squared Jacobian matrix to get the covariance matrix `covar`

. The diagonal components of `covar`

are the squared errors of the fitted parameters. E.g. the error on $N$ is `covar[0, 0]`

, the error on $\mu$ is `covar[1, 1]`

, the error on $\sigma$ is `covar[2, 2]`

etc.

Finally, we plot the data again and add a line describing the fitted function.

27

In [24]:

# <!-- Student --> # fitOK = out.success # # Test if fit failed if not fitOK: print(" ") print("Fit failed") else: # # Fit worked, so get output pFinal = out.x N = pFinal[0] mu = pFinal[1] sigma = pFinal[2] a = pFinal[3] b = pFinal[4] c = pFinal[5] # # Calculate chis**2 per point, summed chi**2 and chi**2/NDF chisqArr = fitChi(pFinal, xData, yData, xError, yError)**2 chisq = np.sum(chisqArr) nPoints = len(xData) NDF = nPoints - nParams redchisq = chisq/NDF # np.set_printoptions(precision = 3) print(" ") print("Fit quality:") print("chisq per point = \n",chisqArr) print("chisq = {:5.2f}, chisq/NDF = {:5.2f}.".format(chisq, redchisq)) # # Compute covariance jMat = out.jac jMat2 = np.dot(jMat.T, jMat) detJmat2 = np.linalg.det(jMat2) # # Check whether cavariance matrix can be calculated if detJmat2 < 1E-32: # # It can't, print out values of parameters without errors print("Value of determinat detJmat2",detJmat2) print("Matrix singular, error calculation failed.") print(" ") print("Parameters returned by fit:") print("N = {:5.2f}".format(N)) print("mu = {:5.2f}".format(mu)) print("sigma = {:5.2f}".format(sigma)) print("a = {:5.2f}".format(a)) print("b = {:5.2f}".format(b)) print("c = {:5.2f}".format(c)) print(" ") errN = 0.0 errmu = 0.0 errsigma = 0.0 erra = 0.0 errb = 0.0 errc = 0.0 else: # # Covariance matrix is OK, calculate errors covar = np.linalg.inv(jMat2) errN = np.sqrt(covar[0, 0]) errmu = np.sqrt(covar[1, 1]) errsigma = np.sqrt(covar[2, 2]) erra = np.sqrt(covar[3, 3]) errb = np.sqrt(covar[4, 4]) errc = np.sqrt(covar[5, 5]) # print(" ") print("Parameters returned by fit:") print("N = {:5.2f} +- {:5.2f}".format(N, errN)) print("mu = {:5.2f} +- {:5.2f}".format(mu, errmu)) print("sigma = {:5.2f} +- {:5.2f}".format(sigma, errsigma)) print("a = {:5.2f} +- {:5.2f}".format(a, erra)) print("b = {:5.2f} +- {:5.2f}".format(b, errb)) print("c = {:5.2f} +- {:5.2f}".format(c, errc)) print(" ") # # Calculate fitted function values (using lots of points so we get a nice smooth curve!) nPlot = 100 xPlot = np.linspace(np.min(xData), np.max(xData), nPlot) fitPlot = fitFunc(pFinal, xPlot) # # Plot data fig = plt.figure(figsize = (8, 6)) plt.title('Data with fit') plt.xlabel('x') plt.ylabel('y') plt.errorbar(xData, yData, xerr = xError, yerr = yError, color = 'r', marker = 'o', linestyle = '', label = "Data") plt.plot(xPlot, fitPlot, color = 'b', marker = '', linestyle = '-', label = "Fit") edge = 0.1 xLow = np.min(xData) - edge*np.abs(np.max(xData) - np.min(xData)) xHigh = np.max(xData) + edge*np.abs(np.max(xData) - np.min(xData)) plt.xlim(xLow, xHigh) yLow = np.min(yData) - edge*np.abs(np.max(yData) - np.min(yData)) yHigh = np.max(yData) + edge*np.abs(np.max(yData) - np.min(yData)) plt.ylim(yLow, yHigh) plt.grid(color = 'g') plt.legend(loc = 2) plt.show()

28

Fit quality:
chisq per point =
[1.576e-01 8.464e-01 6.616e-03 1.375e-01 5.400e-01 2.452e+00 1.448e+00
3.566e-05 1.357e-01 3.286e-01 5.507e-02 3.160e-01 1.339e-02 1.226e-04
2.489e-03 6.494e-02 5.588e-02 8.146e-03 3.951e-01 4.195e-03 7.646e-01
4.358e-01 8.821e-01 1.168e-03 6.141e-03 2.350e+00 1.456e+00 2.628e-01
2.948e-02 4.114e-01]
chisq = 13.57, chisq/NDF = 0.57.
Parameters returned by fit:
N = 72.27 +- 7.10
mu = 2.19 +- 0.02
sigma = 0.19 +- 0.02
a = 2.88 +- 1.01
b = 0.72 +- 1.54
c = 0.80 +- 0.34

Read in the spreadsheet Spectrum02.xlsx. Plot the data. Try to fit it using a gaussian to describe the peak and a straight line to represent the background.

29

In [11]:

import pandas as pd import numpy as np import matplotlib.pyplot as plt %matplotlib inline from scipy.optimize import least_squares df2 = pd.read_excel("Spectrum02.xlsx", index_col = 0) df2 #correct format for data xData2 = np.array(df2.xData) xError2 = np.array(df2.xError) yData2 = np.array(df2.yData) yError2 = np.array(df2.yError) #plot data plt.figure(figsize = (8, 6)) plt.errorbar(xData2, yData2, xerr = xError2, yerr = yError2, color = 'r', marker = 'o', linestyle = '') plt.title('Data from DataFrame2') plt.xlabel('x') plt.ylabel('y') plt.grid(color = 'g') plt.savefig('Spect02.png') plt.show() #define functions def fitFunc(p, x): ''' Gaussian plus polynomial (order 2) ''' f2 = p[0]/(np.sqrt(2*np.pi)*p[2])*np.exp(-(x - p[1])**2/(2*p[2]**2)) + p[3] + p[4]*x + p[5]*x**2 return f2 # 1)Calculates derivative with respect to x ----------------------------------------------------- def fitFuncDiff(p, x): ''' Differential of fit function ''' df2 = p[0]/(np.sqrt(2*np.pi)*p[2])*(x - p[1])/p[2]**2*np.exp(-(x - p[1])**2/(2*p[2]**2)) + p[4] + 2*p[5]*x return df2 # 2)Calculates Chi ----------------------------------------------------------------------------- def fitChi(p, x, y, xerr, yerr): ''' Calculation of chi for fit function and data ''' chi = (y - fitFunc(p, x))/(np.sqrt(yerr**2 + fitFuncDiff(p, x)**2*xerr**2)) return chi # Set initial values of fit parameters nParams = 6 pInit = [50.0, 2.0, 0.2, 1.0, 1.0, 1.0] #lBounds = [-np.inf, -np.inf, -np.inf, -np.inf, -np.inf, -np.inf] #uBounds = [np.inf, np.inf, np.inf, np.inf, np.inf, np.inf] out = least_squares(fitChi, pInit, args = (xData2, yData2, xError2, yError2))

30

In [14]:

#fit function and display function with data fitOK = out.success # # Test if fit failed if not fitOK: print(" ") print("Fit failed") else: # # Fit worked, so get output pFinal = out.x N = pFinal[0] mu = pFinal[1] sigma = pFinal[2] a = pFinal[3] b = pFinal[4] c = pFinal[5] # # Calculate chis**2 per point, summed chi**2 and chi**2/NDF chisqArr = fitChi(pFinal, xData2, yData2, xError2, yError2)**2 chisq = np.sum(chisqArr) nPoints = len(xData) NDF = nPoints - nParams redchisq = chisq/NDF # np.set_printoptions(precision = 3) print(" ") print("Fit quality:") print("chisq per point = \n",chisqArr) print("chisq = {:5.2f}, chisq/NDF = {:5.2f}.".format(chisq, redchisq)) # # Compute covariance jMat = out.jac jMat2 = np.dot(jMat.T, jMat) detJmat2 = np.linalg.det(jMat2) # # Check whether cavariance matrix can be calculated if detJmat2 < 1E-32: # # It can't, print out values of parameters without errors print("Value of determinat detJmat2",detJmat2) print("Matrix singular, error calculation failed.") print(" ") print("Parameters returned by fit:") print("N = {:5.2f}".format(N)) print("mu = {:5.2f}".format(mu)) print("sigma = {:5.2f}".format(sigma)) print("a = {:5.2f}".format(a)) print("b = {:5.2f}".format(b)) print("c = {:5.2f}".format(c)) print(" ") errN = 0.0 errmu = 0.0 errsigma = 0.0 erra = 0.0 errb = 0.0 errc = 0.0 else: # # Covariance matrix is OK, calculate errors covar = np.linalg.inv(jMat2) errN = np.sqrt(covar[0, 0]) errmu = np.sqrt(covar[1, 1]) errsigma = np.sqrt(covar[2, 2]) erra = np.sqrt(covar[3, 3]) errb = np.sqrt(covar[4, 4]) errc = np.sqrt(covar[5, 5]) # print(" ") print("Parameters returned by fit:") print("N = {:5.2f} +- {:5.2f}".format(N, errN)) print("mu = {:5.2f} +- {:5.2f}".format(mu, errmu)) print("sigma = {:5.2f} +- {:5.2f}".format(sigma, errsigma)) print("a = {:5.2f} +- {:5.2f}".format(a, erra)) print("b = {:5.2f} +- {:5.2f}".format(b, errb)) print("c = {:5.2f} +- {:5.2f}".format(c, errc)) print(" ") # # Calculate fitted function values (using lots of points so we get a nice smooth curve!) nPlot = 150 xPlot = np.linspace(np.min(xData2), np.max(xData2), nPlot) fitPlot = fitFunc(pFinal, xPlot) # # Plot data fig = plt.figure(figsize = (8, 6)) plt.title('Data with fit') plt.xlabel('x') plt.ylabel('y') plt.errorbar(xData2, yData2, xerr = xError2, yerr = yError2, color = 'r', marker = 'o', linestyle = '', label = "Data") plt.plot(xPlot, fitPlot, color = 'b', marker = '', linestyle = '-', label = "Fit") edge = 0.1 xLow = np.min(xData2) - edge*np.abs(np.max(xData2) - np.min(xData2)) xHigh = np.max(xData2) + edge*np.abs(np.max(xData2) - np.min(xData2)) plt.xlim(xLow, xHigh) yLow = np.min(yData2) - edge*np.abs(np.max(yData2) - np.min(yData2)) yHigh = np.max(yData2) + edge*np.abs(np.max(yData2) - np.min(yData2)) plt.ylim(yLow, yHigh) plt.grid(color = 'g') plt.legend(loc = 2) plt.show()

31

Fit quality:
chisq per point =
[8.549e-02 5.061e-04 7.901e-01 3.892e-01 1.487e+00 2.261e+00 2.801e+00
8.833e-01 8.478e-02 1.316e-04 5.157e-02 6.244e-04 5.941e-01 7.644e-03
2.395e-01 9.467e-02 6.837e-02 3.156e-01 6.410e-02 6.429e+00 1.245e+00
1.500e+00 1.435e-01 4.225e+00 7.252e-01]
chisq = 24.49, chisq/NDF = 1.02.
Parameters returned by fit:
N = -105.86 +- 8.56
mu = 3.20 +- 0.03
sigma = -0.28 +- 0.02
a = 5.67 +- 2.98
b = -0.35 +- 2.07
c = 0.30 +- 0.27

That's the end of Week 1 for Phys205. You will have a chance to practice using the plotting and fitting routines we have revised here in Practical Physics II (Phys 206)!

32