**Run this code in the cell below to get started.**

```
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
#
# Now make some data
np.random.seed(20150108) # seed the random number generator so everyone has the same data
time = np.linspace(0, 2, 101)
volts = 2.5 * np.sin(2 * np.pi * time - 1.0) + np.random.normal(0,0.1,len(time))
```


A very important data analysis skill is fitting a model to your data. Let's set up the situation with the data we simulated above. You have measured the output voltage from a system and it is delayed by some amount from your input signal which has the form $V(t) = 1.0 \sin(2 \pi t)$. You want to know the amplitude of the output and the time delay.

You almost **always** want to start your data analysis by plotting your data, so that is what you should do first in the cell below. Plot the data as dots and your input as a red line. Your plot should look like the figure below.

**Plot the data in array `volts` and create the array for `input`.**


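```
fig = plt.figure()
ax = fig.add_subplot(111)
# Input signal V(t) = 1.0 sin(2 pi t); named vInput so it does not shadow the built-in input()
vInput = 1.0 * np.sin(2 * np.pi * time)
ax.plot(time, vInput, '-r', label="Input")  # input as a red line
ax.plot(time, volts, '.b', label="Data")    # measured data as blue dots
ax.set_xlabel("Time (s)")
ax.set_ylabel("Output (V)")
ax.set_title("Fun Plot 2")
ax.legend()  # uses the label given to each plot call
```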


`curve_fit`

The function we are going to use to fit a model to data is called `curve_fit`. We are going to treat it as a black box, but what it does is adjust the model parameters to minimize the sum of the squared vertical distances between each data point and the model function evaluated at the same x-value. This function is in the `optimize` module, which in turn is in the package `scipy`.
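To make the black box a little less opaque, here is a minimal sketch of the quantity a least-squares fit drives down: the sum of squared vertical residuals. The helper name `sumSquaredResiduals` and the two trial parameter sets are illustrative assumptions, not part of `scipy`; the sine model inside it is the delayed-sine form this notebook fits.

```python
import numpy as np

# Same simulated data as at the top of the notebook
np.random.seed(20150108)
time = np.linspace(0, 2, 101)
volts = 2.5 * np.sin(2 * np.pi * time - 1.0) + np.random.normal(0, 0.1, len(time))

def sumSquaredResiduals(amp, t0):
    """Hypothetical helper: sum of squared vertical distances between data and model."""
    model = amp * np.sin(2 * np.pi * (time - t0))
    return np.sum((volts - model) ** 2)

# A poor guess gives a large value; parameters close to the simulated ones give a small value.
poorGuess = sumSquaredResiduals(1.0, 0.0)
goodGuess = sumSquaredResiduals(2.5, 1.0 / (2 * np.pi))
print(poorGuess, goodGuess)
```

A fitting routine searches for the parameter values that make this number as small as possible.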

There are several steps to using `curve_fit`.

In our case, we know the time dependence of the output goes as $\sin(2\pi t)$. What we don't know (so what we have to fit for) is the amplitude and the starting time. Let's call the amplitude `amp` and the start time `t0`, so the model function we are fitting is $V_{out} = amp \sin(2 \pi (t - t_0))$. The function `curve_fit` needs a model function that takes the independent variable as its first argument. The remaining arguments, however many there are, are the parameters of your fitting function.
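As a sketch of that calling convention, here is a deliberately simple straight-line model rather than the sine model of this exercise. The names `lineModel`, `xData`, and `yData` and the numbers 3.0 and -1.0 are made up for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def lineModel(x, slope, intercept):
    """Hypothetical model: independent variable first, then the fit parameters."""
    return slope * x + intercept

# Noise-free made-up data from a known line, so the fit should recover 3.0 and -1.0
xData = np.linspace(0.0, 1.0, 11)
yData = 3.0 * xData - 1.0

# curve_fit calls lineModel(xData, slope, intercept) repeatedly with trial parameter values
lineParams, lineCov = curve_fit(lineModel, xData, yData)
print(lineParams)
```

Because `lineModel` has two arguments after `x`, `curve_fit` fits two parameters and returns them in the same order.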

Here is a skeleton of the function. In the cell below you have to modify the function so that it works and returns an array of the model function values, one for each x-value.

```
def vModel(time, amp, t0):
    return  # put your return expression in here
```

**Modify the function below so that it returns the model function above.**


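```
def vModel(time, amp, t0):
    # Sine wave with amplitude amp, delayed by t0; works element-wise on a numpy array
    return amp * np.sin(2 * np.pi * (time - t0))
```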


Well, our first plot above shows a reasonable starting place, but we can refine our initial guess for the amplitude by setting it to
$amp_{start} = \frac{\max(data) - \min(data)}{2}$
You can find the *max* and *min* using the `numpy` functions `np.max()` and `np.min()`.

For our initial `t0`, just try `0.0`.

Use the cell below to set the initial guesses `ampStart` and `t0Start`.
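One way to write this, as a minimal sketch: the variable names `ampStart` and `t0Start` match the ones passed to `curve_fit` later in the notebook, and the data lines are repeated only so the snippet runs on its own.

```python
import numpy as np

# Same simulated data as at the top of the notebook
np.random.seed(20150108)
time = np.linspace(0, 2, 101)
volts = 2.5 * np.sin(2 * np.pi * time - 1.0) + np.random.normal(0, 0.1, len(time))

# Amplitude guess: half the peak-to-peak range of the data
ampStart = (np.max(volts) - np.min(volts)) / 2.0
# Delay guess: start with no delay at all
t0Start = 0.0
print(ampStart, t0Start)
```

The guess only has to be close enough for the fit to find its way to the best values, so noise in the max and min is not a concern here.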


Running the fit is easy. Just run the code

```
paramsFit, paramsCovariance = \
    curve_fit(vModel, time, volts, p0=[ampStart, t0Start])
```

The function `curve_fit` does the very complicated non-linear least squares fit and returns two arrays: the best fit values of the parameters, and the *covariance* (think uncertainty) of those values. You capture those values in the variables `paramsFit` and `paramsCovariance`. You should print out both of those variables.

Note the backslash continues the code to the next line.
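Putting the steps so far together, a self-contained sketch of the whole fit might look like the following. It assumes the model function is filled in as $amp \sin(2 \pi (t - t_0))$ and repeats the data and initial guesses so it can run on its own.

```python
import numpy as np
from scipy.optimize import curve_fit

# Same simulated data as at the top of the notebook
np.random.seed(20150108)
time = np.linspace(0, 2, 101)
volts = 2.5 * np.sin(2 * np.pi * time - 1.0) + np.random.normal(0, 0.1, len(time))

def vModel(time, amp, t0):
    # Model: sine wave with amplitude amp, delayed by t0
    return amp * np.sin(2 * np.pi * (time - t0))

# Initial guesses: half the peak-to-peak range, and no delay
ampStart = (np.max(volts) - np.min(volts)) / 2.0
t0Start = 0.0

paramsFit, paramsCovariance = \
    curve_fit(vModel, time, volts, p0=[ampStart, t0Start])

print("Best fit values:", paramsFit)
print("Covariance matrix:")
print(paramsCovariance)
```

`paramsFit` has one entry per fit parameter, in the order they appear in `vModel`, and `paramsCovariance` is a square matrix with one row and column per parameter.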


The array `paramsFit` holds the best fit values of the parameters. I like to create new variables with these values for use later. The line

```
(ampFit, t0Fit) = paramsFit
```

creates two new variables in one compact line of code.

A second very important result is the **uncertainty** in your fit parameters. These are returned in the variable `paramsCovariance`. This is more complicated: since one variable can affect the other, the statistics are returned as a matrix. To look at the uncertainty of each variable by itself, you only need the diagonal elements, which are the variances; take their square roots to get the standard deviations.
You can calculate the uncertainties with the code

```
ampSD = np.sqrt(paramsCovariance[0,0])
t0SD = np.sqrt(paramsCovariance[1,1])
```
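The same calculation can be done for all parameters at once with `np.diag`. The sketch below uses a made-up covariance matrix (`exampleCov` is an illustrative stand-in, not output from this fit) and also shows how the off-diagonal element expresses how strongly the two parameters are correlated.

```python
import numpy as np

# Made-up 2x2 covariance matrix, standing in for paramsCovariance
exampleCov = np.array([[4.0e-4, 1.0e-5],
                       [1.0e-5, 2.5e-5]])

# Standard deviations of all parameters at once: square roots of the diagonal
exampleSD = np.sqrt(np.diag(exampleCov))

# Off-diagonal element divided by both standard deviations gives the correlation coefficient
exampleCorr = exampleCov[0, 1] / (exampleSD[0] * exampleSD[1])
print(exampleSD, exampleCorr)
```

A correlation coefficient near zero means the parameters are nearly independent; a value near 1 or -1 means a change in one can be largely compensated by a change in the other.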

I like to clearly print these out for the user. I will capture the output in a string and print the string because I am going to use the string later in a plot. Copy and paste this code:

```
resultStr = \
r"""The amp is %.3f $\pm$ %.3f
The t0 is %.3f $\pm$ %.3f""" % (ampFit, ampSD, t0Fit, t0SD)
print(resultStr)
```

**Copy and paste the above code lines into the cell below and run the cell.**


**Always plot your results!**

By now you should be able to plot the data and the model on the same figure. Here are some hints:

- The best fit values are in the variables `ampFit` and `t0Fit`, so you can use them to calculate your model line. I used the call

```
vModel(time, ampFit, t0Fit)
```

right in my call to `ax.plot`, like

```
ax.plot(time, vModel(time, ampFit, t0Fit), '-g', label="Fit")
```

- I also used the method `ax.text` to write the best values right on my plot. The actual call is

```
ax.text(0.1,2.8,resultStr)
```

This is how my final plot looked:

Make your final plot in the cell below.
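The hints above can be combined into one sketch like the following. It repeats the data, model, and fit so it runs on its own; the axis labels and the `"Data"` label are my own choices, and the starting amplitude of 2.5 is simply read off the first plot.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# Same simulated data as at the top of the notebook
np.random.seed(20150108)
time = np.linspace(0, 2, 101)
volts = 2.5 * np.sin(2 * np.pi * time - 1.0) + np.random.normal(0, 0.1, len(time))

def vModel(time, amp, t0):
    return amp * np.sin(2 * np.pi * (time - t0))

# Fit, then unpack best values and their standard deviations
paramsFit, paramsCovariance = curve_fit(vModel, time, volts, p0=[2.5, 0.0])
(ampFit, t0Fit) = paramsFit
ampSD, t0SD = np.sqrt(np.diag(paramsCovariance))
resultStr = r"""The amp is %.3f $\pm$ %.3f
The t0 is %.3f $\pm$ %.3f""" % (ampFit, ampSD, t0Fit, t0SD)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(time, volts, '.b', label="Data")                       # data as dots
ax.plot(time, vModel(time, ampFit, t0Fit), '-g', label="Fit")  # best-fit model as a line
ax.text(0.1, 2.8, resultStr)                                   # fit results at data coordinates (0.1, 2.8)
ax.set_xlabel("Time (s)")
ax.set_ylabel("Output (V)")
ax.legend()
```

The position passed to `ax.text` is in data coordinates, so it may need adjusting if your axis limits differ.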


Sometimes you only have a few data points you are fitting, but you want to draw a smooth model curve with your data. In the cell below I've created an example with just a few data points and fit a model to the data. You should plot the data and model. You will see the model is a set of straight line segments between data points that does not look very good.


**Now plot your data in the cell below.**


That doesn't look very good, does it?

The solution to making a better plot is to make an additional x-axis variable for plotting that has enough points to create a smooth model curve. Try making an array like this

```
tPlot = np.linspace(0.0, 2.0, 501)
```

then plot the fit function using this plotting axis.
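As a sketch of the idea, the example below builds its own small data set. The arrays `tSparse` and `vSparse`, the seed, and the values 2.5 and 0.15 are made up for illustration and are not the example data referred to above.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def vModel(time, amp, t0):
    return amp * np.sin(2 * np.pi * (time - t0))

# Made-up sparse data: only 9 points across two periods, with a little noise
np.random.seed(1)
tSparse = np.linspace(0.0, 2.0, 9)
vSparse = vModel(tSparse, 2.5, 0.15) + np.random.normal(0, 0.1, len(tSparse))

sparseFit, sparseCov = curve_fit(vModel, tSparse, vSparse, p0=[2.0, 0.0])

# Dense x-axis used only for drawing the model curve
tPlot = np.linspace(0.0, 2.0, 501)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(tSparse, vSparse, 'ob', label="Data")                # the few measured points
ax.plot(tPlot, vModel(tPlot, *sparseFit), '-g', label="Fit")  # smooth model curve
ax.set_xlabel("Time (s)")
ax.set_ylabel("Output (V)")
ax.legend()
```

The fit itself still uses only the sparse points; `tPlot` affects how the curve is drawn, not the fitted values.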

Your plot should look similar to this:


There are many other data fitting topics I won't cover. You can Google them to find out. Here are some:

- **Error Bars** Sometimes you want to plot error bars on your data
- Correlated fit parameters
