CoCalc -- slr_attempt

Jupyter notebook 341fa16/slr_intro_descriptive_stats/slr_attempt_import.ipynb

Project: Joanna Funkhouser - 341fa16

In [4]:

import csv

In [5]:

with open("slr_sla_gbl_free_txj1j2_90.csv") as f:  # the file is closed automatically
    csv_f = csv.reader(f)
    #for row in csv_f:
    #    print(row)

https://newcircle.com/s/post/1572/python_for_beginners_reading_and_manipulating_csv_files#opening-a-csv-file

In [6]:

#data = [[row[0], eval(row[12]), eval(row[26])] for row in csv_f]
#print (data)

In [2]:

import pandas as pd
import numpy as np

In [2]:

#df = pd.read_csv("slr_sla_gbl_free_txj1j2_90.csv", error_bad_lines=False)
#df

In [3]:

data = pd.read_csv('slr_sla_gbl_free_txj1j2_90.csv', skiprows=5, index_col = 0)
#data

http://stackoverflow.com/questions/18039057/python-pandas-error-tokenizing-data

http://chrisalbon.com/python/pandas_dataframe_importing_csv.html

In [16]:

data.loc[1993.0123:2015.8995, ['TOPEX/Poseidon']]  # .ix is deprecated; .loc slices by label

In [14]:

data['TOPEX/Poseidon'].median()

-1.54

In [15]:

data['TOPEX/Poseidon'].mean()

-1.281545454545453

http://stackoverflow.com/questions/29778636/median-of-pandas-dataframe

In [16]:

data['TOPEX/Poseidon'].std()

11.788058796093676

In [24]:

data['TOPEX/Poseidon'].max() #6

24.120000000000001

In [25]:

data['TOPEX/Poseidon'].min() #Level started negative and has risen slowly over the years

-26.77

In [15]:

data['TOPEX/Poseidon'].count() #4

440

In [17]:

data.loc[945:947, ['Jason-1']]  # .ix is deprecated; .loc includes both endpoints

	Jason-1
945	39.96
946	NaN
947	43.47

In [4]:

data['Jason-1'].dropna()

In [18]:

data['Jason-1'].median()

21.3

In [19]:

data['Jason-1'].mean()

21.87150485436892

In [20]:

data['Jason-1'].std()

9.17064857558141

In [26]:

data['Jason-1'].max() #6

43.840000000000003

In [27]:

data['Jason-1'].min() #slow increase in values

-3.3700000000000001

In [14]:

data['Jason-1'].count() #4

412

In [21]:

data['Jason-2'].median()

34.32

In [22]:

data['Jason-2'].mean()

34.92374531835205

In [23]:

data['Jason-2'].std()

8.815705537014296

In [28]:

data['Jason-2'].max() #6

55.890000000000001

In [29]:

data['Jason-2'].min() #Started around 20, slowly over years increased to 50s

19.969999999999999

In [13]:

data['Jason-2'].count() #4

267

In [5]:

data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1062 entries, 0 to 1061
Data columns (total 4 columns):
year              1062 non-null float64
TOPEX/Poseidon    440 non-null float64
Jason-1           412 non-null float64
Jason-2           267 non-null float64
dtypes: float64(4)
memory usage: 33.3 KB

1) Purpose of the measurements, and why millions were spent on them:

TOPEX/Poseidon:

Mapping ocean surface topography and monitoring sea level.

"TOPEX/Poseidon's radar altimeter provided the first continuous global coverage of the surface topography of the oceans."

"TOPEX/Poseidon provided measurements of the surface height of 95 percent of the ice-free ocean to an accuracy of 3.3 centimeters."

These measurements also improve our understanding of ocean circulation patterns and climate.

Jason-1:

Jason-1 continues the ocean topography research begun by TOPEX/Poseidon, and so does Jason-2.

Methods

Altimeter= "An altimeter or an altitude meter is an instrument used to measure the altitude of an object above a fixed level."

https://en.wikipedia.org/wiki/Altimeter

The satellites' altimeters use radar to measure the distance to the sea surface.
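As a rough illustration of how radar gives a distance, the sketch below halves a pulse's round-trip travel time and multiplies by the speed of light. The round-trip time is a made-up value for illustration only, not mission data, and `radar_range_m` is a hypothetical helper name.

```python
# Sketch: range from a radar pulse's round-trip travel time.
SPEED_OF_LIGHT_M_PER_S = 299792458  # exact by definition

def radar_range_m(round_trip_s):
    """Distance to the reflecting surface; the pulse travels there and back."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2

round_trip_s = 0.0089  # hypothetical value, for illustration only
range_m = radar_range_m(round_trip_s)
print(round(range_m / 1000, 1), "km")
```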

In [3]:

data.describe()

/projects/anaconda3/lib/python3.5/site-packages/numpy/lib/function_base.py:3834: RuntimeWarning: Invalid value encountered in percentile
  RuntimeWarning)

	TOPEX/Poseidon	Jason-1	Jason-2
count	440.000000	412.000000	267.000000
mean	-1.281545	21.871505	34.923745
std	11.788059	9.170649	8.815706
min	-26.770000	-3.370000	19.970000
25%	NaN	NaN	NaN
50%	NaN	NaN	NaN
75%	NaN	NaN	NaN
max	24.120000	43.840000	55.890000
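The NaN quartiles above most likely come from the missing values in each column, since the RuntimeWarning points at the percentile step. One possible workaround, sketched here on a small invented DataFrame rather than the real file, is to drop the NaNs per column before asking for quantiles. The name `toy` and its values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the satellite table; values are invented.
toy = pd.DataFrame({
    "A": [1.0, 2.0, np.nan, 4.0, 5.0],
    "B": [np.nan, 10.0, 20.0, np.nan, 30.0],
})

# Drop the missing entries column by column, then compute the quartiles.
quartiles = {
    col: toy[col].dropna().quantile([0.25, 0.5, 0.75]).tolist()
    for col in toy.columns
}
print(quartiles)
```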

In [8]:

jason1 = data['Jason-1'].dropna()
#jason1

In [38]:

j1meansum = 0.0;
j1count = 0;

In [31]:

print (j1meansum)

0.0

In [41]:

j1meansum = 0.0;
j1count = 0;
for x in jason1:
    j1meansum += x
    j1count +=1
print(j1meansum)
print(j1count)

9011.06
412

In [42]:

j1mean = j1meansum / j1count
print (j1mean)

21.8715048544

In [45]:

j1sum = 0.0;
for x in jason1:
    j1sum = float(j1sum) + float((x-j1mean)**2)
print (j1sum)

34565.42686699028

In [50]:

j1std = float(np.sqrt(j1sum/j1count))  # population std: divides by N, whereas pandas .std() divides by N-1
j1std

9.159512386196605
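The hand-computed value (about 9.1595) differs slightly from the pandas result earlier (about 9.1706). A likely explanation is the divisor: the loop divides the sum of squares by N, while pandas `.std()` defaults to N-1 (`ddof=1`). The sketch below redoes both divisions from the sum and count printed above.

```python
import math

sum_sq = 34565.42686699028  # sum of squared deviations printed above
n = 412                     # number of non-missing Jason-1 values

population_std = math.sqrt(sum_sq / n)    # divide by N
sample_std = math.sqrt(sum_sq / (n - 1))  # divide by N-1, the pandas default
print(population_std, sample_std)
```

Either convention can be defended; what matters is knowing which one a given tool uses before comparing numbers.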

In [4]:

jason1 = data['Jason-1'].dropna()
maxlist = float('-inf')  # start below any possible value
minlist = float('inf')   # start above any possible value
for x in jason1:
    if x > maxlist:
        maxlist = x
    if x < minlist:
        minlist = x

print(maxlist)
print(minlist)

43.84
-3.37

In [21]:

orderlist = []
for x in jason1:
    if (orderlist == []):
        orderlist.append(x)
    else:
        for n in orderlist:
            if (x >= orderlist[n]):
                orderlist.insert(n, x)
print (orderlist)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-21-db5cf303f9c7> in <module>()
      5     else:
      6         for n in orderlist:
----> 7             if (x >= orderlist[n]):
      8                 orderlist.insert(n, x)
      9 print (orderlist)
TypeError: list indices must be integers or slices, not numpy.float64
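The TypeError arises because `for n in orderlist` yields the stored values (floats), not positions, so `orderlist[n]` indexes with a float. One possible fix, sketched on a short made-up list, is to walk the positions with `enumerate` and insert before the first larger value. `insert_sorted` and `sample` are names introduced here for illustration.

```python
def insert_sorted(ordered, x):
    """Insert x into an ascending list, keeping it in ascending order."""
    for i, value in enumerate(ordered):
        if x < value:
            ordered.insert(i, x)
            return  # stop after one insertion so x is not added twice
    ordered.append(x)  # x is the largest so far (or the list is empty)

sample = [21.3, -3.37, 43.84, 5.05, 21.3]  # invented values
orderlist = []
for x in sample:
    insert_sorted(orderlist, x)
print(orderlist)
```

Returning right after the insertion also avoids modifying the list while the loop keeps iterating over it.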

In [15]:

data['Jason-1'][2015.8995]

nan

In [18]:

jason1.head()

year
3300   -3.37
0537    2.42
4340    3.85
4611    5.05
1898    5.32
Name: Jason-1, dtype: float64

orderlist = jason1.sort_values()
print(orderlist)
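With the values in order, the median can be picked out by position: the middle value for an odd count, the average of the two middle values for an even count. A small sketch on invented numbers follows; `median_of_sorted` is a hypothetical helper.

```python
def median_of_sorted(ordered):
    """Median of a list that is already sorted in ascending order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle pair

odd_median = median_of_sorted([-3.37, 2.42, 3.85])
even_median = median_of_sorted([1.0, 2.0, 4.0, 8.0])
print(odd_median, even_median)
```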

In [0]:

orderlist = d

http://anh.cs.luc.edu/python/hands-on/3.1/handsonHtml/ifstatements.html

In [83]:

def Jomeanstd(name):
    "Return the mean and population standard deviation of column 'name'."

    numbers = data[name].dropna()  # drop the NaN entries
    meansum = 0.0
    countlist = 0
    sumlist = 0.0
    for x in numbers:  # count the values and add them up
        meansum += x
        countlist += 1
    meanlist = meansum/countlist  # mean of the column
    for x in numbers:  # sum of squared deviations (x_i - mean)**2
        sumlist = float(sumlist) + float((x-meanlist)**2)
    stdlist = float(np.sqrt(sumlist/countlist))  # divides by N, so slightly below pandas .std() (N-1)
    print (name, " mean:", meanlist)
    print (name, " std:", stdlist)
    return (meanlist, stdlist)

In [86]:

ToPomean, ToPostd = Jomeanstd('TOPEX/Poseidon')
J1mean, J1std = Jomeanstd('Jason-1')
J2mean, J2std = Jomeanstd('Jason-2')

TOPEX/Poseidon  mean: -1.28154545455
TOPEX/Poseidon  std: 11.774655654981524
Jason-1  mean: 21.8715048544
Jason-1  std: 9.159512386196605
Jason-2  mean: 34.9237453184
Jason-2  std: 8.799181238443293

In [87]:

J1mean*2

43.743009708737837

In [0]:

https://explorable.com/calculate-standard-deviation

https://wiki.python.org/moin/ForLoop

http://anh.cs.luc.edu/python/hands-on/3.1/handsonHtml/functions.html

Units are mm and years?

Data is arranged in a csv, with commas being the delimiter separating the values. There is also a rather large header.

To put the data in a clear, concise form, load it into a table, probably using pandas.
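A minimal sketch of that idea, assuming a file with a few free-text header lines ahead of the column names. The contents below are invented and far shorter than the real file (which needed `skiprows=5` earlier in this notebook); `raw` and `table` are illustrative names.

```python
import io
import pandas as pd

# Invented file contents: two header lines, then the column names and data.
raw = """HDR made-up header line 1
HDR made-up header line 2
year,Sat-A,Sat-B
1993.01,-1.5,
1993.04,0.2,
2002.10,,3.1
"""

# skiprows drops the leading header lines so the column names are read correctly.
table = pd.read_csv(io.StringIO(raw), skiprows=2)
print(table)
print(table.count())  # non-missing values per column
```

Empty fields become NaN, which is why each satellite column has a different count.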

What is the standard?

In [0]: