In the last exercise, you learned how to use matplotlib
to visualize data. In this exercise, we will put it all together:
We're going to use a new python package called xarray
. It is a high level interface for netCDF file reading and writing, that will make your life easier. Let's go!
%pylab inline
import xarray as xr
Did you get this this error?
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-2-c53dcc25b060> in <module>()
----> 1 import xarray
ImportError: No module named xarray
Phooey! What do we do now?
Well, we can install a new package using pip
, the python package manager. Many packages can be installed in this way.
Open a terminal and enter
pip install xarray --user --upgrade
When it is done, restart the kernel and try again.
import xarray as xr
Ahhhhhhhh....all better. Let's get to work!
It can't be much easier than this...xarray handles a lot of the dirty work for you. We can load both local files, as well as files on the internet. Either give the local file path, or the web site!
nc=xr.open_dataset('https://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/gistemp/combined/250km/air.2x2.250.mon.anom.comb.nc')
nc
ncvar = nc['air']
ncvar
This data is a gridded time series at 2 degree resolution of monthly surface temperature anomalies starting in 1880. Let's average over all space dimensions (lat - axis 1, and lon - axis 2). We can use np.mean
and its axis
keyword (very handy) for this purpose.
This will yield a time series of globally averaged temperature!
%pylab inline
plt.plot(ncvar.time,np.mean(ncvar,axis=(1,2)))
plt.xlabel('Year')
plt.ylabel('Temperature anomaly deg C')
plt.title("It's getting hot up in here!")
We can use the select tool to get a subset in a box (find closest index values of lon and lat) so that we can subset the data and grab the closest point to Champaign-Urbana. We can give it a list of points. Here we will give it one.
nc_cmi=nc.sel(lon=-88.9+360.,lat=40., method='nearest')
nc_cmi
import numpy as np
import matplotlib.pyplot as plt
var='air'
plt.plot(nc_cmi.time,nc_cmi[var])
plt.xlabel('Year')
plt.ylabel('Temperature anomaly deg C')
plt.title("It's not getting as hot in Champaign")
Want to save the file as a netCDF file? No problem!
nc_cmi.to_netcdf('nc_cmi.nc')
Let's say we want to average the monthly time series data into annually averaged data.
There are a number of ways to do this. xarray
offers time sampling capabilities, similar to pandas
. First though, let's do it the hard way.
nc_cmi
# How many years do we have?
ntimes=np.shape(nc_cmi['time'])
print(ntimes[0]/12.)
It looks like we don't have an evenly divisible number of months, so we don't have complete years. Let's start at the beginning and loop by 12.
nyears=np.floor(ntimes[0]/12.)
averages=np.zeros(nyears)
for i in np.arange(nyears):
averages[i]=np.mean(nc_cmi[var][i*12:(i+1)*12-1])
Save time series to a new file, just Champaign.
site='Champaign_annualavgs'
nc_cmi.to_netcdf(site+'_data.nc')
Read it back in to check!
nc_cmi2 = xr.open_dataset('Champaign_annualavgs_data.nc')
nc_cmi2
Now the easy way. We xarray
and pandas
share the same interface to resample and group time series conveniently. The documentation is available at: http://xarray.pydata.org/en/stable/time-series.html#resampling-and-grouped-operations. The codes for resampling are the same as pandas
. See http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
nc_cmi3=nc_cmi.resample('AS', dim='time', how='mean')
nc_cmi3
How easy is that? np.pi()? You can also resample with other time frequencies, or in space, or change how you do the calculation (i.e., calculate the median instead of the mean).
Now save to a file:
site='Champaign_annualavgs'
nc_cmi3.to_netcdf(site+'_data.nc')
Make a (nice) plot!
plt.figure(figsize=(11,8.5)) #create a new figure
plt.plot(nc_cmi['time'],np.squeeze(nc_cmi['air']),'b',alpha=0.5)
plt.plot(nc_cmi3['time'],np.squeeze(nc_cmi3['air']),'r',linewidth=2.0)
plt.legend(['Monthly averages','Annual averages'])
plt.xlabel('Year')
plt.ylabel('Temperature Anomaly (degrees C)')
plt.title('GISTEMP Temperature Anomalies near Champaign, IL')
plt.show()
basemap
.¶Python's basemap
is a package that is designed to enable matplotlib
handle geographic calculations and visualize maps with geographic boundaries. All we need is an array with a 2-D variable, and associated latitudes and longitudes, and off we go.
import pandas as pd
from mpl_toolkits.basemap import Basemap
#Select January 2017, make lats and lons 2D so basemap knows where to plot each point
jan2017 = nc.sel(time=pd.datetime(1976,9,1))
#Clean up data masking - this will maske where data is nan, and put the good values in there
data = np.ma.masked_where(np.isnan(jan2017['air'].values),jan2017['air'].values)
lons,lats=np.meshgrid(jan2017['lon'].values,jan2017['lat'].values)
# create figure, axes instances.
fig = plt.figure()
ax = fig.add_axes([0.05,0.05,0.9,0.9])
# create Basemap instance.
# coastlines not used, so resolution set to None to skip
# continent processing (this speeds things up a bit)
m = Basemap(projection='kav7',lon_0=0,resolution='l')
# draw line around map projection limb.
# color background of map projection region.
# missing values over land will show up this color.
m.drawmapboundary(fill_color='0.3')
# plot temperature with pcolor
im1 = m.pcolormesh(lons,lats,data,latlon=True,vmin=-10,vmax=10)
# draw parallels and meridians, but don't bother labelling them.
m.drawparallels(np.arange(-90.,99.,30.))
m.drawmeridians(np.arange(-180.,180.,60.))
#Add coastlines
m.drawcoastlines()
# add colorbar
cb = m.colorbar(im1,"bottom", size="5%", pad="2%")
cb.set_label('Temperature anomaly (deg C)')
# add a title.
ax.set_title('GISS Temperature Anomalies for Jan 2017')
plt.show()
Zoom in, use a different map projection.
from mpl_toolkits.basemap import Basemap
import numpy as np
import matplotlib.pyplot as plt
# create new figure, axes instances.
fig=plt.figure()
ax=fig.add_axes([0.1,0.1,0.8,0.8])
# setup mercator map projection.
m = Basemap(llcrnrlon=-130.,llcrnrlat=20.,urcrnrlon=-55.,urcrnrlat=60.,\
rsphere=(6378137.00,6356752.3142),\
resolution='l',projection='merc',\
lat_0=40.,lon_0=-20.,lat_ts=20.)
im1 = m.pcolormesh(lons,lats,data,latlon=True,vmin=-10,vmax=10)
m.drawcoastlines()
m.drawcountries(color='k',linewidth=2,linestyle='--')
# draw parallels
m.drawparallels(np.arange(10,90,20),labels=[1,1,0,1])
# draw meridians
m.drawmeridians(np.arange(-180,180,30),labels=[1,1,0,1])
cb = m.colorbar(im1,"bottom", size="5%", pad="10%")
cb.set_label('Temperature anomaly (deg C)')
ax.set_title('GISS Temperature Anomalies for Jan 2017')
plt.show()