CoCalc Public FilesPublic / doughnut-data.sagewsOpen with one click!
Author: Hal Snyder
Views : 234
Description: Doughnut Economics exercise for US Donut Day
Compute Environment: Ubuntu 18.04 (Deprecated)

Data Analysis for Doughnut Economics

Celebrating the Doughnut. This worksheet is a calculation on U.S. National Doughnut Day, June 1, 2018, inspired by Kate Raworth's Doughnut Economics

We download tables of Gross Domestic Product (GDP) per capita and Human Development Index (HDI) for the countries of the world and observe the discrepancy in the rankings.

There are more advanced visualizations Human Development Index (HDI) and related information. For example, see Our World in Data HDI.

The purpose of this worksheet is to demonstrate basic access to the wealth of online data using open-source programming tools (Python, pandas), and to get a bit of a start at coding to explore the ideas of doughnut economics.

Disclaimer: I am not an economist. (In case it wasn't obvious.)

Feedback to Hal Snyder, [email protected]

Links

%auto %default_mode python3
import sys print(sys.version)
3.5.5 | packaged by conda-forge | (default, Feb 13 2018, 05:02:37) [GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]
import pandas as pd import requests
# What version of pandas are we working with? print(pd.__version__)
0.22.0
# http://docs.python-requests.org/en/master/user/quickstart/ # Use python "requests" module to download the first spreadsheet. data_url = "http://hdr.undp.org/sites/default/files/tables_1-15_6a_dashboard1_dashboard2_online_version.xlsx" hdi_file = "hddata.xlsx" r = requests.get(data_url, stream=True) with open(hdi_file, 'wb') as fd: for chunk in r.iter_content(chunk_size=128): fd.write(chunk)
# https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html # Fetch Excel file with lots of HDI data. # We only need the spreadsheet labeled "Table 2". x = pd.read_excel(hdi_file, sheet_name="Table 2", index_col=None, usecols=[1,16], header=None, names=["country", "HDI2015"], skiprows=6, skipfooter=240-197) x.shape
(191, 2)
# Look at first 3 rows to see what we have. x.head(3)
country HDI2015
0 Norway 0.949423
1 Australia 0.938680
2 Switzerland 0.939131
# get per capita GDP for 2015, the most recent year with HDI listed per Table 2 above # these numbers are at current prices - US dollars # http://data.un.org/Data.aspx?q=GDP+per+capita&d=SNAAMA&f=grID%3a101%3bcurrID%3aUSD%3bpcFlag%3a1 # export to .csv and upload here to "gdppc2015.csv" gdp_file = "gdppc2015.csv" y = pd.read_csv(gdp_file, usecols=[0,3], names=["country","GDPPC2015"], header=0) y.shape
(212, 2)
# Have a look at the first 3 rows. y.head(3)
country GDPPC2015
0 Afghanistan 610.854517
1 Albania 3877.487975
2 Algeria 4160.218690
# https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html # Do inner join of the two tables on country name. # This will drop countries that appear in only one of the tables. z = x.merge(y, left_on='country', right_on='country').set_index("country") z.shape
(179, 2)
# http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.plot.scatter.html # Scatter plot shows that HDI and GDP per capita are correlated, but the relationship # is only vaguely monotonic, suggesting that high HDI is possible without commensurate # per-capita GDP. z.plot.scatter("HDI2015", "GDPPC2015", s=3);
# http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rank.html # Create a new dataframe of ranks for each column, # i.e. in column "HDI_rank", a value of 1.0 indicates the country with largest HDI # 2.0 indicates the country with second-largest HDI, etc. zr = z.rank(ascending=False) zr.columns = ["HDI_rank","GDP_rank"] zr.head(3)
HDI_rank GDP_rank
country
Norway 1.0 4.0
Australia 3.0 10.0
Switzerland 2.0 3.0
# Merge dataframe of values with dataframe of ranks. zranked = z.merge(zr, left_index=True, right_index=True, sort=True) zranked.head(3)
HDI2015 GDPPC2015 HDI_rank GDP_rank
country
Afghanistan 0.479375 610.854517 161.0 167.0
Albania 0.764174 3877.487975 71.0 107.0
Algeria 0.744806 4160.218690 79.0 97.0
# Sort by Human Development Index from greatest to least # and display the top 15. # Observe that GDP per capita rank is quite different from HDI rank # for example, I live in the US, which ranks #7 in GDP per capita, # but only #11 in HDI. z_by_hdi = zranked.sort_values(["HDI2015"], ascending=False) z_by_hdi.head(15)
HDI2015 GDPPC2015 HDI_rank GDP_rank
country
Norway 0.949423 74360.641259 1.0 4.0
Switzerland 0.939131 81647.599454 2.0 3.0
Australia 0.938680 52237.928546 3.0 10.0
Germany 0.925669 41313.210675 4.0 17.0
Singapore 0.924866 53626.243222 5.0 8.0
Denmark 0.924649 52964.425912 6.0 9.0
Netherlands 0.924312 44750.095821 7.0 13.0
Ireland 0.922746 61831.997993 8.0 6.0
Iceland 0.921110 51241.827451 9.0 11.0
Canada 0.920284 43193.886549 10.0 15.0
United States 0.919553 56639.769525 11.0 7.0
New Zealand 0.914900 38491.649589 12.0 20.0
Sweden 0.912706 50997.572024 13.0 12.0
Liechtenstein 0.911517 168177.258828 14.0 1.0
Japan 0.903462 34224.423817 15.0 24.0

End of worksheet