Project 3: GDP and life expectancy
produced by Jez Phipps on 30th October 2016.
This is the project notebook for Week 3 of The Open University's Learn to code for Data Analysis course.
Richer countries can afford to invest more in healthcare, in work and road safety, and in other measures that reduce mortality. On the other hand, richer countries may have less healthy lifestyles. Is there any relation between the wealth of a country and the life expectancy of its inhabitants?
The following analysis checks whether there is any correlation between the total gross domestic product (GDP) of a country in 2013 and the life expectancy of people born in that country in 2013.
The project has also been extended to answer some further questions, including whether GDP per capita has a greater bearing on life expectancy than total GDP.
Getting the data
Two datasets of the World Bank are considered. One dataset, available at http://data.worldbank.org/indicator/NY.GDP.MKTP.CD, lists the GDP of the world's countries in current US dollars, for various years. The use of a common currency allows us to compare GDP values across countries. The other dataset, available at http://data.worldbank.org/indicator/SP.DYN.LE00.IN, lists the life expectancy of the world's countries. The datasets were downloaded as CSV files in March 2016.
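The loading step is not shown in this text. As a minimal sketch, assuming the downloaded CSV is in the long format of the table below (one row per country and year), the data could be read with pandas as follows. The in-memory `sample_csv` is a stand-in for the real file, and the variable names are illustrative assumptions.

```python
from io import StringIO

import pandas as pd

LIFE_INDICATOR = 'SP.DYN.LE00.IN'  # World Bank code for life expectancy at birth

# Stand-in for the downloaded file, holding the first rows of the table below.
# With a real download, pass the file path to read_csv instead (path not shown here).
sample_csv = StringIO(
    'country,year,SP.DYN.LE00.IN\n'
    'Arab World,2013,70.631305\n'
    'Caribbean small states,2013,71.901964\n'
    'Central Europe and the Baltics,2013,76.127583\n'
)

life_raw = pd.read_csv(sample_csv)
print(life_raw.head())
```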
| | country | year | SP.DYN.LE00.IN |
---|---|---|---|
0 | Arab World | 2013 | 70.631305 |
1 | Caribbean small states | 2013 | 71.901964 |
2 | Central Europe and the Baltics | 2013 | 76.127583 |
3 | East Asia & Pacific (all income levels) | 2013 | 74.604619 |
4 | East Asia & Pacific (developing only) | 2013 | 73.657617 |
Cleaning the data
Inspecting the data with head() and tail() shows that:
- the first 34 rows are aggregated data, for the Arab World, the Caribbean small states, and other country groups used by the World Bank;
- GDP and life expectancy values are missing for some countries.
The data is therefore cleaned by:
- removing the first 34 rows;
- removing rows with unavailable values.
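The two cleaning steps can be sketched as a positional slice followed by `dropna()`. This is an illustration on a toy table, not the notebook's own code: the helper `clean` and its parameter are hypothetical, the toy frame has only 2 aggregate rows (the real data has 34), and the country values are the rounded figures that appear later in this notebook.

```python
import numpy as np
import pandas as pd

LIFE_INDICATOR = 'SP.DYN.LE00.IN'

def clean(df, n_aggregate_rows):
    """Drop the leading aggregate rows, then any row with a missing value."""
    return df.iloc[n_aggregate_rows:].dropna()

# Toy table: two aggregate rows first, then countries, two of them without a value.
toy = pd.DataFrame({
    'country': ['Arab World', 'Caribbean small states', 'Afghanistan', 'Albania',
                'Algeria', 'American Samoa', 'Andorra', 'Angola'],
    LIFE_INDICATOR: [70.631305, 71.901964, 60.0, 78.0, 75.0, np.nan, np.nan, 52.0],
})

life_clean = clean(toy, n_aggregate_rows=2)  # use 34 for the full World Bank table
print(life_clean)
```

Because the original index is kept, gaps in the row numbers of the cleaned table show where rows were removed.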
Transforming the data
The World Bank reports GDP in US dollars and cents. To make the data easier to read, the GDP is converted to millions of British pounds (the author's local currency) with the following auxiliary functions, using the average 2013 dollar-to-pound conversion rate provided by http://www.ukforex.co.uk/forex-tools/historical-rate-tools/yearly-average-rates.
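The auxiliary functions themselves are not reproduced in this text. A minimal sketch of what they might look like is given here. The function names are hypothetical, and the rate of 1.564768 dollars per pound is an assumption chosen because it is consistent with the values in the tables of this notebook; it is not quoted from the source.

```python
import pandas as pd

GDP_INDICATOR = 'NY.GDP.MKTP.CD'  # World Bank code for GDP in current US dollars
GDP = 'GDP (£m)'
USD_PER_GBP = 1.564768  # assumed average 2013 rate, inferred from the tables

def usd_to_gbp(usd):
    """Convert an amount in US dollars to British pounds."""
    return usd / USD_PER_GBP

def round_to_millions(value):
    """Express an amount in whole millions."""
    return round(value / 1_000_000)

# First rows of the cleaned GDP table, as shown below.
gdp = pd.DataFrame({
    'country': ['Afghanistan', 'Albania', 'Algeria'],
    GDP_INDICATOR: [2.045894e10, 1.278103e10, 2.097035e11],
})
gdp[GDP] = gdp[GDP_INDICATOR].apply(usd_to_gbp).apply(round_to_millions)
print(gdp)
```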
| | country | year | NY.GDP.MKTP.CD | GDP (£m) |
---|---|---|---|---|
34 | Afghanistan | 2013 | 2.045894e+10 | 13075 |
35 | Albania | 2013 | 1.278103e+10 | 8168 |
36 | Algeria | 2013 | 2.097035e+11 | 134016 |
38 | Andorra | 2013 | 3.249101e+09 | 2076 |
39 | Angola | 2013 | 1.383568e+11 | 88420 |
The unnecessary columns can be dropped.
| | country | GDP (£m) |
---|---|---|
34 | Afghanistan | 13075 |
35 | Albania | 8168 |
36 | Algeria | 134016 |
38 | Andorra | 2076 |
39 | Angola | 88420 |
The World Bank reports the life expectancy with several decimal places. After rounding, the original column is discarded.
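Rounding and then keeping only the needed columns might be done as sketched here. The only unrounded life expectancy values shown in this notebook are those of the aggregate rows in the first table, so they are used purely for illustration; the column and variable names are assumptions.

```python
import pandas as pd

LIFE_INDICATOR = 'SP.DYN.LE00.IN'
LIFE = 'Life expectancy (years)'

# Unrounded values taken from the first table (aggregate rows, illustration only).
life = pd.DataFrame({
    'country': ['Arab World', 'Caribbean small states',
                'Central Europe and the Baltics'],
    'year': [2013, 2013, 2013],
    LIFE_INDICATOR: [70.631305, 71.901964, 76.127583],
})

# Round to whole years, then keep only the country and the rounded column.
life[LIFE] = life[LIFE_INDICATOR].round().astype(int)
life = life[['country', LIFE]]
print(life)
```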
| | country | Life expectancy (years) |
---|---|---|
34 | Afghanistan | 60 |
35 | Albania | 78 |
36 | Algeria | 75 |
39 | Angola | 52 |
40 | Antigua and Barbuda | 76 |
Combining the data
The tables are combined through an inner join on the common 'country' column.
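As an illustration of the inner join, the sketch below merges the first five rows of each cleaned table shown above. Only countries present in both tables survive, so Andorra (no life expectancy) and Antigua and Barbuda (not in this GDP sample) are dropped. The variable names are assumptions.

```python
import pandas as pd

GDP = 'GDP (£m)'
LIFE = 'Life expectancy (years)'

gdp_clean = pd.DataFrame({
    'country': ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola'],
    GDP: [13075, 8168, 134016, 2076, 88420],
})
life_clean = pd.DataFrame({
    'country': ['Afghanistan', 'Albania', 'Algeria', 'Angola', 'Antigua and Barbuda'],
    LIFE: [60, 78, 75, 52, 76],
})

# An inner join keeps only the countries that appear in both tables.
gdp_vs_life = pd.merge(gdp_clean, life_clean, on='country', how='inner')
print(gdp_vs_life)
```

The merged table gets a fresh index starting at 0, which is why the row numbers in the combined table differ from those in the two source tables.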
| | country | GDP (£m) | Life expectancy (years) |
---|---|---|---|
0 | Afghanistan | 13075 | 60 |
1 | Albania | 8168 | 78 |
2 | Algeria | 134016 | 75 |
3 | Angola | 88420 | 52 |
4 | Antigua and Barbuda | 767 | 76 |
Calculating the correlation
To measure whether life expectancy and GDP grow together, the Spearman rank correlation coefficient is used. It is a number from -1 (perfect inverse rank correlation: if one indicator increases, the other decreases) to 1 (perfect direct rank correlation: if one indicator increases, so does the other), with 0 meaning there is no rank correlation. Even a perfect correlation doesn't imply any cause-effect relation between the two indicators. A p-value below 0.05 means the correlation is statistically significant.
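One way to compute the coefficient and its p-value is `scipy.stats.spearmanr`, sketched here on the five rows of the combined table shown above. This is only an illustration of the call: five rows are far too few to say anything about the full dataset, and the numbers it prints are not the notebook's result.

```python
from scipy.stats import spearmanr

# First five rows of the combined table (GDP in £m, life expectancy in years).
gdp = [13075, 8168, 134016, 88420, 767]
life = [60, 78, 75, 52, 76]

# spearmanr ranks both lists and correlates the ranks.
correlation, p_value = spearmanr(gdp, life)
print('The correlation is', correlation)
if p_value < 0.05:
    print('It is statistically significant.')
else:
    print('It is not statistically significant.')
```

On these five rows the coefficient is -0.6 and the p-value is well above 0.05, which shows how unreliable a rank correlation is on a handful of points.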
The value shows a direct correlation, i.e. richer countries tend to have longer life expectancy, but it is not very strong.
Showing the data
Measures of correlation can be misleading, so it is best to see the overall picture with a scatterplot. The GDP axis uses a logarithmic scale to better display the vast range of GDP values, from around a hundred million to several billion (million million) pounds.
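A scatterplot with a logarithmic GDP axis can be drawn directly from a dataframe with `logx=True`. The sketch below uses a few rows from the tables in this notebook rather than the full dataset; the `Agg` backend line is an assumption that lets the code run without a display and is not needed inside a notebook.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; unnecessary inside a notebook
import pandas as pd

GDP = 'GDP (£m)'
LIFE = 'Life expectancy (years)'

# A few rows taken from the tables in this notebook.
gdp_vs_life = pd.DataFrame({
    'country': ['Kiribati', 'Lesotho', 'Angola', 'Nigeria', 'Japan', 'United States'],
    GDP: [108, 1418, 88420, 329100, 3143957, 10715999],
    LIFE: [66, 49, 52, 52, 83, 79],
})

# logx=True puts the GDP axis on a logarithmic scale.
ax = gdp_vs_life.plot(x=GDP, y=LIFE, kind='scatter', grid=True,
                      logx=True, figsize=(10, 4))
```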
The plot shows there is no clear correlation: there are rich countries with low life expectancy, poor countries with high life expectancy, and countries with a GDP of around 10 thousand (10^4) million pounds span almost the full range of values, from below 50 to over 80 years. Towards the lower and higher ends of the GDP range, the variation diminishes. Above 40 thousand million pounds of GDP (3rd tick mark to the right of 10^4), most countries have a life expectancy of 70 years or more, whilst below that threshold most countries' life expectancy is below 70 years.
Comparing the 10 poorest countries and the 10 countries with the lowest life expectancy shows that total GDP is a rather crude measure. The population size should be taken into account for a more precise definition of what 'poor' and 'rich' mean. Furthermore, looking at the countries below, droughts and internal conflicts may also play a role in life expectancy.
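Rankings like these can be produced by sorting on a column and taking the first rows. The sketch below does this on a small sample drawn from the tables in this notebook and takes the top 2 instead of the top 10; the variable names are assumptions.

```python
import pandas as pd

GDP = 'GDP (£m)'
LIFE = 'Life expectancy (years)'

# Sample rows taken from the tables in this notebook.
gdp_vs_life = pd.DataFrame({
    'country': ['Kiribati', 'Lesotho', 'United States', 'Japan',
                'Swaziland', 'Sao Tome and Principe'],
    GDP: [108, 1418, 10715999, 3143957, 2916, 195],
    LIFE: [66, 49, 79, 83, 49, 66],
})

poorest = gdp_vs_life.sort_values(GDP).head(2)                   # lowest GDP first
richest = gdp_vs_life.sort_values(GDP, ascending=False).head(2)  # highest GDP first
lowest_life = gdp_vs_life.sort_values(LIFE).head(2)              # shortest lives first
print(poorest, richest, lowest_life, sep='\n\n')
```

In this sample, as in the full tables, the poorest countries and the countries with the lowest life expectancy do not overlap.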
| | country | GDP (£m) | Life expectancy (years) |
---|---|---|---|
87 | Kiribati | 108 | 66 |
141 | Sao Tome and Principe | 195 | 66 |
111 | Micronesia, Fed. Sts. | 202 | 69 |
168 | Tonga | 277 | 73 |
37 | Comoros | 383 | 63 |
157 | St. Vincent and the Grenadines | 461 | 73 |
140 | Samoa | 509 | 73 |
180 | Vanuatu | 512 | 72 |
65 | Grenada | 538 | 73 |
60 | Gambia, The | 578 | 60 |
| | country | GDP (£m) | Life expectancy (years) |
---|---|---|---|
177 | United States | 10715999 | 79 |
35 | China | 6065182 | 75 |
83 | Japan | 3143957 | 83 |
62 | Germany | 2393529 | 81 |
58 | France | 1795953 | 82 |
176 | United Kingdom | 1733354 | 81 |
23 | Brazil | 1528714 | 74 |
81 | Italy | 1363486 | 82 |
138 | Russian Federation | 1328647 | 71 |
75 | India | 1189826 | 68 |
| | country | GDP (£m) | Life expectancy (years) |
---|---|---|---|
95 | Lesotho | 1418 | 49 |
160 | Swaziland | 2916 | 49 |
32 | Central African Republic | 983 | 50 |
146 | Sierra Leone | 3092 | 50 |
33 | Chad | 8276 | 51 |
41 | Cote d'Ivoire | 19998 | 51 |
3 | Angola | 88420 | 52 |
124 | Nigeria | 329100 | 52 |
30 | Cameroon | 18896 | 55 |
153 | South Sudan | 8473 | 55 |
| | country | GDP (£m) | Life expectancy (years) |
---|---|---|---|
0 | Japan | 3143957 | 83 |
1 | France | 1795953 | 82 |
Which are the two countries in the right half of the plot (higher GDP) with life expectancy below 60 years? What factors could explain their lower life expectancy compared to countries with similar GDP? Hint: use the filtering techniques you learned in Week 2 to find the two countries.
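A filter of this kind combines two boolean conditions with `&`. The sketch below applies it to a handful of rows from the tables in this notebook; the threshold of 10^5 million pounds for the right half of the plot and the variable names are assumptions made for illustration.

```python
import pandas as pd

GDP = 'GDP (£m)'
LIFE = 'Life expectancy (years)'

# Sample rows taken from the tables in this notebook.
gdp_vs_life = pd.DataFrame({
    'country': ['Angola', 'Japan', 'Lesotho', 'Nigeria', 'South Africa'],
    GDP: [88420, 3143957, 1418, 329100, 234056],
    LIFE: [52, 83, 49, 52, 57],
})

# Each comparison yields a boolean column; & keeps rows where both are True.
high_gdp = gdp_vs_life[GDP] > 10**5
short_life = gdp_vs_life[LIFE] < 60
print(gdp_vs_life[high_gdp & short_life])
```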
| | country | GDP (£m) | Life expectancy (years) |
---|---|---|---|
124 | Nigeria | 329100 | 52 |
152 | South Africa | 234056 | 57 |
The two countries with higher GDP (i.e. above £10^5 million) but a life expectancy below 60 years are Nigeria and South Africa.
Redo the analysis using the countries’ GDP per capita (i.e. per inhabitant) instead of their total GDP. If you’ve done the workbook exercises, you already have a column with the population data. Hint: write an expression involving the GDP and population columns, as you learned in Calculating over columns in Week 1. Think about the units in which you display GDP per capita.
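GDP per capita can be computed by joining the GDP and population tables and dividing one column by the other. The sketch below does this for three countries, using the unrounded GDP in pounds (not millions) so that the result is in pounds per inhabitant. The input figures are the ones displayed in the tables below; the variable names are assumptions.

```python
import pandas as pd

GDP_POUNDS = 'GDP (£)'
POPULATION = 'SP.POP.TOTL'  # World Bank code for total population
GDP_PER_CAPITA = 'GDP per capita (£)'

gdp_pounds = pd.DataFrame({
    'country': ['Afghanistan', 'Albania', 'Algeria'],
    GDP_POUNDS: [1.307474e10, 8.168003e9, 1.340157e11],
})
population = pd.DataFrame({
    'country': ['Afghanistan', 'Albania', 'Algeria'],
    POPULATION: [30682500, 2897366, 38186135],
})

# Join on country, then divide column by column and round to pence.
per_capita = pd.merge(gdp_pounds, population, on='country', how='inner')
per_capita[GDP_PER_CAPITA] = (per_capita[GDP_POUNDS] / per_capita[POPULATION]).round(2)
print(per_capita[['country', GDP_PER_CAPITA]])
```

Because the inputs here are the displayed, already truncated figures, the last digit can occasionally differ from a calculation on the full-precision data.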
| | country | GDP (£) |
---|---|---|
34 | Afghanistan | 1.307474e+10 |
35 | Albania | 8.168003e+09 |
36 | Algeria | 1.340157e+11 |
38 | Andorra | 2.076410e+09 |
39 | Angola | 8.842001e+10 |
| | country | SP.POP.TOTL |
---|---|---|
34 | Afghanistan | 30682500 |
35 | Albania | 2897366 |
36 | Algeria | 38186135 |
37 | American Samoa | 55302 |
38 | Andorra | 75902 |
| | country | GDP (£) | SP.POP.TOTL |
---|---|---|---|
0 | Afghanistan | 1.307474e+10 | 30682500 |
1 | Albania | 8.168003e+09 | 2897366 |
2 | Algeria | 1.340157e+11 | 38186135 |
3 | Andorra | 2.076410e+09 | 75902 |
4 | Angola | 8.842001e+10 | 23448202 |
| | country | GDP per capita (£) |
---|---|---|
0 | Afghanistan | 426.13 |
1 | Albania | 2819.11 |
2 | Algeria | 3509.54 |
3 | Andorra | 27356.47 |
4 | Angola | 3770.87 |
| | country | GDP per capita (£) | Life expectancy (years) |
---|---|---|---|
99 | Luxembourg | 72679.55 | 82 |
125 | Norway | 65717.26 | 81 |
136 | Qatar | 61400.15 | 78 |
| | country | GDP per capita (£) | Life expectancy (years) |
---|---|---|---|
32 | Central African Republic | 208.62 | 50 |
27 | Burundi | 165.75 | 56 |
103 | Malawi | 153.29 | 61 |
| | country | GDP per capita (£) | Life expectancy (years) |
---|---|---|---|
72 | Hong Kong SAR, China | 24517.50 | 84 |
74 | Iceland | 30351.62 | 83 |
162 | Switzerland | 54109.81 | 83 |
| | country | GDP per capita (£) | Life expectancy (years) |
---|---|---|---|
146 | Sierra Leone | 500.40 | 50 |
95 | Lesotho | 680.50 | 49 |
160 | Swaziland | 2331.38 | 49 |
| | country | GDP per capita (£) | Life expectancy (years) |
---|---|---|---|
176 | United Kingdom | 27038.54 | 81 |
Conclusions
From our analysis, there appears to be no strong correlation between a country's total GDP and the life expectancy of its inhabitants: there is often a wide variation of life expectancy for countries with similar GDP, the countries with the lowest life expectancy are not the poorest countries, and the countries with the highest life expectancy are not the richest countries. Nevertheless, there is some relationship, because the vast majority of countries with a life expectancy below 70 years are in the left half of the scatterplot.
From the chart above, however, we can see that, generally, as GDP per capita increases so does life expectancy. Having said that, there is one country with a relatively high GDP per capita but a life expectancy below 60 years. This country is Equatorial Guinea, as shown below. Together with the moderate correlation (0.501, p<0.001), this indicates that although there is a general trend between the two variables, other factors are likely to be involved in the relationship, and these should be identified and investigated further.
| | country | GDP per capita (£) | Life expectancy (years) |
---|---|---|---|
52 | Equatorial Guinea | 13738.71 | 57 |