Using agate in a Jupyter notebook
First we import agate. Then we create an agate Table by loading data from a CSV file.
From: https://source.opennews.org/en-US/articles/introducing-agate/
Question 1: What was the total cost to Kansas City area counties?
To answer this question, we first must filter the table to only those rows which refer to a Kansas City area county.
We can then print the Sum of the costs of all those rows. (The cost column is named total_cost.)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-3-2f0757b12ce8> in <module>()
----> 1 print('$%d' % kansas_city.columns['total_cost'].aggregate(agate.Sum()))
AttributeError: 'Column' object has no attribute 'aggregate'
Question 2: Which counties spent the most?
This question is more complicated. First we group the data by county, which gives us a TableSet named counties. A TableSet is a group of tables with the same columns.
We then use the aggregate function to sum the total_cost column for each table in the group. The resulting values are collapsed into a new table, totals, which has a row for each county and a column named total_cost_sum containing the new total.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-5-2200fdc8f271> in <module>()
1 # Aggregate totals for all counties
2 totals = counties.aggregate([
----> 3 ('total_cost', agate.Sum(), 'total_cost_sum')
4 ])
5
TypeError: __init__() takes exactly 2 arguments (1 given)
Finally, we sort the counties by their total cost, limit the results to the top 20 and then print the results as a text bar chart.
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-6-ba800fb09e95> in <module>()
----> 1 totals.order_by('total_cost_sum', reverse=True).limit(20).print_bars('county', 'total_cost_sum', width=100)
NameError: name 'totals' is not defined