First we import agate. Then we create an agate Table by loading data from a CSV file.
from:
https://source.opennews.org/en-US/articles/introducing-agate/
import agate
#table = agate.Table.from_csv('examples/realdata/ks_1033_data.csv')
table = agate.Table.from_csv('ks_1033_data.csv')
print(table)
To answer this question, we first must filter the table to only those rows which refer to a Kansas City area county
.
kansas_city = table.where(lambda r: r['county'] in ('JACKSON', 'CLAY', 'CASS', 'PLATTE'))
print(len(table.rows))
print(len(kansas_city.rows))
We can then print the Sum of the costs of all those rows. (The cost column is named total_cost
.)
print('$%d' % kansas_city.columns['total_cost'].aggregate(agate.Sum()))
This question is more complicated. First we group the data by county
, which gives us a TableSet named counties
. A TableSet is a group of tables with the same columns.
# Group by county
counties = table.group_by('county')
print(counties.keys())
We then use the aggregate
function to sum the total_cost
column for each table in the group. The resulting values are collapsed into a new table, totals
, which has a row for each county and a column named total_cost_sum
containing the new total.
# Aggregate totals for all counties
totals = counties.aggregate([
('total_cost', agate.Sum(), 'total_cost_sum')
])
print(totals.column_names)
Finally, we sort the counties by their total cost, limit the results to the top 10 and then print the results as a text bar chart.
totals.order_by('total_cost_sum', reverse=True).limit(20).print_bars('county', 'total_cost_sum', width=100)
import agate
purchases = agate.Table.from_csv('ks_1033_data.csv')
by_county = purchases.group_by('county')
totals = by_county.aggregate([
('county_cost', agate.Sum('total_cost'))
])
totals = totals.order_by('county_cost', reverse=True)
totals.limit(10).print_bars('county', 'county_cost', width=80)
# https://source.opennews.org/en-US/articles/introducing-agate/