Comparing FCC & CFPB Complaint Data
by Mark Silverberg (@Skram)
Objective
Tinker with Sage Math Cloud-powered Jupyter Python notebooks and Socrata-powered federal open data from two organizations: the Federal Communications Commission (FCC) and the Consumer Financial Protection Bureau (CFPB).
The puzzle pieces
Complaint data from FCC ("CGB - Consumer Complaints Data") and CFPB ("Consumer Complaints") (easy API access to the data, powered by Socrata)
Jupyter Notebook w/ pandas (for the layout of this page, powered by Sage Math Cloud)
Plotly (for charting)
US Census data via Sunlight Foundation's census Python package (for state population estimates)
TextBlob (for sentiment analysis)
Please improve upon this
Depending on how you are viewing this, you can download the Python notebook and expand upon this work! Please feel free to raise feature requests and suggestions on GitHub at https://github.com/marks/open-complaints-data/
Setup
Dependencies and variables
Helper functions
Fetch the data (for each query template and provider)
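The notebook's own fetch code is not shown here, so the following is only a rough sketch of how one query template could be applied to each provider through Socrata's SODA API. The `$select`, `$group` and `$limit` parameters are standard SoQL, but the domains, dataset IDs, the `state` column name and the helper name `build_soda_url` are placeholders assumed for illustration.

```python
from urllib.parse import urlencode


def build_soda_url(domain, dataset_id, select, group=None, limit=1000):
    """Build a Socrata SODA query URL from SoQL parameters (hypothetical helper)."""
    params = {"$select": select, "$limit": limit}
    if group:
        params["$group"] = group
    return f"https://{domain}/resource/{dataset_id}.json?{urlencode(params)}"


# Placeholder domains and dataset IDs, NOT the real FCC or CFPB endpoints.
PROVIDERS = {
    "FCC": ("fcc.example.gov", "aaaa-0000"),
    "CFPB": ("cfpb.example.gov", "bbbb-1111"),
}

# One query template; the "state" column name is an assumption.
QUERY_TEMPLATES = {
    "complaints_by_state": {
        "select": "state, count(*) AS complaints",
        "group": "state",
    },
}

# One URL per (provider, query template) pair.
urls = {
    (provider, name): build_soda_url(domain, dataset_id, **template)
    for provider, (domain, dataset_id) in PROVIDERS.items()
    for name, template in QUERY_TEMPLATES.items()
}

for key, url in urls.items():
    print(key, url)
```

Each resulting URL returns JSON, so it could then be loaded into a pandas DataFrame for the summary steps that follow.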
Print some summary stats and raw count choropleths
Total # of Complaints per Data Source
Let's try normalizing complaint counts per 100k population
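As a small worked example of the normalization, the rate is the complaint count multiplied by 100,000 and divided by the state's population. The sketch below uses plain Python dictionaries rather than the notebook's pandas DataFrames, and the state codes and figures are made up for illustration, not real complaint or Census numbers.

```python
def rate_per_100k(complaints, population):
    """Return complaints per 100,000 residents."""
    return complaints * 100_000 / population


# Made-up illustrative values (hypothetical states "AA" and "BB").
complaints_by_state = {"AA": 500, "BB": 1200}
population_by_state = {"AA": 250_000, "BB": 2_400_000}

# Only normalize states that have a population estimate.
rates = {
    state: rate_per_100k(count, population_by_state[state])
    for state, count in complaints_by_state.items()
    if state in population_by_state
}

print(rates)
```

Here "AA" has fewer complaints than "BB" in absolute terms but a higher rate once population is taken into account, which is the reason for normalizing before comparing states.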
First, get state populations from the US Census
Plot normalized counts
Normalized # of Complaints per 100k population per Data Source
Check out the top states by normalized complaint count
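A ranking like this can be produced by sorting on the normalized rate and keeping the first few rows. The sketch below does that in plain Python with rows copied from the first table in this section, deliberately listed out of order. With pandas, `DataFrame.sort_values(..., ascending=False).head(n)` would presumably play the same role, though the notebook's exact code is not shown.

```python
# Rows copied from the first table in this section, in scrambled order.
rows = [
    {"state": "FL", "complaints": 39859, "complaint_rate_per_100k_pop": 208.782538},
    {"state": "DC", "complaints": 2314, "complaint_rate_per_100k_pop": 373.604835},
    {"state": "WA", "complaints": 14029, "complaint_rate_per_100k_pop": 205.716511},
    {"state": "CO", "complaints": 11872, "complaint_rate_per_100k_pop": 231.905392},
    {"state": "MD", "complaints": 17324, "complaint_rate_per_100k_pop": 296.933702},
]

# Sort by rate, highest first, and keep the top three.
top_states = sorted(
    rows,
    key=lambda row: row["complaint_rate_per_100k_pop"],
    reverse=True,
)[:3]

for row in top_states:
    print(row["state"], row["complaints"], row["complaint_rate_per_100k_pop"])
```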
| | complaints | state | complaint_rate_per_100k_pop |
|---|---|---|---|
| 8 | 2314 | DC | 373.604835 |
| 22 | 17324 | MD | 296.933702 |
| 6 | 11872 | CO | 231.905392 |
| 10 | 39859 | FL | 208.782538 |
| 52 | 14029 | WA | 205.716511 |
| | complaints | state | complaint_rate_per_100k_pop |
|---|---|---|---|
| 11 | 3332 | DC | 537.965129 |
| 12 | 3006 | DE | 330.894737 |
| 26 | 18256 | MD | 312.908200 |
| 13 | 55374 | FL | 290.050534 |
| 39 | 23139 | NJ | 261.978446 |