{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "# Problem description\n", "\n", "## Context\n", "Police departments are striving to integrate more automated and predictive data systems into their everyday processes to reduce crime and deploy scarce resources more efficiently. This creates an opportunity for more proactive policing, provided resources can be alerted to abnormal patterns in the data as they occur. The Boston Police Department has released a public dataset of incident reports made to its 911 call center.\n", "\n", "## Challenge\n", "Assess the potential of the provided data set for predicting where police patrols should be dispatched in order to serve, protect, and optimize (people, money, resources, time).\n", "\n", "### Description of columns\n", "_(As provided online)_\n", "\n", "1. `incident_num` (varchar; required) - Internal BPD report number\n", "2. `offense_code` (varchar) - Numerical code of offense description\n", "3. `Offense_Code_Group_Description` (varchar) - Internal categorization of [offense_description]\n", "4. `Offense_Description` (varchar) - Primary descriptor of incident\n", "5. `district` (varchar) - What district the crime was reported in\n", "6. `reporting_area` (varchar) - RA number associated with the where the crime was reported from.\n", "7. `shooting` (char) - Indicated a shooting took place.\n", "8. `occurred_on` (datetime) - Earliest date and time the incident could have taken place\n", "9. `UCR_Part` (varchar) - Universal Crime Reporting Part number (1, 2, 3)\n", "10. `street` (varchar) - Street name the incident took place" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "We load the data in the cells below. Uncomment and run the one corresponding to the language of your choice!" 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "import pandas as pd\n", "reports_df = pd.read_csv('boston_crime_incident_reports_2015aug-2018apr.csv', encoding='latin-1')\n", "weather_df = pd.read_csv('boston_weather_data_cleaned_2018oct05.csv')" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "# reports_df <- read.csv('boston_crime_incident_reports_2015aug-2018apr.csv', header=TRUE)\n", "# weather_df <- read.csv('boston_weather_data_cleaned_2018oct05.csv', header=TRUE)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "
| | INCIDENT_NUMBER | OFFENSE_CODE | OFFENSE_CODE_GROUP | OFFENSE_DESCRIPTION | DISTRICT | REPORTING_AREA | SHOOTING | OCCURRED_ON_DATE | YEAR | MONTH | DAY_OF_WEEK | HOUR | UCR_PART | STREET | Lat | Long | Location |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | I182024895 | 2629 | Harassment | HARASSMENT | B3 | 442 | NaN | 2018-04-03 20:00:00 | 2018 | 4 | Tuesday | 20 | Part Two | WESTCOTT ST | 42.293218 | -71.078865 | (42.29321805, -71.07886455) |
| 1 | I182024895 | 619 | Larceny | LARCENY ALL OTHERS | B3 | 442 | NaN | 2018-04-03 20:00:00 | 2018 | 4 | Tuesday | 20 | Part One | WESTCOTT ST | 42.293218 | -71.078865 | (42.29321805, -71.07886455) |
| 2 | I182024887 | 1402 | Vandalism | VANDALISM | B3 | 469 | NaN | 2018-03-28 20:30:00 | 2018 | 3 | Wednesday | 20 | Part Two | ALMONT ST | 42.275277 | -71.095542 | (42.27527670, -71.09554245) |
\n", "