A Python notebook on Data Breaches
Project: South African Cyber Security
Path: databreach.ipynb
Views: 186
License: APACHE
Image: default
Kernel: Python 3 (system-wide)
In [1]:
In [4]:
In [5]:
| | Entity | Story | Year | Records | Sector | Method |
---|---|---|---|---|---|---|
0 | River City Media | A dodgy backup has allegedly resulted in over ... | 2017 | 1370000000 | Web | Accidentally published |
1 | Unique Identification Authority of India | A report says that full data base has been exp... | 2017 | 1000000000 | Government | Poor security |
2 | Spambot | A misconfigured spambot has leaked over 700m r... | 2017 | 711000000 | Web | Poor security |
3 | Friend Finder Network | Usernames, email addresses, passwords for site... | 2016 | 412000000 | Web | Hacked |
4 | Equifax | If you have a credit report, there’s a good ch... | 2017 | 143000000 | Financial | Hacked |
... | ... | ... | ... | ... | ... | ... |
265 | Cardsystems Solutions Inc. | CardSystems was fingered by MasterCard after i... | 2005 | 40000000 | Financial | Hacked |
266 | Citigroup | Blame the messenger! A box of computer tapes c... | 2005 | 3900000 | Financial | Lost / stolen device or media |
267 | Ameritrade Inc. | Computer backup tape containing personal infor... | 2005 | 200000 | Financial | Lost / stolen device or media |
268 | Automatic Data Processing | NaN | 2005 | 125000 | Financial | Poor security |
269 | AOL | A former America Online software engineer stol... | 2004 | 92000000 | Web | Inside job |
270 rows × 6 columns
In [6]:
(270, 6)
In [7]:
Entity object
Story object
Year int64
Records int64
Sector object
Method object
dtype: object
In [8]:
| | Entity | Story | Year | Records | Sector | Method |
---|---|---|---|---|---|---|
0 | River City Media | A dodgy backup has allegedly resulted in over ... | 2017 | 1370000000 | Web | Accidentally published |
1 | Unique Identification Authority of India | A report says that full data base has been exp... | 2017 | 1000000000 | Government | Poor security |
2 | Spambot | A misconfigured spambot has leaked over 700m r... | 2017 | 711000000 | Web | Poor security |
3 | Friend Finder Network | Usernames, email addresses, passwords for site... | 2016 | 412000000 | Web | Hacked |
4 | Equifax | If you have a credit report, there’s a good ch... | 2017 | 143000000 | Financial | Hacked |
In [9]:
RangeIndex(start=0, stop=270, step=1)
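The code cells behind the outputs above were not preserved. As a minimal sketch, assuming the data lives in a pandas DataFrame (the name `df` is hypothetical), inspection calls of the following kind would yield a shape tuple, a per-column dtype listing, the first five rows, and a `RangeIndex`. The sample reuses the five rows displayed above and leaves out the Story column, whose text is truncated in the rendered output.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's data: the five rows shown above,
# without the Story column (its text is truncated in the rendered output).
df = pd.DataFrame(
    {
        "Entity": [
            "River City Media",
            "Unique Identification Authority of India",
            "Spambot",
            "Friend Finder Network",
            "Equifax",
        ],
        "Year": [2017, 2017, 2017, 2016, 2017],
        "Records": [1370000000, 1000000000, 711000000, 412000000, 143000000],
        "Sector": ["Web", "Government", "Web", "Web", "Financial"],
        "Method": [
            "Accidentally published",
            "Poor security",
            "Poor security",
            "Hacked",
            "Hacked",
        ],
    }
)

shape = df.shape        # (number of rows, number of columns)
dtypes = df.dtypes      # one dtype per column
first_rows = df.head()  # first five rows by default
index = df.index        # a RangeIndex when no explicit index is set

print(shape)
print(dtypes)
print(first_rows)
print(index)
```

On the full dataset the same calls would report 270 rows and six columns, as in the outputs above.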
In [10]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f3a43ccaba8>
In [11]:
In [12]:
| | Entity | Story | Year | Records | Sector | Method |
---|---|---|---|---|---|---|
0 | River City Media | A dodgy backup has allegedly resulted in over ... | 2017 | 1370000000 | Web | Accidentally published |
1 | Unique Identification Authority of India | A report says that full data base has been exp... | 2017 | 1000000000 | Government | Poor security |
103 | Yahoo | Happened in 2013 but only disclosed late 2016.... | 2013 | 1000000000 | Web | Hacked |
2 | Spambot | A misconfigured spambot has leaked over 700m r... | 2017 | 711000000 | Web | Poor security |
78 | Yahoo | Happened in 2014, but no. records stolen was o... | 2014 | 500000000 | Web | Hacked |
... | ... | ... | ... | ... | ... | ... |
53 | uTorrent | It's unclear what data has been breached, exac... | 2016 | 35000 | Web | Hacked |
194 | Morgan Stanley Smith Barney | Morgan Stanley mailed a CD containing sensitiv... | 2011 | 34000 | Financial | Lost / stolen device or media |
36 | Quest Diagnostics | Nov. The stolen data contained names, DOBs, la... | 2016 | 34000 | Healthcare | Hacked |
159 | Dropbox | Websites stolen from other websites used to si... | 2012 | 30000 | Web | Hacked |
54 | Wendy's | Malware has been used in 1025 of Wendy's resta... | 2016 | 1025 | Retail | Hacked |
270 rows × 6 columns
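The table above lists the breaches from largest to smallest while keeping the original row labels, which is what a descending sort on Records would produce. A minimal sketch, assuming a DataFrame named `df` (hypothetical) and using a handful of values copied from the outputs above:

```python
import pandas as pd

# Hypothetical sample: a few (Entity, Year, Records) values taken from the
# outputs above, with their original row labels as the index.
df = pd.DataFrame(
    {
        "Entity": ["River City Media", "Spambot", "Wendy's", "Yahoo", "Dropbox"],
        "Year": [2017, 2017, 2016, 2013, 2012],
        "Records": [1370000000, 711000000, 1025, 1000000000, 30000],
    },
    index=[0, 2, 54, 103, 159],
)

# Sort by breach size, largest first; row labels travel with their rows.
by_size = df.sort_values("Records", ascending=False)
print(by_size)
```

`sort_values` returns a new DataFrame and leaves `df` unchanged unless `inplace=True` is passed.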
In [13]:
| | Year | Records |
---|---|---|
count | 270.000000 | 2.700000e+02 |
mean | 2012.222222 | 3.275059e+07 |
std | 3.134332 | 1.341436e+08 |
min | 2004.000000 | 1.025000e+03 |
25% | 2010.000000 | 3.223948e+05 |
50% | 2012.000000 | 2.000000e+06 |
75% | 2015.000000 | 1.075000e+07 |
max | 2017.000000 | 1.370000e+09 |
In [14]:
In [15]:
pandas.core.groupby.generic.DataFrameGroupBy
In [16]:
Summary statistics of Records by Year:

Year | count | mean | std | min | 25% | 50% | 75% | max |
---|---|---|---|---|---|---|---|---|
2004 | 1.0 | 9.200000e+07 | NaN | 92000000.0 | 92000000.00 | 92000000.0 | 92000000.00 | 9.200000e+07 |
2005 | 4.0 | 1.105625e+07 | 1.937613e+07 | 125000.0 | 181250.00 | 2050000.0 | 12925000.00 | 4.000000e+07 |
2006 | 6.0 | 1.171667e+07 | 1.086617e+07 | 200000.0 | 2950000.00 | 10500000.0 | 19250000.00 | 2.650000e+07 |
2007 | 13.0 | 1.202203e+07 | 2.550905e+07 | 89000.0 | 1000000.00 | 3000000.0 | 8500000.00 | 9.400000e+07 |
2008 | 17.0 | 4.033324e+06 | 5.164474e+06 | 50500.0 | 113000.00 | 2100000.0 | 5000000.00 | 1.800000e+07 |
2009 | 14.0 | 1.834375e+07 | 3.831834e+07 | 72000.0 | 391284.25 | 1121604.5 | 7443033.50 | 1.300000e+08 |
2010 | 20.0 | 8.068238e+05 | 9.542018e+05 | 43000.0 | 174083.25 | 395000.0 | 975000.00 | 3.300000e+06 |
2011 | 35.0 | 6.556471e+06 | 1.484513e+07 | 34000.0 | 200000.00 | 1000000.0 | 4572433.00 | 7.700000e+07 |
2012 | 26.0 | 2.753285e+07 | 5.208669e+07 | 30000.0 | 785000.00 | 7500000.0 | 16625000.00 | 2.000000e+08 |
2013 | 31.0 | 4.233181e+07 | 1.787852e+08 | 100000.0 | 165000.00 | 1460000.0 | 5350000.00 | 1.000000e+09 |
2014 | 27.0 | 3.628846e+07 | 9.850089e+07 | 52000.0 | 1050000.00 | 4000000.0 | 15500000.00 | 5.000000e+08 |
2015 | 25.0 | 1.848113e+07 | 4.295943e+07 | 40000.0 | 400000.00 | 2700000.0 | 13000000.00 | 1.980000e+08 |
2016 | 33.0 | 3.463194e+07 | 7.684370e+07 | 1025.0 | 790724.00 | 6600000.0 | 40000000.00 | 4.120000e+08 |
2017 | 18.0 | 1.831136e+08 | 4.058881e+08 | 40000.0 | 1150000.00 | 3000000.0 | 27720469.25 | 1.370000e+09 |
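The per-year summary above has the shape of a grouped `describe()`. The original cell is not available, so the following is only a sketch under the assumption that the data was grouped on Year; the names `df`, `grouped`, and `summary` are hypothetical. The sample uses the 2004 and 2005 rows shown earlier, which happen to be all the rows for those years, so the statistics can be compared with the table.

```python
import pandas as pd

# The 2004 and 2005 rows as displayed earlier in the notebook output.
df = pd.DataFrame(
    {
        "Entity": [
            "Cardsystems Solutions Inc.",
            "Citigroup",
            "Ameritrade Inc.",
            "Automatic Data Processing",
            "AOL",
        ],
        "Year": [2005, 2005, 2005, 2005, 2004],
        "Records": [40000000, 3900000, 200000, 125000, 92000000],
    }
)

# Grouping returns a lazy DataFrameGroupBy object; nothing is computed yet.
grouped = df.groupby("Year")
print(type(grouped))

# describe() computes count, mean, std, min, quartiles, and max per group.
# Selecting [["Records"]] keeps "Records" as the top level of the columns.
summary = grouped[["Records"]].describe()
print(summary)
```

A year with a single breach, such as 2004, has no sample standard deviation, which is why its `std` entry is NaN.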
In [17]:
(2014,
Entity \
6 Malaysian telcos & MVNOs
47 Privatization Agency of the Republic of Serbia
78 Yahoo
79 Ebay
80 JP Morgan Chase
81 Target
82 Home Depot
83 Korea Credit Bureau
84 Premera
85 Sony Pictures
86 Twitch.tv
87 Gmail*
88 Community Health Systems
89 European Central Bank
90 UPS
91 HSBC Turkey
92 AOL
93 Imgur
94 Staples
95 Neiman Marcus
96 D&B, Altegrity
97 MacRumours.com
98 Japan Airlines
99 Dominios Pizzas (France)
100 NASDAQ
101 Mozilla
102 New York Taxis
Story Year Records \
6 Oct. Data from numerous Malaysian telco & MVNO... 2014 46200000
47 A text file with personal data and financial d... 2014 5190396
78 Happened in 2014, but no. records stolen was o... 2014 500000000
79 The company has said hackers attacked between ... 2014 145000000
80 July 2014: The US's largest bank was compromis... 2014 76000000
81 Investigators believe the data was obtained vi... 2014 70000000
82 Malware installed on cash register system acro... 2014 56000000
83 NaN 2014 20000000
84 Detected 29th Jan 2015. Occured May 2014. Coul... 2014 11000000
85 Wide-ranging hack of potentially every piece o... 2014 10000000
86 March 23rd. Details unknown at this point. All... 2014 10000000
87 5 million Gmail account passwords leaked to a ... 2014 5000000
88 Aug 2014: Community Health Systems, which oper... 2014 4500000
89 NaN 2014 4000000
90 Malware was discovered in the credit & debit c... 2014 4000000
91 In a message to customers on its website, the ... 2014 2700000
92 NaN 2014 2400000
93 Imgur are still investigating how the breach t... 2014 1700000
94 NaN 2014 1160000
95 NaN 2014 1100000
96 Hackers stole millions of social security numb... 2014 1000000
97 NaN 2014 860000
98 Oct 2014: Japan Airlines confirmed the possibl... 2014 750000
99 NaN 2014 600000
100 Nasdaq forum website hacked by hacking ring, e... 2014 500000
101 NaN 2014 76000
102 A freedom of information request resulted in t... 2014 52000
Sector Method
6 Telecoms Hacked
47 Government Accidentally published
78 Web Hacked
79 Web Hacked
80 Financial Hacked
81 Retail Hacked
82 Retail Hacked
83 Financial Inside job
84 Healthcare Hacked
85 Media Hacked
86 Gaming Hacked
87 Web Hacked
88 Healthcare Hacked
89 Financial Hacked
90 Retail Hacked
91 Financial Hacked
92 Web Hacked
93 App Hacked
94 Retail Hacked
95 Retail Hacked
96 Tech Hacked
97 Web Hacked
98 Transport Hacked
99 Web Hacked
100 Financial Hacked
101 Web Poor security
102 Transport Poor security )
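The output above is a `(year, DataFrame)` tuple, which is the form a pandas GroupBy object yields when iterated. How the original cell selected the 2014 group is not known; this sketch shows two common ways, with hypothetical names (`df`, `grouped`, `pairs`, `breaches_2014`) and a small sample of values copied from the outputs above.

```python
import pandas as pd

# Hypothetical sample with original row labels as the index.
df = pd.DataFrame(
    {
        "Entity": ["Yahoo", "Ebay", "Mozilla", "Equifax", "AOL"],
        "Year": [2014, 2014, 2014, 2017, 2004],
        "Records": [500000000, 145000000, 76000, 143000000, 92000000],
    },
    index=[78, 79, 101, 4, 269],
)

grouped = df.groupby("Year")

# Iterating a GroupBy yields (key, sub-DataFrame) pairs in sorted key order.
pairs = list(grouped)
years = [year for year, _ in pairs]

# get_group pulls out the rows of a single group directly.
breaches_2014 = grouped.get_group(2014)

print(years)
print(breaches_2014)
```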
In [18]:
In [19]:
Year | Records |
---|---|
2004 | 9.200000e+07 |
2005 | 1.105625e+07 |
2006 | 1.171667e+07 |
2007 | 1.202203e+07 |
2008 | 4.033324e+06 |
2009 | 1.834375e+07 |
2010 | 8.068238e+05 |
2011 | 6.556471e+06 |
2012 | 2.753285e+07 |
2013 | 4.233181e+07 |
2014 | 3.628846e+07 |
2015 | 1.848113e+07 |
2016 | 3.463194e+07 |
2017 | 1.831136e+08 |
In [20]:
RangeIndex(start=0, stop=270, step=1)
In [21]:
Records
Year
2004 9.200000e+07
2005 1.105625e+07
2006 1.171667e+07
2007 1.202203e+07
2008 4.033324e+06
2009 1.834375e+07
2010 8.068238e+05
2011 6.556471e+06
2012 2.753285e+07
2013 4.233181e+07
2014 3.628846e+07
2015 1.848113e+07
2016 3.463194e+07
2017 1.831136e+08
In [22]:
Text(0, 0.5, 'Median Records Breached')
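The axis label above names the median, whereas the yearly values printed before it equal the `mean` column of the grouped summary (for example 1.105625e+07 for 2005). The plotting code was not preserved, so which statistic was actually drawn is unclear. A minimal sketch that computes both side by side, assuming a DataFrame named `df` (hypothetical) and using the 2004 and 2005 rows shown earlier:

```python
import pandas as pd

# The 2004 and 2005 rows as displayed earlier in the notebook output.
df = pd.DataFrame(
    {
        "Year": [2005, 2005, 2005, 2005, 2004],
        "Records": [40000000, 3900000, 200000, 125000, 92000000],
    }
)

# Compute the mean and the median of Records for each year in one pass.
yearly = df.groupby("Year")["Records"].agg(["mean", "median"])
print(yearly)
```

For 2005 the mean (11,056,250) is more than five times the median (2,050,000) because one large breach dominates, so the choice of statistic changes the picture considerably. If the median is the intended quantity, plotting `yearly["median"]` and calling `set_ylabel("Median Records Breached")` on the returned axes would match the label.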
In [23]:
<seaborn.axisgrid.FacetGrid at 0x7f3a41632400>
In [24]:
<seaborn.axisgrid.FacetGrid at 0x7f3a415edba8>
In [25]:
<seaborn.axisgrid.FacetGrid at 0x7f3a415356d8>
In [26]:
<seaborn.axisgrid.FacetGrid at 0x7f3a41480e80>