Lights! Camera! Freeze!

(Photo by Flickr user West Midlands Police, used under a Creative Commons license)

A recent article in Time on the growth of data generated by police body cameras caught the attention of the Everydata team for a plain and simple reason: the cameras are generating a massive amount of data! Some questions to ask:

  • "18,000 state and local police departments"? That depends on how you define a police department. The last Bureau of Justice Statistics (BJS) Census of State and Local Law Enforcement Agencies counts "17,985 state and local law enforcement agencies with at least one full-time officer or the equivalent in part-time officers." This figure includes just over 3,000 sheriff's departments, which would probably be the first to point out that they are not police departments and their members are not police officers.

  • 10,000 hours of video per week for "big city" departments? That sounds like a large number, but it is easy to reach: a week has 168 hours, so 10,000 hours translates to roughly 60 body cams filming 24 hours a day, 7 days a week. New Orleans, the 50th largest city according to the Census, had roughly 1,200 officers in 2013 (slide 7 of this presentation). Sixty cameras would cover 5% of the force.
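
For anyone who wants to check the body cam arithmetic, here is a minimal Python sketch. The function names are our own, made up for illustration, and it assumes the simplification used above: every camera records around the clock, which real cameras do not.

```python
HOURS_PER_WEEK = 24 * 7  # 168 hours in a week


def cameras_needed(video_hours_per_week: float) -> float:
    """Cameras recording nonstop that would produce this much footage per week."""
    return video_hours_per_week / HOURS_PER_WEEK


def share_of_force(cameras: float, officers: int) -> float:
    """Fraction of officers covered if each camera is worn by one officer."""
    return cameras / officers


# 10,000 hours per week is the figure quoted for "big city" departments.
print(f"{cameras_needed(10_000):.1f} cameras running 24/7")   # 59.5 cameras running 24/7

# Roughly 1,200 officers is the 2013 New Orleans figure cited above.
print(f"{share_of_force(60, 1_200):.0%} of the force")        # 5% of the force
```

Cameras that record only part of each shift would need to be far more numerous to reach the same total, so 60 is best read as a lower bound.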

Here is a particularly mind-blowing line from the perspective of a data geek: "a video is uploaded to evidence.com every 1.6 seconds, equaling 2.1 petabytes (one petabyte equals a million gigabytes)." 2.1 petabytes (PB) is a lot of data! To give you some perspective, according to the data science program at Berkeley, 1 petabyte is the equivalent of 20 million four-drawer filing cabinets filled with text. In our upcoming book, we point out that the average person consumes 34 gigabytes (GB) of data per day. At that rate, 1 PB would last just over 80 years. Imagine filming every moment of your day for your entire life!
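
The "just over 80 years" figure can be reproduced with a few lines of Python. This is only a sketch: the names are hypothetical, and it assumes the decimal definition quoted above (one petabyte equals a million gigabytes) and a 365.25-day year.

```python
GB_PER_PB = 1_000_000   # "one petabyte equals a million gigabytes"
GB_PER_DAY = 34         # average daily data consumption per person
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years


def years_to_consume(petabytes: float, gb_per_day: float = GB_PER_DAY) -> float:
    """Years one person would need to consume this much data at a fixed daily rate."""
    days = petabytes * GB_PER_PB / gb_per_day
    return days / DAYS_PER_YEAR


print(f"1 PB: about {years_to_consume(1):.1f} years")      # 1 PB: about 80.5 years
print(f"2.1 PB: about {years_to_consume(2.1):.0f} years")  # 2.1 PB: about 169 years
```

By the same math, the full 2.1 PB would take one person roughly two lifetimes to get through.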

Since I work with data for a living, I also wonder what system they are using to classify and navigate all that information. The author lays out how difficult and costly it is to store the data, but what are the departments doing to actually use it? Using the data raises a whole other set of questions, best left for another post.