Thursday, April 5, 2012

On Data Mining And Finding The Forest Among All The Trees

This isn't directly law enforcement related but is worth noting: The Atlantic had a great article that looked at some of the basic concepts behind data mining. 
For the most part, data mining tells us, about very large and complex data sets, the kinds of information that would be readily apparent about small and simple things. For example, it can tell us that "one of these things is not like the other" à la Sesame Street, or it can show us categories and then sort things into those pre-determined categories. But what's simple with 5 datapoints is not so simple with 5 billion datapoints.
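
To make the "one of these things is not like the other" idea concrete, here is a minimal Python sketch of one common way to flag an unusual value: the modified z-score, which measures how far each value sits from the median relative to the typical spread. This is not a method the article describes; the function name, the 3.5 cutoff, and the daily incident counts are all assumptions made up for illustration.

```python
from statistics import median

def find_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    Hypothetical helper for illustration; 3.5 is a commonly used
    rule-of-thumb cutoff, not a value taken from the article.
    """
    center = median(values)
    # Median absolute deviation (MAD): the typical distance from the median.
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # At least half the values equal the median, so this simple
        # method has no spread to measure against; a real tool would
        # need a fallback here.
        return []
    return [v for v in values if 0.6745 * abs(v - center) / mad > threshold]

# Made-up daily incident counts for one patrol area (illustrative only).
daily_counts = [10, 11, 9, 10, 12, 10, 50]
print(find_outliers(daily_counts))  # prints [50]
```

With seven numbers, anyone can spot the 50 by eye. The point of automating the check is that the same few lines work unchanged on millions of rows, where eyeballing is no longer an option.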

And these days, there's always more data. We gather far more of it than we can digest. Nearly every transaction or interaction leaves a data signature that someone somewhere is capturing and storing. This is, of course, true on the internet, but ubiquitous computing and digitization have made it increasingly true about our lives away from our computers (do we still have those?). The sheer scale of this data has far exceeded human sense-making capabilities. At these scales, patterns are often too subtle and relationships too complex or multi-dimensional to observe by simply looking at the data. Data mining is a means of automating part of this process to detect interpretable patterns; it helps us see the forest without getting lost in the trees.
As we see more and more data mining in law enforcement, especially with the adoption of predictive policing, it's important that crime analysts become very familiar with what data mining is and what it isn't. The article is worth the read, especially for someone like me who's not a statistician by training.

What is your agency doing to harness the information contained in the data your agency collects?

