*I originally wrote this post for a software company’s blog in 2014. This company was bought out and recently their blog and website have been removed from the web permanently. I am reposting it here for posterity.*

Criminals also have their routines. Because they are sometimes dealing with character flaws that make them unsuited for normal jobs, such as unreliability, substance abuse and a lack of stable environments, their routines may not be quite as set as ours. However, even they have routines.

These routines are often cyclical. Additionally, once a criminal discovers a modus operandi (M.O.) that is successful, they will often stick with that M.O. until something interrupts the cycle.

In this post we're going to look at a simple method for predicting when a criminal is likely to strike again in a crime series. First, we should answer the question: what is a "crime series"?

A crime series is a group of crimes that are committed by the same offender or group of offenders working together. The armed robber who hits a number of convenience stores over a period of weeks is committing a crime series. A crime series is not all similar crimes (for instance, all convenience store robberies), but only those that can be attributed to the same offender.

This is one reason that it is extremely important that your officers collect very detailed information about the exact M.O. used in a crime. Criminals don't usually identify themselves each time they commit a crime in a series so we often have to use M.O., physical description, etc. to figure out which ones are likely committed by the same offender.

Often, criminals who commit a series of crimes fall under the 80/20 rule. This anecdotal "rule" says that 80% of your crimes may be committed by only 20% of your offenders. A criminal committing a series of crimes can often have a measurable impact on your crime rate. Making it a priority to put him out of commission can help make your crime numbers look much better at the end of the year.

Giving your officers some guidance about when a serial offender is likely to strike will improve the chances that they will be successful in stopping him. One way to do this is to make a prognostication about when the offender is more likely to strike next.

There are a number of ways to do this. Since this series of posts is geared towards analysts who may not have much formal crime analysis training, we're going to look at one of the easiest: calculating Mean Days Between Hits.

First, let's arrange all the crimes in order of the date of occurrence. These examples are from a real armed robbery crime series I worked on a few years ago. Then we're going to calculate the number of days between each occurrence. In the example I've arranged these "hits" into a table that looks like this:

Once we've done this, we're going to calculate the Mean (Average) of the Days Between Hits for the events in the series. I've done this with a spreadsheet using the Excel AVERAGE function.
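For analysts who would rather script this than use a spreadsheet, here is a minimal sketch of the same two steps in Python. The dates are hypothetical and chosen only for illustration; they are not the dates from the real series, and the helper name `days_between_hits` is my own.

```python
from datetime import date
from statistics import mean

# Hypothetical hit dates for illustration only (not the real series).
hit_dates = [
    date(2009, 12, 1),
    date(2009, 12, 3),
    date(2009, 12, 7),
    date(2009, 12, 8),
    date(2009, 12, 11),
]

def days_between_hits(dates):
    """Return the gap in days between each consecutive pair of dates."""
    ordered = sorted(dates)  # arrange the hits by date of occurrence
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]

gaps = days_between_hits(hit_dates)
mean_gap = mean(gaps)  # same idea as Excel's AVERAGE

print(gaps)      # [2, 4, 1, 3]
print(mean_gap)  # 2.5
```

Note that five hits produce only four gaps, so the mean is taken over four numbers.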

Next, we're going to calculate the Standard Deviation of the Days Between Hits. In a nutshell, Standard Deviation is a measure of how much the data points vary from the average. For data that follows a normal distribution, about 68% of the data points fall within One Standard Deviation of the mean. That is good enough for our purposes. In our example, the Excel function used for Standard Deviation is STDEV.
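As a rough sketch of the same calculation outside Excel, Python's `statistics.stdev` computes the sample standard deviation, which is what Excel's STDEV does. The gaps below are hypothetical, not the real series.

```python
from statistics import mean, stdev

# Hypothetical days between hits, for illustration only.
gaps = [2, 4, 1, 3]

mean_gap = mean(gaps)
sd_gap = stdev(gaps)  # sample standard deviation (n - 1 denominator)

# Count how many gaps actually fall within one standard deviation of the mean.
within_one_sd = [g for g in gaps if abs(g - mean_gap) <= sd_gap]

print(round(sd_gap, 2))                  # 1.29
print(len(within_one_sd), "of", len(gaps))  # 2 of 4
```

With so few data points, the share that lands inside one standard deviation can be well off from 68%, so treat that figure as a rule of thumb rather than a guarantee.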

Below our "hits", I've added cells for the Mean, Standard Deviation and then an Earliest and Latest cell.

To calculate the Earliest prediction, subtract the Standard Deviation from the Mean. The Latest prediction is calculated by adding the Standard Deviation to the Mean. Counting those numbers of days forward from the date of the last hit gives us calendar dates. What these Earliest and Latest calculations mean is that there is roughly a 68% chance that the next event in the series will occur between 12/14/09 and 12/17/09.
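The Earliest and Latest calculation can be sketched in Python as follows. This is one possible implementation, not the spreadsheet from the post: the dates are hypothetical, the function name `predict_window` is my own, and rounding the offsets to whole days is an assumption on my part.

```python
from datetime import date, timedelta
from statistics import mean, stdev

def predict_window(hit_dates):
    """Return (earliest, latest) dates for the next hit, one SD either side of the mean gap.

    Needs at least three dates, since stdev needs at least two gaps.
    """
    ordered = sorted(hit_dates)
    gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    mean_gap, sd_gap = mean(gaps), stdev(gaps)
    last_hit = ordered[-1]
    # Assumption: round the offsets to whole days before adding them to the last hit.
    earliest = last_hit + timedelta(days=round(mean_gap - sd_gap))
    latest = last_hit + timedelta(days=round(mean_gap + sd_gap))
    return earliest, latest

# Hypothetical hit dates for illustration only (not the real series).
hit_dates = [
    date(2009, 12, 1),
    date(2009, 12, 3),
    date(2009, 12, 7),
    date(2009, 12, 8),
    date(2009, 12, 11),
]

earliest, latest = predict_window(hit_dates)
print(earliest, latest)  # 2009-12-12 2009-12-15
```

If you would rather err on the side of a wider window, you could round the Earliest offset down and the Latest offset up instead.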

Let's look at how our prediction fared when we add the next event in the series.

Our next event was three days after the last event. Measured against our Latest calculation, this was within the range we predicted. Now, we're going to recalculate our Mean and Standard Deviation using this next event.

This Earliest and Latest calculation isn't significantly different from the first. Using these numbers we can expect our crook to hit from 12/17/09 to 12/20/09. So let's see how we fare when we add the last event in the series.

12/23/09? Six days between hits? What happened here?

What happened is that something interrupted the cycle and caused the next event to take place outside one standard deviation (68%) of the mean. It could be that the crook was sick, jailed, or his car broke down. Even crooks have events outside their control that interrupt their routines.
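The predict, check and recalculate cycle described above can be sketched as a small loop. Everything here is illustrative: the dates are hypothetical (picked so that one prediction holds and the next is missed, much like the real series), the function names are my own, and the default of five events before the first prediction is an assumption.

```python
from datetime import date, timedelta
from statistics import mean, stdev

def predict_window(hit_dates):
    """Return (earliest, latest) dates for the next hit, one SD either side of the mean gap."""
    ordered = sorted(hit_dates)
    gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    mean_gap, sd_gap = mean(gaps), stdev(gaps)
    last_hit = ordered[-1]
    # Assumption: offsets are rounded to whole days.
    return (last_hit + timedelta(days=round(mean_gap - sd_gap)),
            last_hit + timedelta(days=round(mean_gap + sd_gap)))

def backtest(hit_dates, min_events=5):
    """Predict each later hit from the hits before it and record whether it fell in the window."""
    ordered = sorted(hit_dates)
    results = []
    for i in range(min_events, len(ordered)):
        earliest, latest = predict_window(ordered[:i])  # recalculated each time
        actual = ordered[i]
        results.append((actual, earliest, latest, earliest <= actual <= latest))
    return results

# Hypothetical series for illustration only (not the real series).
series = [
    date(2009, 12, 1), date(2009, 12, 3), date(2009, 12, 7), date(2009, 12, 8),
    date(2009, 12, 11), date(2009, 12, 14), date(2009, 12, 20),
]

for actual, earliest, latest, in_window in backtest(series):
    print(actual, "window:", earliest, "to", latest, "hit in window:", in_window)
# 2009-12-14 window: 2009-12-12 to 2009-12-15 hit in window: True
# 2009-12-20 window: 2009-12-15 to 2009-12-18 hit in window: False
```

A miss like the second one does not necessarily mean the method failed; it may simply flag that something interrupted the offender's routine.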

This demonstrates a weakness in making predictions in crime series. However, it also demonstrates an important point: when making these prognostications, remember to allow for something out of the ordinary. Language such as "If the series continues as it has we can expect him to hit on this date..." can help to explain these anomalies.

A couple of caveats: There are other more sophisticated ways to calculate these predictions that may be more accurate. However, Mean Days Between Hits is a good way to get in the ballpark and doesn't take an understanding of advanced statistics to calculate.

Also, the more events you have in a series the more accurate your calculations are likely to be. You should probably have 4 or 5 events in a series before you try to predict when the next event will be.

What types of crime series do you think lend themselves to this type of analysis?

I liked presenting information in this way as well. I did it for time in-between shootings (so not the same person, but often related/retaliatory events).

See https://andrewpwheeler.wordpress.com/2016/02/26/using-and-making-cumulative-probability-charts/ for a chart to show the entire distribution. (I give an example excel spreadsheet showing how to make it.)

Thanks Andrew!
