Drunk and disorderly data: applying natural language processing to identify alcohol-related crimes in police data – Police Practice and Research

‘Alcohol plays a significant role in crime, and whilst police data may contain a measure of alcohol-related crime, it is not entirely accurate. This study presents a new measure of alcohol-related crime in police data. A natural language processing algorithm was applied to crime descriptions in police data, to identify crimes as alcohol-related or not. This algorithm estimated that a higher proportion of crime is linked to alcohol than current police estimates suggest, across crime type and over time. It also estimated there to be twice as many alcohol-related crimes than the police measure in the most deprived areas. However, the proportion of alcohol-related crime was estimated to be just as high in lesser deprived areas as it was in the most deprived areas, unobserved by current police estimates. This study illustrates the potential for algorithms to improve estimates and progress policing practice and decision-making around alcohol-related crime to reduce its burden.’

Link: https://www.tandfonline.com/doi/full/10.1080/15614263.2025.2508275?ui=w7xas&af=T&ai=p5wor