Algorithm learns to predict the perp on ‘CSI’ by binge-watching episodes


How many episodes of forensics procedural drama CSI: Crime Scene Investigation do you think you would have to watch before you are able to guess the perpetrator correctly? A group of researchers at the U.K.’s University of Edinburgh decided to find out — only using an algorithm, rather than a person, to carry out the prediction.

To do this, they trained a neural network to solve the various “whodunnit” crimes on the show by getting it to binge-watch episodes of the series. The resulting model makes inferences about the identity of the perpetrator based on the information it encounters as each episode unfolds. It draws on a combination of images, audio, and the episode scripts, and is asked to weigh each clue’s relative importance for solving the crime. The neural network watched 39 episodes of CSI in total, covering 59 different cases.

“In the final part of the episode, at a point when the true perpetrator has presumably been revealed, our model correctly identifies mentions of the perpetrator 60 percent of the time,” Lea Frermann, a researcher on the project, told Digital Trends. “[By comparison], humans correctly identify perpetrator mentions 85 percent of the time. We show that access to information from multiple modalities, as well as the ability to keep a flexible record of what happened previously in the episode, is important [and] helps the model to identify the perpetrator.”

Frermann points out that, while humans are significantly more accurate in identifying perpetrators, they also tend to be more cautious in their guesses and wait until later on to start making them. “Overall, there is still a large gap between model and human performance, but our initial results are encouraging,” she said.

Don’t expect the algorithm to be used as a real-life crime scene investigator anytime soon, though. Quite apart from the fact that it is less accurate than humans at picking the right suspect, Frermann notes that real crimes aren’t quite as neat as the micro-worlds presented on TV.

“CSI episodes are 40 minutes long, the plot is completely self-enclosed, and the number of participants highly restricted,” she said. “Real scenarios are unequally more complex.”

Despite this, she pointed out that the work is an interesting testbed for future research on machine-learning models used for solving tasks which require complex reasoning, such as information retrieval or question answering. The team is also interested in seeing whether an A.I. trained on CSI can correctly guess the perpetrator on other procedural shows like Law & Order.

The work is described in a paper titled “Whodunnit? Crime Drama as a Case for Natural Language Understanding.”