Harnessing information from injury narratives in the ‘big data’ era: understanding and applying machine learning for injury surveillance

Kirsten Vallmuur; Helen R Marucci-Wellman; Jennifer A Taylor; Mark Lehto; Helen L Corns; Gordon S Smith

doi:10.1136/injuryprev-2015-041813

Original article

Kirsten Vallmuur¹,
Helen R Marucci-Wellman² (http://orcid.org/0000-0002-4143-6340),
Jennifer A Taylor³,
Mark Lehto⁴,
Helen L Corns⁵,
Gordon S Smith⁶

¹Queensland University of Technology, Centre for Accident Research and Road Safety—Queensland, Brisbane, Queensland, Australia
²Center for Injury Epidemiology, Liberty Mutual Research Institute for Safety, Hopkinton, Massachusetts, USA
³Department of Environmental & Occupational Health, School of Public Health, Drexel University, Philadelphia, Pennsylvania, USA
⁴School of Industrial Engineering, Purdue University, West Lafayette, Indiana, USA
⁵Center for Injury Epidemiology, Liberty Mutual Research Institute for Safety, Hopkinton, Massachusetts, USA
⁶National Center for Trauma and EMS, University of Maryland School of Medicine, Baltimore, Maryland, USA

Correspondence to Dr Kirsten Vallmuur, Queensland University of Technology, Centre for Accident Research and Road Safety—Queensland, 130 Victoria Park Road, Kelvin Grove 4059, Brisbane, QLD 4053, Australia; k.vallmuur@qut.edu.au

Abstract

Objective Vast numbers of injury narratives are collected daily and are available electronically in real time, giving them great potential for use in injury surveillance and evaluation. Machine learning algorithms have been developed to assist in identifying cases and classifying mechanisms leading to injury in a much timelier manner than is possible when relying on manual coding of narratives. This paper describes the background, growth, value, challenges and future directions of machine learning as applied to injury surveillance.

Methods This paper reviews key aspects of machine learning using injury narratives and provides a case study that demonstrates an application of an established human-machine learning approach.

Results The range of applications and utility of narrative text has increased greatly with advancements in computing techniques over time. Practical and feasible methods for the semiautomatic classification of injury narratives exist, and they are accurate, efficient and meaningful. The human-machine learning approach described in the case study achieved high sensitivity and positive predictive value (PPV) and reduced the need for human coding to less than a third of cases in one large occupational injury database.

Conclusions The last 20 years have seen a dramatic change in the potential for technological advancements in injury surveillance. Machine learning of ‘big injury narrative data’ opens up many possibilities for expanded sources of data which can provide more comprehensive, ongoing and timely surveillance to inform future injury prevention policy and practice.
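
The two measures reported in the Results, sensitivity and PPV, and the idea of reserving human coding for a subset of cases can be illustrated with a minimal Python sketch. The abstract does not say how cases are routed between machine and human coders, so the confidence threshold used here is an assumption made purely for illustration, as are the `Prediction` record, the function names, the category labels and the toy data; none of them comes from the case study.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    true_label: str       # gold-standard category from manual coding
    predicted_label: str  # category assigned by the classifier
    confidence: float     # classifier's probability for its prediction


def sensitivity_and_ppv(preds, category):
    """Return (sensitivity, PPV) for one category.

    sensitivity = TP / (TP + FN); PPV = TP / (TP + FP).
    """
    tp = sum(p.true_label == category and p.predicted_label == category for p in preds)
    fn = sum(p.true_label == category and p.predicted_label != category for p in preds)
    fp = sum(p.true_label != category and p.predicted_label == category for p in preds)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, ppv


def split_for_review(preds, threshold):
    """Assumed routing rule: keep confident predictions as machine-coded,
    send the rest to a human coder."""
    machine = [p for p in preds if p.confidence >= threshold]
    human = [p for p in preds if p.confidence < threshold]
    return machine, human


# Hypothetical toy data, not taken from the paper.
toy = [
    Prediction("fall", "fall", 0.95),
    Prediction("fall", "fall", 0.90),
    Prediction("fall", "struck_by", 0.55),
    Prediction("struck_by", "struck_by", 0.92),
    Prediction("struck_by", "fall", 0.60),
    Prediction("burn", "burn", 0.97),
]

machine_coded, human_review = split_for_review(toy, threshold=0.8)
human_fraction = len(human_review) / len(toy)
sens_all, ppv_all = sensitivity_and_ppv(toy, "fall")
sens_machine, ppv_machine = sensitivity_and_ppv(machine_coded, "fall")

print(f"share sent to human coders: {human_fraction:.2f}")
print(f"'fall' on all cases:        sensitivity={sens_all:.2f}, PPV={ppv_all:.2f}")
print(f"'fall' on machine-coded:    sensitivity={sens_machine:.2f}, PPV={ppv_machine:.2f}")
```

In this toy set the two low-confidence cases are also the two misclassified ones, so routing them to a human raises both measures on the machine-coded remainder. Real data will not separate so cleanly, and any threshold would need to be tuned against a manually coded sample.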
