Computerized coding of injury narrative data from the National Health Interview Survey

Accid Anal Prev. 2004 Mar;36(2):165-71. doi: 10.1016/s0001-4575(02)00146-x.

Abstract

Objective: To investigate the accuracy of a computerized method for classifying injury narratives into external-cause-of-injury and poisoning (E-code) categories.

Methods: This study used injury narratives from the 1997 and 1998 US National Health Interview Survey (NHIS), together with the corresponding E-codes assigned by experts. A Fuzzy Bayesian model was used to assign injury descriptions to 13 E-code categories. Sensitivity, specificity and positive predictive value were measured by comparing the computer-generated codes with the E-code categories assigned by experts.
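As an illustration of the evaluation described in the Methods, the following minimal Python sketch computes sensitivity, specificity and positive predictive value for one category by treating it as "positive" and all other categories as "negative". The function name `one_vs_rest_metrics`, the category labels and the six example codes are hypothetical and are not taken from the study.

```python
def one_vs_rest_metrics(expert, computer, category):
    """Compare computer codes with expert codes for a single category.

    expert, computer: equal-length lists of category labels, one per narrative.
    Returns None for any measure whose denominator is zero.
    """
    pairs = list(zip(expert, computer))
    tp = sum(1 for e, c in pairs if e == category and c == category)
    fn = sum(1 for e, c in pairs if e == category and c != category)
    fp = sum(1 for e, c in pairs if e != category and c == category)
    tn = len(pairs) - tp - fn - fp

    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # expert-coded cases the program found
        "specificity": ratio(tn, tn + fp),  # other cases the program left alone
        "ppv": ratio(tp, tp + fp),          # program assignments that were right
    }


# Made-up example data (not NHIS data): expert codes vs. computer codes.
expert = ["fall", "fall", "cut", "poisoning", "cut", "fall"]
computer = ["fall", "cut", "cut", "poisoning", "cut", "fall"]

fall_metrics = one_vs_rest_metrics(expert, computer, "fall")
cut_metrics = one_vs_rest_metrics(expert, computer, "cut")
print(fall_metrics)
print(cut_metrics)
```

Repeating the call for each of the categories gives a per-category table of the three measures.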

Results: The computer program correctly classified 4695 (82.7%) of the 5677 injury narratives when multiple words were included as keywords in the model. Using multiple-word predictors rather than single words alone improved both the sensitivity and the specificity of the computer-generated codes. The program is also capable of identifying and filtering out the cases that would benefit most from manual coding. For example, the program could be used to code a narrative if the maximum probability of a category, given the keywords in the narrative, was at least 0.9. If the maximum probability was lower than 0.9 (which would be the case for approximately 33% of the narratives), the case would be filtered out for manual review.
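The filtering rule described in the Results can be sketched in Python as follows. The abstract does not give the model's formula, so the scoring step here is an assumption: each category is scored by the largest conditional probability contributed by any single keyword in the narrative. The table `KEYWORD_PROBS`, its values, and the function names are hypothetical and purely illustrative.

```python
# Hypothetical P(category | keyword) values; NOT estimated from NHIS data.
KEYWORD_PROBS = {
    "fell": {"fall": 0.95, "struck": 0.03},
    "ladder": {"fall": 0.85},
    "knife": {"cut": 0.92},
    "hurt": {"fall": 0.40, "cut": 0.30},
}


def category_scores(keywords, keyword_probs):
    """Score each category by its strongest single keyword (assumed rule)."""
    scores = {}
    for kw in keywords:
        for category, p in keyword_probs.get(kw, {}).items():
            scores[category] = max(scores.get(category, 0.0), p)
    return scores


def route(keywords, keyword_probs, threshold=0.9):
    """Return (decision, best_category, max_probability).

    decision is "auto" when the maximum probability is at least the
    threshold, otherwise "manual" (the narrative goes to manual review).
    """
    scores = category_scores(keywords, keyword_probs)
    if not scores:  # no known keyword: nothing to base a code on
        return ("manual", None, 0.0)
    best = max(scores, key=scores.get)
    p = scores[best]
    return ("auto" if p >= threshold else "manual", best, p)


confident = route(["fell", "ladder"], KEYWORD_PROBS)
uncertain = route(["hurt"], KEYWORD_PROBS)
unknown = route(["xyz"], KEYWORD_PROBS)
print(confident, uncertain, unknown)
```

Raising or lowering `threshold` trades the share of narratives coded automatically against the share sent for manual review.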

Conclusions: A computer program based on Fuzzy Bayes logic is capable of accurately assigning cause-of-injury categories to injury narratives. The capacity to filter out certain cases for manual coding improves the utility of this process.

Publication types

  • Evaluation Study

MeSH terms

  • Forms and Records Control / methods*
  • Fuzzy Logic
  • Health Surveys
  • Humans
  • Medical Records Systems, Computerized*
  • Models, Theoretical
  • Predictive Value of Tests
  • Software
  • Trauma Severity Indices*
  • United States / epidemiology
  • Wounds and Injuries / classification*
  • Wounds and Injuries / epidemiology