Do inadequacies in ICD-10-AM activity coded data lead to underestimates of the population frequency of sports/leisure injuries?
C F Finch (1,2), S Boufous (2)

  1. School of Human Movement and Sport Sciences, University of Ballarat, Mt Helen, Australia
  2. NSW Injury Risk Management Research Centre, University of New South Wales, Sydney, Australia

Correspondence to: Professor Caroline F Finch, School of Human Movement and Sport Sciences, University of Ballarat, PO Box 663, Mt Helen, Australia; c.finch{at}


Aims: To assess the use of the International Classification of Diseases Australian Modification (ICD-10-AM) activity sub-codes for identifying sports/leisure injury hospitalizations and the impact of missing codes on population incidence estimates.

Methods: Injury-related hospital separations in New South Wales, Australia, for the period 2003–04 were examined with sports/leisure cases identified by the ICD-10-AM activity codes.

Results: Over 30% of all injury hospitalizations had either a missing or unspecified activity code. Among cases with valid activity codes, 13.9% of all injury hospitalizations were associated with sports/leisure. When adjusted for underreporting associated with undefined or missing activity codes, sports/leisure injuries accounted for up to 20% of injury hospitalizations.

Conclusion: Defining sports/leisure injury cases on the basis of activity codes is likely to lead to an underestimate of their contribution to the overall injury burden. Improvements need to be made to the completeness of activity coding of hospitalization data.

Internationally, routine health sector databases use the International Classification of Diseases, version 9 (ICD-9) or 10 (ICD-10) to code injury data; sports/leisure injury cases can be identified on the basis of some specific codes under ICD-10. The Australian modification of the ICD-10 (ICD-10-AM), which is used in Australia and New Zealand, is unique internationally in that it has over 200 activity codes to identify specific sports/leisure activities associated with injury.1 These codes have influenced the development of similar codes in the International Classification of External Causes of Injury.2

Previous studies have investigated the completeness of the principal injury,3 external cause4,5 and place of occurrence ICD-9/10 codes.4 Despite their wide use, at least in countries using the ICD-10-AM, the completeness and use of sports/leisure activity codes have not previously been assessed. While Langley et al4 found that 39% of overall injury cases had an unspecified activity code, they did not specifically look at the codes directly related to sports/leisure, nor did they include cases with missing values.

This paper assesses the impact of unspecified and missing activity codes on estimates of the frequency of sports/leisure injury hospitalizations.


Injury hospitalization episode data were extracted from the routinely collected New South Wales (NSW), Australia Inpatient Statistics Collection (ISC) which includes details of all inpatient separations (discharges, transfers and deaths) from all public, private and repatriation hospitals, private day procedures centers and public nursing homes in NSW.6 Cases extracted were all NSW-based injury hospital separations, of NSW residents of any age, with an ICD-10-AM principal diagnosis indicating an injury (S00–T35, T66–T71, T73, T75, T90–T95). Transfers between hospitals and statistical discharges/transfers within the same hospital were excluded to remove readmissions for the same injury. Data were extracted for two calendar years (2003–04) and combined for all analyses.
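The case-selection rule above can be sketched in a few lines of Python. This is only an illustration: the function name `is_injury_diagnosis` is hypothetical, and it assumes that principal diagnosis codes are available as strings such as "S52.5". The actual ISC extraction procedure is not described at this level of detail.

```python
import re

# Principal diagnosis ranges quoted in the text:
# S00–T35, T66–T71, T73, T75, T90–T95 (three-character categories, inclusive).
INJURY_RANGES = [
    ("S00", "T35"),
    ("T66", "T71"),
    ("T73", "T73"),
    ("T75", "T75"),
    ("T90", "T95"),
]


def is_injury_diagnosis(code):
    """Return True if an ICD-10-AM code falls in one of the injury ranges.

    Hypothetical helper: only the three-character category (letter plus
    two digits) is compared, so "S52.5" is treated as category "S52".
    """
    match = re.match(r"^([A-Z])(\d{2})", code.strip().upper())
    if not match:
        return False
    category = match.group(1) + match.group(2)
    # Fixed-width categories compare correctly as plain strings.
    return any(low <= category <= high for low, high in INJURY_RANGES)
```

Comparing only the three-character category keeps the check independent of how many decimal places a particular record carries.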

Sports/leisure injuries were identified if the injury hospitalization was assigned ⩾1 of over 200 different sub-codes provided by the ICD-10-AM “activity while injured” chapter in the range U50–U72.1 The ISC records up to three activity codes but, as fewer than 0.5% of cases have >1 such code, only the first such code was used. Unfortunately, a number of cases either are assigned an unspecified code (U73.9, indicating that the coder could not find any information about the activity at the time of injury in the medical/hospital notes) or have a missing code (indicating that no information was provided by the coder in the dataset). The U71 code indicates that the activity was sports/leisure related but that the specific activity was unknown.
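A minimal Python sketch of this grouping of activity codes follows. The function name `classify_case`, the category labels and the treatment of every remaining non-empty code as "other" (an activity unrelated to sports/leisure) are assumptions made for illustration, not details taken from the ISC documentation.

```python
def classify_case(activity_codes):
    """Classify a hospitalization by its first recorded activity code.

    `activity_codes` is a list of up to three code strings (possibly empty).
    Returns one of "sports_leisure", "unspecified", "missing" or "other".
    """
    # Only the first activity code is used, as described in the text.
    first = activity_codes[0] if activity_codes else None
    if first is None or not first.strip():
        return "missing"          # nothing was entered by the coder
    code = first.strip().upper()
    if code == "U73.9":
        return "unspecified"      # coder found no activity information
    if "U50" <= code[:3] <= "U72":
        return "sports_leisure"   # includes U71 (specific activity unknown)
    return "other"                # assumption: any other valid activity code
```

Keeping "missing" and "unspecified" as separate labels allows the two sources of incompleteness to be tabulated separately before they are pooled for the sensitivity analysis.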

A sensitivity analysis was undertaken to assess the range of possible impacts that the underreporting of activity, due to unspecified and missing values, has on estimates of the frequency of sports/leisure injury hospitalizations. The analysis used the following assumptions:

  1. No case with an unspecified/missing activity code is actually a sports/leisure injury.

  2. Some cases with an unspecified/missing activity code are actually sports/leisure injuries, and the percentage of sports/leisure cases in the unknown set is equivalent to that among the cases with a known activity code.

  3. A large proportion of the cases with unspecified/missing activity codes are actually sports/leisure injuries, and the percentage of sports/leisure cases in the unknown cases is (a) 25%, (b) 50% or (c) 75%.

These assumptions range from very conservative (assumption 1), through most likely (assumption 2), to a maximal bound (assumption 3c). Assumption 2 presumes that there is no differential misclassification rate among actual sports/leisure cases and true non-sports/leisure cases. The maximal bound of assumption 3c is relatively arbitrary, but has been chosen to be less than 100% because it is extremely unlikely that all cases with missing/unspecified activity codes would be associated with sports/leisure.
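The arithmetic behind these assumptions can be sketched as follows. The function name `adjusted_sports_share` and the scenario labels are hypothetical. The 13.9% share of cases with a sports/leisure code is taken from the text, but the 30.5% share of missing/unspecified codes is an assumed, illustrative value chosen only to be consistent with the reported "over 30%"; the exact figure belongs to table 1.

```python
def adjusted_sports_share(known_sports_share, unknown_share,
                          unknown_sports_fraction=None):
    """Estimate the share of all injury hospitalizations that are sports/leisure.

    known_sports_share: share of ALL cases with a valid sports/leisure code.
    unknown_share: share of ALL cases with a missing/unspecified code.
    unknown_sports_fraction: assumed fraction of the unknown cases that are
        truly sports/leisure. None applies assumption 2, i.e. the same
        fraction as among cases with a known activity code.
    """
    if not 0.0 <= unknown_share < 1.0:
        raise ValueError("unknown_share must be in [0, 1)")
    if unknown_sports_fraction is None:
        # Assumption 2: sports/leisure share among cases with known activity.
        unknown_sports_fraction = known_sports_share / (1.0 - unknown_share)
    return known_sports_share + unknown_share * unknown_sports_fraction


KNOWN_SPORTS_SHARE = 0.139   # from the text
UNKNOWN_SHARE = 0.305        # ASSUMED illustrative value ("over 30%")

# Assumption label -> fraction of unknown cases that are sports/leisure.
SCENARIOS = {"1": 0.0, "2": None, "3a": 0.25, "3b": 0.50, "3c": 0.75}

estimates = {
    label: adjusted_sports_share(KNOWN_SPORTS_SHARE, UNKNOWN_SHARE, fraction)
    for label, fraction in SCENARIOS.items()
}

for label, value in estimates.items():
    print(f"assumption {label}: {value:.1%} of injury hospitalizations")
```

With these illustrative inputs, assumption 1 leaves the estimate at 13.9% and assumption 2 raises it to about 20%; a different missing/unspecified share would shift every scenario except assumption 1.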

All results are presented as the proportion of all NSW injury hospitalizations during 2003–04 or the proportion of all cases identified as a sports/leisure hospitalization, as appropriate. External cause codes were used to categorize various mechanisms of injury (see table 1), but were not used to select cases.

Table 1 Distribution of activity codes across common categories of external causes of injury, NSW hospitalization episodes, 2003–04

The use of the hospitalizations data was approved by the University of New South Wales Ethics Committee.


Table 1 shows the distribution of activity codes, including missing and unspecified values, in the total set of 182 951 injury hospitalizations. More than half had activity codes unrelated to sports/leisure; 13.9% were specifically identified as being associated with sports/leisure. There were relatively low levels of missing activity codes. Almost one third of cases had either a missing or an unspecified activity code. The frequency of having a valid/specified sports/leisure activity code varied across external cause categories. Unspecified activity codes were most common in exposure to unknown factor cases. The high rate of unspecified activity codes in the falls category has been previously noted4 and is considered to be most relevant to the problem of falls in older people.

Table 2 shows the results of the sensitivity analysis. Overall, under assumption 1, sports/leisure injury hospitalizations would comprise 13.9% of all injury hospitalizations. Under assumptions 2 and 3, an additional 6.0–22.9% of all injury hospitalizations might be sports/leisure related.

Table 2 Impact of different assumptions about the proportion of missing/unspecified activity codes that may be truly sports/leisure cases on the overall sports injury frequency estimates, NSW hospitalization episodes, 2003–04


This study is the first to specifically examine the use of ICD-10-AM activity codes in identifying sports/leisure injuries. Although this study only considered ICD-10-AM codes, the results are relevant to other regions considering expanding the range of activity codes in their ICD schema in the future. This study also has broader implications for the coding of sports injury cases in general, in that it clearly shows that if a large proportion of data is missing or unspecified, then estimates of sports/leisure injury incidence are likely to be underestimates.

As valid activity codes are the only way to identify sports/leisure cases in ICD-10-AM coded data, their usage levels are likely to have a major impact on estimates of the frequency of sports/leisure hospitalizations. Almost one third of all NSW-based injury hospitalizations had either a missing or an unspecified activity code, which is slightly less than the age-standardized rate of 39% reported by Langley et al4 for New Zealand data. The sensitivity analysis shows how the use of incomplete data can have a significant impact on conclusions about the frequency of sports/leisure injuries. After adjusting for potential underreporting associated with missing/unspecified activity codes, it is possible that sports/leisure injuries could account for up to one in five injury hospitalizations and >40% of all overexertion and strenuous or repetitive movements, drowning and struck by/against hospitalizations.

Key points

  • The ICD-10-AM provides useful codes for identifying specific sport/leisure activities leading to injury.

  • The quality of the ICD-10-AM activity data is limited by the completeness of activity codes.

  • Given current levels of incompleteness of activity codes used to identify sport/leisure, it is likely that estimates of sports injury frequency are underestimates.

Surprisingly, few studies have examined the use of the ICD-10-AM coding scheme in relation to sports/leisure cases. Rae et al7 assessed the ICD codes for classifying sports medicine diagnoses and concluded that some improvement needed to be made. In another Australian study, activity-code identified sports/leisure struck by/struck against injury hospitalizations were found to be lacking for the targeting of specific injury prevention strategies.8 If the one in five case frequency is confirmed, sports/leisure injuries will be clearly identified as an area for priority attention, and so the use of quality data to identify such cases will be even more important.

Identified gaps in the sports/leisure injury codes indicate that improvements in the coding of routinely collected hospitalization data are warranted. Identification of sports/leisure cases depends on the recording of accurate and specific information about the nature of the activity undertaken at the time of injury. Reasons for the relatively high levels of missing/unspecified activity data need to be determined in order to identify what solutions could be implemented to address these information gaps. There are at least three plausible reasons, each of which would need a different strategy to overcome: the information is not recorded by the treating health professional; the information is present in the medical records but the medical coder fails to code it; or inadequacies in the coding scheme itself do not facilitate the correct identification of sports/leisure activities.4,5 Given the strong association of sports injury causation with the external cause categories of overexertion and strenuous or repetitive movements, drowning and struck by/against, efforts should perhaps first be directed at these categories. Finally, continued development of the ICD-10-AM activity codes should include consultation with experts in the use and interpretation of sports/leisure injury data.

In conclusion, future studies using ICD-10-AM coded data should continue to use the activity codes to select sports/leisure cases. The use of activity codes is still quite recent and until there are improvements in data coding, estimates of sports/leisure injury based on such data need to be recognized as being a likely underestimate of the true population frequency of these injuries. Further sensitivity analyses and validation studies of evolving activity codes should also be undertaken, especially as some external causes, such as drowning, are more likely to be sports/leisure-related.


ICD-10-AM activity codes are useful for identifying sports/leisure injuries. However, incompleteness in the codes, associated with missing/unspecified values, leads to sports/leisure frequency estimates that are likely to underestimate the true incidence. Further research should examine why there are so many missing/unspecified cases and identify approaches for rectifying this.



  • Funding: This project was funded by the NSW Sporting Injuries Committee (NSWSIC) under its Research and Injury Prevention Scheme. CFF was supported by an NHMRC Principal Research Fellowship. SB was supported by the NSWSIC Grant and the NSW Injury Risk Management Research Centre (IRMRC) Core Funding, which is provided by the NSW Department of Health, the NSW Roads and Traffic Authority and the NSW Motor Accidents Authority. The hospitalizations data were accessed via the NSW Department of Health’s Health Outcomes Information Statistical Toolkit (HOIST), maintained by the Centre for Epidemiology and Research.

  • Competing interests: None.

  • Ethics approval: The use of the hospitalizations data was approved by the University of New South Wales Ethics Committee.