Background The international classification of diseases version 10 (ICD-10) uses alphanumeric expanded codes and external cause of injury codes (E-codes).
Objective To examine the reliability and validity of emergency department (ED) coders in applying E-codes in ICD-9 and -10.
Methods Bicycle and pedestrian injuries were identified from the ED information system from one period before and two periods after transition from ICD-9 to -10 coding. Overall, 180 randomly selected bicycle and pedestrian injury charts were reviewed as the reference standard (RS). Original E-codes assigned by ED coders (ICD-9 in 2001 and ICD-10 in 2004 and 2007) were compared with charts (validity) and with the ICD-9 and -10 codes assigned to each case by an independent (IND) coder from the RS chart review (reliability). Sensitivity, specificity, and simple and chance-corrected agreement (κ statistics) were calculated.
Results Sensitivity of E-coding bicycle injuries by the IND coder in comparison with the RS ranged from 95.1% (95% CI 86.3 to 99.0) to 100% (95% CI 94.0 to 100.0) for both ICD-9 and -10. Sensitivity of ED coders in E-coding bicycle injuries ranged from 90.2% (95% CI 79.8 to 96.3) to 96.7% (95% CI 88.5 to 99.6). The sensitivity estimates for the IND coder ranged from 25.0% (95% CI 14.7 to 37.9) to 45.0% (95% CI 32.1 to 58.4) for pedestrian injuries for both ICD-9 and -10.
Conclusion Bicycle injuries are coded in a reliable and valid manner; however, pedestrian injuries are often miscoded as falls. These results have important implications for injury surveillance research.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.
Edmonton is one of the two largest cities in Alberta, with a population of 1 034 945 (city=730 372; other metropolitan areas=304 573) (reported by Statistics Canada, 2006). Four periods of a biennial Canadian community health survey (2001–07) demonstrated that the prevalence of recreational bicycle use in Alberta among youth (12–17 years of age) was in the range 58–65% and the mean number of times adolescents bicycled in the past 3 months was in the range 16–30. The prevalence of recreational bicycling for adults (18+) was 24–28% and the mean number of times adults bicycled was 17–19 in the past 3 months.1 In the same study the prevalence of commuting bicycle use among youth in Alberta was 31–35% and among adults was 6–7%.1
A nationwide Canadian study demonstrated that 2% of all hospitalisations were due to bicycle-related injuries during a 10-year period (1994–2004).2 A 5-year study (1991–95) in British Columbia demonstrated that 4% of visits of all children (1–19 years of age) to the emergency department (ED) resulted from bicycle-related injuries.3
In order to conduct surveillance studies, we require valid coding of patient's diseases and circumstances of the event leading to admission to EDs or hospitals. Inpatient coding has been established in hospitals for use in disease and mortality surveillance, epidemiological studies, billing and financial planning, and policy analyses.4 5
The International Classification of Diseases versions 9 (ICD-9) and 10 (ICD-10) have been used to code diseases and other health problems recorded on many types of health and vital records, including death certificates, hospitalisation and ED data. The ICD-10 classification is the latest in a series which has its origins in the 1850s.6 ICD-10 was endorsed by the 43rd WHO assembly in May 1990 and began implementation in WHO member states in 1994.7 The differences between ICD-9 and ICD-10 are substantial, not only in disease classification, but also in coding rules. As the ICD-9 system has been used by many hospitals and clinics for years and is still used in many US centres, this transition introduced some challenges for long-term and comparative studies.8
ICD-9 diagnosis codes consist of the following: 3-digit numeric characters (001–999), with two decimals representing illnesses and conditions; alphanumeric E-codes (E000–E999), describing external causes of injuries, poisonings, and adverse effects; and V codes (V01–V82), describing factors influencing health status and contact with health services. ICD-10 uses 3-digit alphanumeric codes (A00–Z99) with two decimals.7 9 There are many other changes in ICD-10 that have been described in detail elsewhere.10 Canada implemented ICD-10 for the classification of cause of death, beginning in 2000.8 In an agreement with WHO, Canada adopted an enhanced version of ICD-10-CA by keeping the main structure of ICD-10, yet including more subgroup definitions using a third decimal and introducing the first Canadian classification of intervention.11
E-codes have been widely used in surveillance systems for mortality in traffic-related injuries12 and morbidity of bicyclists, pedestrians, and those with sport and recreational related injuries13–15; however, coding issues might have led to errors in the interpretation of research findings.16 Appropriate coding by ICD-9 and ICD-10 has always been an important issue for health surveillance and health services research. Many studies have evaluated the validity and/or reliability of ICD-9 coding for external causes of injuries (from now on called E-coding for both ICD-9 and ICD-10),5 17–21 or focused on principal diagnosis.22 Other studies evaluated the validity/reliability of ICD-10 for E-coding23 24 or only principal diagnosis in ICD-10.25
For those countries that implemented ICD-10, the transition from ICD-9 to ICD-10 may have had an impact on the trends of causes of injuries. Bridge coding studies, to date, have evaluated the impact of a coding change by focusing on principal causes of mortality and not external cause of injury.8 10 26–30 One study examined the usefulness of ICD-10-CM in capturing public health diseases (reportable diseases, leading cause of death and morbidity/mortality related to terrorism) and reported agreement levels of coders when coding such diseases in ICD-9-CM and ICD-10-CM. They found that ICD-10-CM was more specific and fully captured more diseases than ICD-9-CM; however, coders were more consistent in coding ICD-9-CM than ICD-10-CM.31
A long-term surveillance study in Alberta, Canada has shown that transition from ICD-9-CM to ICD-10-CA appeared to cause a decrease in the number of motor-vehicle-related deaths/hospital admissions, with a smaller impact on motor-vehicle ED visits.32 Similar studies have demonstrated that transition from ICD-9 to ICD-10 can affect ranking of causes of death,29 possibly resulting in a decrease in diseases such as pneumonia or an increase in cerebrovascular diseases.30
In Alberta, ICD-10-CA codes were implemented on 1 January 2000 for deaths and 1 April 2002 for morbidity data (hospitalisation and ED records).32 Concurrently, the Alberta government implemented a law mandating bicyclists <18 years of age to wear helmets, effective 1 May 2002.33 Given the timing of the bicycle helmet legislation and the coding change, it was essential to investigate whether the coding transition may have influenced the overall incidence of cycling-related injuries independent of the legislation. Therefore, the aim of this study was to evaluate the reliability and validity of ED coders in applying ICD-9-CM and ICD-10-CA external cause of injury codes for bicyclists.
In our study we used pedestrian injuries to establish how coding changes affected injury trends in another vulnerable road user group not affected by bicycle helmet legislation.
We identified all cycling- and pedestrian-related injuries from the Emergency Department Information System (EDIS) software.34 This system captures data on all patients presenting to the ED, including patient demographics, illness and circumstances of injury, times of arrival and care, injury descriptions, symptoms, consultations, and triage/vital signs assessment. Using the patient's complete paper chart, medical record nosologists (henceforth referred to as ED coders) assign ICD-9-CM codes (before 1 April 2002) or ICD-10-CA codes (after 1 April 2002) after reviewing physician-assigned diagnoses at the time of ED discharge (home or hospital). EDIS review and case selection were performed for the four busiest cycling months of the year (May to August) in three separate years (2001=pre-transition to ICD-10; 2004 and 2007=post-transition to ICD-10).
On the basis of ICD-9 and ICD-10 E-code descriptions,7 bicycle and pedestrian injuries were defined (see appendix) and used by investigators to identify all cases from the EDIS database. The keywords from these definitions were used for searching cases of bicycle and pedestrian injuries admitted to the EDs. Keywords for bicycle injuries included: bike, biking, cycle, bicycle, bicycling, bike injuries, cycle injuries, bicycle injuries, and tricycle. Keywords for pedestrian injuries included: pedestrian, walking, jogging, car-ped, side walk, curb, cross walk, hit by (bicycle, motorcycle, car, or bus), ran over, and parking lot. A variety of misspellings of bicycle (eg, bik, bicycl) and pedestrian terms (eg, wlk, jogin) were also used to ensure that no cases were missed because of typing mistakes.
After retrieving all relevant cases, two separate pools of bicycle and pedestrian injuries were prepared after adjudication with senior nursing staff (making sure they were valid bicycle and pedestrian injuries); research assistants randomly selected and reviewed 180 bicyclist and 180 pedestrian presentations (360 in total) from three hospital EDs in Edmonton (University of Alberta Hospital, Stollery Children's Hospital, and North East Community Health Center). Our sample included 60 injured cyclists and 60 injured pedestrians in each year.
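The keyword screen described above can be sketched in a few lines. This is an illustration only and not the EDIS query that was actually run: the function name flag_candidates, the use of plain case-insensitive substring matching, and the abridged term lists are assumptions; the terms themselves come from the keywords listed above.

```python
# Illustrative keyword screen (hypothetical helper, not the actual EDIS query).
# Term lists are abridged from the keywords given in the text.
BICYCLE_TERMS = ["bike", "biking", "cycle", "bicycle", "bicycling",
                 "tricycle", "bik", "bicycl"]
PEDESTRIAN_TERMS = ["pedestrian", "walking", "jogging", "car-ped",
                    "side walk", "curb", "cross walk", "hit by",
                    "ran over", "parking lot", "wlk", "jogin"]


def flag_candidates(free_text):
    """Return the set of injury groups whose keywords occur in the text."""
    text = free_text.lower()
    flags = set()
    if any(term in text for term in BICYCLE_TERMS):
        flags.add("bicycle")
    if any(term in text for term in PEDESTRIAN_TERMS):
        flags.add("pedestrian")
    return flags


print(flag_candidates("Fell off bike on trail"))
print(flag_candidates("Ped hit by car in cross walk"))
```

Substring matching deliberately over-captures (for example, "cycle" also matches "motorcycle"), so a screen of this kind can only nominate candidate records; each one still needs manual confirmation.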
A specific data extraction form was designed to capture necessary information from patients' paper charts. Using the extracted information, an independent expert coder (IND coder) was employed to assign both ICD-9-CM and ICD-10-CA codes. The IND coder was not aware of any previous coding associated with a bicyclist or pedestrian injury, nor of the study hypothesis. After the IND coder had provided both ICD-9-CM and ICD-10-CA codes for each case, we merged these data with administrative data, originally produced by ED coders, from the ambulatory care classification system (ACCS), a central electronic database for diagnosis, procedure, healthcare utilisation, and follow-up of emergency department patients in Alberta, Canada. Therefore, each case had an ICD-9-CM code (before 1 April 2002) or an ICD-10-CA code (after 1 April 2002) assigned by ED coders as usual practice, forming part of the electronic administrative health record, together with ICD-9-CM and ICD-10-CA codes assigned by the IND coder.
Data were analysed using Stata IC V.11.35 Examining validity, we calculated sensitivity, with 95% CIs, as the proportion of all cycling injuries identified through our chart review (reference standard) that were similarly coded as bicycle injuries by the ED and IND coders. Specificity was calculated as the proportion of all non-bicycle injuries in our charts that were not coded as bicycle injuries by ED or IND coders. Similar sensitivity and specificity estimates and 95% CIs were produced for ED (ACCS data) and IND coders for pedestrians.
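As a worked illustration of these definitions, the sketch below computes a proportion such as sensitivity together with an exact (Clopper-Pearson) binomial 95% CI. The choice of interval method is an assumption: the text does not say which method was requested in Stata, although an interval of 94.0 to 100 for 60 of 60 cases is what an exact interval gives. The function names are hypothetical, and the counts in the usage lines are chosen only to reproduce a 96.7% estimate with n=60.

```python
from math import comb


def _binom_tail(n, p, start, stop):
    """P(start <= X <= stop) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(start, stop + 1))


def proportion_ci(successes, n, alpha=0.05):
    """Point estimate and exact (Clopper-Pearson) CI for a proportion."""
    estimate = successes / n
    # Lower limit: p at which P(X >= successes) equals alpha/2.
    if successes == 0:
        lower = 0.0
    else:
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if _binom_tail(n, mid, successes, n) < alpha / 2:
                lo = mid
            else:
                hi = mid
        lower = (lo + hi) / 2
    # Upper limit: p at which P(X <= successes) equals alpha/2.
    if successes == n:
        upper = 1.0
    else:
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if _binom_tail(n, mid, 0, successes) > alpha / 2:
                lo = mid
            else:
                hi = mid
        upper = (lo + hi) / 2
    return estimate, lower, upper


# Sensitivity = true positives / all reference-standard positives.
sens, low, high = proportion_ci(58, 60)
print(f"sensitivity {sens:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Specificity can be obtained from the same function by passing the number of true negatives and the number of reference-standard negatives.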
Simple percentage agreement between the two coders was calculated. Since simple percentage agreement does not account for agreement by chance, we used Cohen's κ statistic, a measure of chance-corrected proportional agreement.36 κ agreement was defined a priori as almost perfect (0.81–1.0), substantial (0.61–0.8), moderate (0.41–0.60), fair (0.21–0.40), slight (0.0–0.20), or poor (<0.0).37
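The two agreement measures can be illustrated with a short sketch for a 2×2 table. The cell counts used here are hypothetical and are not taken from the study tables; the interpretation bands are those quoted above, and the function names are assumptions made for illustration.

```python
def cohen_kappa_2x2(a, b, c, d):
    """Simple agreement and Cohen's kappa for a 2x2 table.

    a = both coders positive, d = both negative,
    b and c = the two kinds of disagreement.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


def interpret_kappa(kappa):
    """Map kappa onto the agreement bands quoted in the text."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical counts: 120 charts, 3 disagreements.
agreement, kappa = cohen_kappa_2x2(58, 2, 1, 59)
print(f"agreement {agreement:.1%}, kappa {kappa:.2f}: {interpret_kappa(kappa)}")
```

With these hypothetical counts, observed agreement is 97.5% and chance agreement is 50%, giving κ=0.95, which shows how far simple agreement can overstate agreement beyond chance.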
We constructed separate 2×2 tables for ICD-9-CM and ICD-10-CA by year. We calculated percentage agreement and κ for coding between the ED coders and the IND coder. For sensitivity and agreement analysis, pedestrians were used as negative cases for bicyclists and vice versa.
Our analyses showed that, judged against the reference standard medical chart reviews, many of the pedestrian injuries were not E-coded accurately; we therefore performed a post-hoc investigation of the misclassified E-codes among pedestrian injuries.
We based our sample size on sensitivity, or the proportion of all EDIS identified cycling injuries in Edmonton transferred to ACCS. Our focus was on estimation (CIs) rather than statistical testing. For a CI width of 15% (±7.5%), assuming a worst case of 50% sensitivity, we would require 171 subjects. Therefore, with 180 subjects, the 95% CI around the estimate of sensitivity was expected to be less than 10%.
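The figure of 171 subjects is consistent with the usual normal-approximation formula for estimating a proportion, n = z²p(1−p)/d², with p=0.5 and half-width d=0.075. The sketch below reproduces that arithmetic; that this was the formula actually used is an assumption, and the function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_for_proportion(p, half_width, confidence=0.95):
    """Smallest n giving the requested CI half-width (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p * (1 - p) / half_width**2)


def expected_half_width(p, n, confidence=0.95):
    """CI half-width expected for a proportion p estimated from n subjects."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


print(sample_size_for_proportion(0.5, 0.075))   # worst case, +/-7.5%
print(round(expected_half_width(0.5, 180), 4))  # half-width with 180 subjects
```

Under the same worst-case assumption, 180 subjects give an expected half-width of about 7.3 percentage points.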
We obtained ethical approval from the University of Alberta Health Research Ethics Board. Patients were not contacted during this study.
Validity of E-coding by ED and IND coder
Sensitivity of E-coding bicycle injuries by ED coders in comparison to the reference standard (RS) ranged from 90.2% (95% CI 79.8 to 96.3) in 2007 to 96.7% (95% CI 88.5 to 99.6) in both 2001 and 2004 (table 1, bicycle injuries). Sensitivity of E-coding bicycle injuries by the IND coder in comparison to the RS ranged from 95.1% (95% CI 86.3 to 99.0) in 2007 to 100% (95% CI 94.0 to 100) in 2001 (table 1, bicycle injuries).
Sensitivity of E-coding pedestrian injuries by ED coders in comparison to the RS ranged from 25.0% (95% CI 14.7 to 37.9) in 2004 to 38.3% (95% CI 26.1 to 51.8) in 2001. The sensitivity estimates for the IND coder in coding pedestrian injuries compared with the RS ranged from 30.0% (95% CI 18.8 to 43.2) in 2004 to 43.3% (95% CI 30.6 to 56.8) in 2001 (table 1, pedestrian injuries).
Specificities for bicycle injuries were 98.3–100% and for pedestrian injuries were all 100% (not presented in table 1).
Validity of E-codes in ICD-10 and ICD-9
The results of the validity analysis showed that sensitivity of E-coding bicycle injuries by the IND coder using ICD-10 for the pre-transition year of 2001 was 98.3% (95% CI 91.1 to 100); sensitivity for ICD-9 for post-transition was 98.3% (95% CI 91.1 to 100) in 2004 and 96.7% (95% CI 88.7 to 99.6) in 2007 (table 1, shaded rows bicycle injuries). Sensitivity of E-coding for pedestrian injuries by the IND coder using ICD-10 for the pre-transition year (2001) was 45% (95% CI 32.1 to 58.4); retesting results of ICD-9 for post-transition were 25.0% (95% CI 14.7 to 37.9) in 2004 and 37.3% (95% CI 25.0 to 59.0) in 2007 (table 1, shaded rows pedestrian injuries).
Reliability of E-coding between ED and IND coders
Examining chance-corrected agreement (κ) and applying Landis and Koch's37 ranking of κ, agreement between ED coders and the IND coder for bicycle injuries was almost perfect, ranging between 0.88 and 0.97 (κpooled=0.94; 95% CI 0.91 to 0.98). Similarly, almost perfect agreement was seen in the comparison of ED coders to the IND coder for pedestrian injuries ranging between 0.90 and 0.97 (κpooled=0.92; 95% CI 0.87 to 0.98) (table 2).
Post-hoc results for pedestrian E-coding
Approximately 3.4% of pedestrian injuries that we identified and confirmed through EDIS and chart review were not assigned an external cause of injury by the ED coder. ED coders also misclassified between 59% (ICD-9-CM) and 69% (ICD-10-CA) of pedestrian injuries. Of the 59% misclassified pedestrian injuries in ICD-9-CM, 70% were miscoded as falls, 18% as unspecified, 3% as unspecified vehicle collision, 3% as overexertion, and 6% as other. Of the 69% misclassified pedestrian injuries in ICD-10-CA, 74% were miscoded as falls, 24% as overexertion, 1% as dog bite, and 1% as striking a stationary object; 5% had no E-code (not shown in figure 1). Missing or misclassified bicycling injuries did not exceed 4% (figure 1).
For the IND coder (who independently coded all bike and pedestrian injuries by ICD-9-CM and ICD-10-CA) there were no missing E-codes for pedestrian injuries; however, approximately 63% of the records were misclassified. Of these, 58% were misclassified as falls for both ICD-9-CM and ICD-10-CA codes. The IND coder had 1.1% missing codes (in ICD-10 CA) for bicycle injuries and 2.2% and 1.7% misclassification for ICD-9-CM and ICD-10-CA, respectively (figure 2).
This study evaluated the reliability of ED coders in E-coding bicycle injuries in three Canadian EDs. We also studied the validity of ICD-9-CM and ICD-10-CA in E-coding bicycle injuries, using pedestrian E-coding as a comparison group. The results showed that agreement for E-coding bicycle injuries was consistently high before and after transition from ICD-9-CM to ICD-10-CA between IND/ED coders and medical charts. Reviewing documented information, the IND coder was able to assign relevant E-codes for bicycle injuries. The limited differences between ED coders and the IND coder have demonstrated the high quality of bicycle injury E-coding, supporting conclusions drawn from bicycle studies using hospital and ED administrative databases.
The difference between ICD-9 and ICD-10 for external causes of injuries among pedestrians and bicyclists is mostly related to the method of defining each code. In ICD-9, the letter ‘E’ at the beginning of a code indicates an external cause of injury; the three digits that follow specify both the external cause and the circumstances of the injury (eg, E801=railway accident involving collision with other object), and the decimal specifies whether the injured person is a pedestrian or bicyclist. In ICD-10, the letter ‘V’ and the first digit identify the injured person (V0 for pedestrian and V1 for bicyclist), and the second digit specifies the external cause of injury (eg, 5=collision with railway train or railway vehicle). The decimal in ICD-10 specifies the circumstances of the injury event (eg, traffic or non-traffic related).7 9 Our validity analysis by the IND coder showed that ICD-10 is as valid a classification as ICD-9 for E-coding bicycle and pedestrian injuries.
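The ICD-10 prefix rule described above lends itself to a small illustration. The sketch below is a simplification for illustration only: it looks at nothing beyond the first two characters of a code, does not decode the second digit or the decimal, and the function name road_user_from_icd10 is hypothetical. No equivalent is sketched for ICD-9 because, as noted above, the injured person is carried in the decimal and a lookup table would be needed.

```python
def road_user_from_icd10(code):
    """Classify an ICD-10 external cause code by its first two characters.

    Simplified rule taken from the text: V0 = pedestrian, V1 = bicyclist.
    Any other prefix is returned as "other".
    """
    prefix = code.strip().upper()[:2]
    if prefix == "V0":
        return "pedestrian"
    if prefix == "V1":
        return "bicyclist"
    return "other"


for example in ["V05.1", "V13.4", "W01.0"]:
    print(example, road_user_from_icd10(example))
```

A record coded in the falls block (W codes in ICD-10) is returned as "other" by this rule, so a pedestrian injury miscoded as a fall would not be counted as a pedestrian injury in administrative data.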
Five studies in the USA5 17 19–21 and one in New Zealand18 reported reliability of ICD-9-CM E-coding for injured patients. Exact code agreements between an IND coder and hospital nosologists were reported to be 55.6–82%. One study in Australia23 and one in New Zealand24 reported 67.6% and 71%, respectively, correct E-coding in ICD-10. A systematic review (including five studies) also demonstrated that the range of accurate E-coding in hospital records was between 65% (exact code agreement) and 85% (agreement for broader groups of codes).38
Our study found different results for E-coding of a comparison population of pedestrian injuries. Despite the high level of agreement in pedestrian injury E-coding between the IND and ED coders (table 2), both assigned incorrect E-codes in many cases (over 50%) (table 1, pedestrian injuries). The differences between all coders and the reference standard (medical charts) demonstrate the poor quality of pedestrian injury coding, and call into question conclusions drawn from any hospital and ED administrative databases examining pedestrian injuries.
Studying the accuracy of E-coding for work-related and non-work-related injuries in Massachusetts ED data, Hunt et al demonstrated that machinery injuries were misclassified in many cases (65%) to other external causes of injury such as cut/pierce, struck by/against, falls, overexertion, and missing.17 The same study reported misclassification of injuries coded as not specified (54%), not elsewhere classified (31%), other specified (19%), natural/environmental (17%), fall (11%), fire/burn (11%), poisoning (9%), struck by/against (9%), cut/pierce (3%), transportation (1%), and overexertion (1%). Overall, all causes were misclassified 14% of the time to other groups of injuries.17
In another study, the percentage error in the 5th digit location for E-coding was between 2% (for homicide/assault) and 15% (for medical injury).5 Another source of inconsistency between original and independent auditor codes appeared to be missing E-codes, which made up between 14%21 and 20%20 of injury cases. We investigated missing E-codes for pedestrian injuries and found that ED coders omitted the E-code in only 3.3% (pre-transition to ICD-10) and 3.4% (post-transition to ICD-10) of cases; however, between 59% and 69% of pedestrian injuries were misclassified, mostly as falls. The IND coder had also miscoded a substantial proportion of pedestrian injuries as falls.
As suggested by other studies,5 17–19 21 24 we also realise that E-coding for pedestrian injuries needs to be emphasised in the nosologist training programmes in order to reduce the number of misclassified cases. As fall was the main source of misclassification, we would suggest that a detailed search for location of the falls must be considered an important piece of information for pedestrian E-coding. It is quite likely that other mechanisms of injury would be subject to the same level of misclassification as our pedestrian injuries (e.g., struck by or against, falls in non-pedestrian settings) and further work in this area is required. Sensitivity of E-coding for bicycle injuries was high for both ED and IND coders, and due to misclassification, was lower for E-coding of pedestrian injuries. ED coders and our IND coder demonstrated a high degree of agreement regardless of E-coding for bicycle or pedestrian injuries (table 2).
Limitations and strengths
Our study was not without limitations. Since we only conducted our study in three hospital EDs in Edmonton, our results may not be generalisable to other locations. In our study, the reference standard was developed through chart review by research staff and confirmed by clinical nurses. The decisions were made long after the discharge occurred and could not be validated further; however, we believe that we selected an unbiased group of both cyclists and pedestrians for coder review. Although it seems very unlikely, we may have missed some patients in the EDIS if the ED research assistants failed to use appropriate keywords to identify bicycle or pedestrian injuries; however, missing cases would not affect the validity and reliability of our study. Since in our analysis we have only used pedestrians as the comparison group for bicyclists, it may be argued that we would overestimate levels of agreement because of the limited range of other non-cyclist choices. However, we suspect our choice of pedestrians as a comparator group will have led to a conservative estimate in that it would be hard to distinguish this group from cyclists. If we had chosen less similar mechanisms (eg, farm injuries or motor vehicle injuries) as our comparison group, it is quite likely that the agreement would have been higher.
We selected cases from one year coded by ICD-9-CM (2001) and two years coded by ICD-10-CA (2004 and 2007). Unlike other coding studies, we focused only on one external cause of injury (bicycle) and we examined a similarly vulnerable road user group (pedestrian). This helps to establish that administrative data can be used reliably to study bike-related injuries. Concurrently, validity of ICD-9-CM and ICD-10-CA was evaluated for E-coding of these two types of traffic-related injuries. Finding more than 50% misclassified E-codes for pedestrian injuries initiated a post-hoc investigation, showing that pedestrian injuries were often miscoded as falls. In our analysis we emphasised κ rather than simple percentage agreement to test the reliability of ED coders. This is also the first study in Canada evaluating the reliability of coders and the validity of the ICD-9-CM and ICD-10-CA systems for bicycle and pedestrian injuries. We selected our cases from the summer months, which included more cases of bicycle or pedestrian injuries, and used a random selection method from a pool of bicycle and pedestrian injuries.
This study shows that ED coders are reliable in E-coding bicycle injuries using ICD-9-CM and ICD-10-CA systems. ICD-10-CA and ICD-9-CM are valid classification tools in capturing bicycle injuries presenting to the ED. Pedestrian injuries, however, may be miscoded as falls, and this needs to be considered when examining ICD coded administrative data on these vulnerable road users. These results have important implications for injury surveillance research.
What is already known on the subject
In both ICD-9 and ICD-10, external cause of injury codes (E-codes) have been widely used for morbidity and mortality surveillance.
Current studies suggest that reliability of E-coding injured patients for ICD-9 ranged from 56 to 82% and for ICD-10 ranged from 68 to 71%.
Misclassification does occur in E-coding of specific groups of injuries, such as work-related injuries in approximately 14% of cases.
What this study adds
This study demonstrated that sensitivity of both emergency department (ED) and independent coders was >90% for bicycle injuries but lower than 50% for pedestrian injuries.
Most misclassified pedestrian injuries were E-coded as falls.
Although bicycle injuries were E-coded with high accuracy, unexpectedly high misclassification of pedestrian injuries by both emergency department and independent coders highlights the importance of examining the validity and reliability of E-coding before drawing conclusions in surveillance studies.
The authors greatly appreciate the help of Anne Wiersma for her assistance in coding charts, and Debbie Boyko for her assistance in data extraction from charts.
Appendix 1 Definition of bicycle and pedestrian injuries used by investigators for identifying cases from Emergency Department Information System (EDIS)
The bicycle injury should have happened in a public area (e.g. streets, highways, parks, bicycle pathways, commuter route). They included:
Bicycle rider or passenger hit by a motor-vehicle (including motorcycle, moped and other motorized vehicle)
Bicycle had a collision with another bicycle
Bicycle hit a stationary object
Bicycle hit a moving object or being hit by that moving object (e.g. train, animal)
Bicyclist fell off the bike
Rider on a unicycle
Bicyclists on a reclined bicycle or reclined tricycle
Bicyclists on a tandem bicycle
A pedestrian is a person who shares the road or commuting route with other road users (motorized or non-motorized vehicles) or commuters in public places (e.g. parks, streets, bicycle pathways, residential pathways). This is public roadway use, but non-motorized and non-wheeled transportation. They included:
A person hit or run over by a motorized vehicle (including motorcycle, moped and others) in roadways
A person hit or run over by a non-motorized vehicle (e.g. bicycle, tricycle, scooter and other) in roadways
A person injured on a roadway while walking (e.g. tripped over curb or fell over tree root)
A person walking on the roadway and injured by hitting a light pole, becoming trapped in a hole, or being hit by a loose object from a traffic control device
A person injured while walking on one side of the bicycle (not riding it) in a roadway
Funding BEH holds the Alberta Children's Hospital Foundation Professorship in Child Health and Wellness, funded through the support of an anonymous donor and the Canadian National Railway Company, as well as the Alberta Heritage Foundation for Medical Research Population Health Investigator and Canadian Institutes of Health Research New Investigator Awards, Alberta, Canada. BHR is supported by the Government of Canada as a 21st Century Canada Research Chair in Emergency Medicine.
Competing interests None declared.
Ethics approval This study was conducted with the approval of the University of Alberta Health Research Ethics Board.
Provenance and peer review Not commissioned; externally peer reviewed.