Objectives: To determine what proportion of research papers at an injury prevention conference reported an evaluation.
Methods: A random sample of 250 abstracts from the 6th World Conference on Injury Prevention and Control was classified by methodological type. Those that described any evaluation were further subdivided by whether the evaluation assessed process or used an intermediate or “true” outcome.
Results: Of 250 abstracts, 20 (8%; 95% confidence interval 5.0% to 12.1%) reported evaluations with intermediate or “true” outcomes. Research designs were weak: among the 20 reports, none was a randomized trial and only two used a before-and-after design with a control group; the remainder used uncontrolled before-after or “after only” designs.
Conclusion: The conference papers included few evaluations. To ensure that resources are best used, those in the injury prevention field must increase their use of rigorous evaluation.
Many interventions are conducted to improve safety, typically in the belief that it is “obvious” that the intervention will be effective. Yet there are examples of interventions failing, going back several decades,1–5 and even some that apparently have done more harm than good.6 These failures show how important it is to build solid evidence into safety strategies by properly evaluating the interventions and programmes that are implemented.
To assess whether this is occurring, we classified abstracts from the 6th World Conference on Injury Prevention and Control.7 This large international meeting, held every two years, attracts a wide range of attendees, from researchers and practitioners to policy makers and government personnel.
We used the abstract book from the 6th World Conference on Injury Prevention and Control. The meeting was held in Montreal in May 2002. We omitted abstracts from “plenary” and “state of the art” sessions, leaving 1079 abstracts, of which we randomly sampled 250. One abstract in our sample described a conference workshop and was omitted. Another was randomly selected to replace it.
One of us (CAS) read all abstracts and classified them according to the type of paper: they ranged from rigorous evaluations to descriptions of programmes to policy discussions. A second reviewer (HSS) independently classified 20 abstracts to assess observer variability. When there was uncertainty about the classification, the authors discussed the abstract and reached agreement.
The classification scheme was developed through successive reviews of the abstracts: a preliminary version was refined into a final version after the reviewers discussed abstracts that did not fall into any of the prespecified categories. We also classified the abstracts that reported evaluations of interventions according to the type of study, and whether they were process or outcome evaluations or used an intermediate outcome.
There were 227 abstracts in English and 23 in French, but none in Spanish, even though Spanish was the third official language of the conference. Sixty-four abstracts came from the United States, 56 from Canada, and 49 from Europe; together these comprised approximately two thirds of the sample.
There was a high degree of consistency between the reviewers: of the 20 abstracts reviewed by both, they disagreed on only two. Revisions were made to the classification scheme based on these disagreements.
Table 1 shows the classification of abstracts by type. Most abstracts described empirical studies (144/250), with approximately two thirds of these being case series (94/144). For our purposes, case series are broadly defined as studies of records (possibly pre-existing) over some time period, which may or may not involve comparisons between groups (for example, gender, age). Studies whose design was explicitly stated were grouped under the appropriate heading (for example, analytical), whereas studies whose design was not well described in the abstract, even if they may in fact have been analytical, were grouped as case series.
Among those abstracts referring to a programme of intervention, 37% (27/73) conducted an evaluation of that programme and, of those, five assessed outcome measures. Fifteen abstracts described evaluations of an intermediate measure. A few of the abstracts made claims about the effectiveness of the programmes being described, but did not provide any data or methodology to support those claims. In sum, then, 20 out of 250 abstracts (8%; 95% confidence interval 5.0% to 12.1%) reported evaluations with intermediate or “true” outcomes.
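The reported interval is consistent with an exact (Clopper-Pearson) binomial confidence interval for 20 successes out of 250. As a check on the arithmetic, the sketch below computes that interval from first principles; it uses only the Python standard library, and the bisection approach is an implementation choice for illustration, not a description of the authors' actual calculation.

```python
from math import comb

def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ Binomial(n, p), summed directly."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact two-sided (Clopper-Pearson) confidence interval for x/n."""
    def solve(f, target):
        # Bisection for an increasing function f on [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= x | p) = alpha/2.
    # P(X >= x | p) = 1 - CDF(x - 1) increases with p.
    lower = 0.0 if x == 0 else solve(lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= x | p) = alpha/2,
    # i.e. 1 - CDF(x) = 1 - alpha/2.
    upper = 1.0 if x == n else solve(lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper

low, high = clopper_pearson(20, 250)
print(f"{20/250:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Run with x = 20 and n = 250, this reproduces a proportion of 8% with a 95% interval of roughly 5% to 12%, matching the figures quoted above.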
Table 2 illustrates the variety of evaluation study designs. Before-after (pre-post) measurements were the most common (12/27), while a further eight described no baseline measures at all (8/27). Only five abstracts mentioned the use of control groups.
The results show few outcome or intermediate evaluations of safety interventions: only 20 of 250 abstracts, and only 2% (5/250) were true outcome evaluations. It is certainly reasonable to expect a range of study types, but we believe the mix should be different. That nearly two thirds of the intervention programmes did not mention any sort of evaluative process is worrisome. The principles of evaluating interventions are well established; when a programme is set up, an evaluation component should be built in. In fairness, it is often difficult to motivate the parties involved to take part in controlled evaluations, especially when they are convinced a priori that the intervention will be beneficial.8
We acknowledge that by looking only at the abstracts we could not tell exactly what was presented in the talk or poster. Many abstracts were quite vague, making it difficult to ascertain what sort of study or programme the authors were describing. The sampling frame was limited to abstracts submitted to and accepted by the conference, and these may not be representative of the field; yet this was a large international conference, and one would expect those doing solid scientific work in the field to present at it.
Most evaluations of interventions assessed an intermediate measure such as behaviour (15/27) rather than a true outcome: actual injuries. This is because the true outcome may be sufficiently rare that relying on it would give low statistical power unless a large and lengthy study were conducted. For severe outcomes, evaluators still wish to know whether the programme has had any effect, so they quite reasonably use an intermediate outcome as an indicator of any change in risk. However, whatever the type of evaluation (process, intermediate, or outcome), we would expect good quality study designs. The near total absence of controlled trials, randomized or otherwise, is a matter of real concern for the field of injury prevention research.
Arguably, the potential to waste resources or to do more harm than good makes it unethical not to evaluate interventions properly. Yet, to judge from the abstracts at this conference, there is insufficient high quality evaluation in the field of injury prevention and control.