Retrospective baseline measurement of self-reported health status and health-related quality of life versus population norms in the evaluation of post-injury losses
  1. W L Watson (1),
  2. J Ozanne-Smith (1),
  3. J Richardson (2)
  (1) Monash University Accident Research Centre, Melbourne, Victoria, Australia
  (2) Centre for Health Economics, Monash University, Melbourne, Victoria, Australia
Correspondence to:
 Dr W Watson
 Monash University Accident Research Centre, Building 70, Monash University, Melbourne, Victoria 3800, Australia; wendy.watson{at}general.monash.edu.au

Abstract

Background: Owing to the difficulty in prospectively measuring pre-injury health status and health-related quality of life (HRQL) in an injured cohort, population norms or retrospective baseline scores are often used as comparators for evaluating post-injury losses. However, there has been little discussion in the literature or research into the soundness of these approaches for this purpose.

Objectives: To investigate the appropriateness of the retrospectively measured baseline health status and HRQL in an injured population for the purpose of evaluating post-injury losses.

Methods: A cohort of injured patients admitted to hospital (n = 186) was followed up for 12 months after injury. Retrospectively measured pre-injury health status and HRQL scores were compared with those at 12 months after injury for participants who reported complete recovery (n = 61) and those who did not. Retrospective baseline scores for the whole cohort were also compared with Australian population norms.

Results: For participants who completely recovered, no significant difference was observed between scores at baseline (measured retrospectively) and those at 12 months after injury (36-item Short Form Questionnaire physical component summary z = −1.274, p = 0.203; 36-item Short Form Questionnaire mental component summary z = −1.634, p = 0.102; Short Form 6 Dimensions z = −1.405, p = 0.296). A borderline significant difference was observed in HRQL as measured by the Assessment of Quality of Life (z = −1.970, p = 0.049). Retrospectively measured pre-injury scores were consistently higher than Australian norms for all measures.

Conclusions: The injured population may not be representative of the general population. Consequently, retrospective baseline measurement of pre-injury health states may be more appropriate than general population norms for the purpose of evaluating post-injury losses in this population.

  • AQoL, Assessment of Quality of Life
  • FCI, Functional Capacity Index
  • HRQL, health-related quality of life
  • SF-36, 36-item Short Form Questionnaire
  • SF-6D, Short Form 6 Dimensions

With the increasing international focus on quantifying the non-fatal outcomes of injury, both at the population and individual levels, there is an important need to account for pre-injury health status when attributing post-injury outcomes to injury. The measurement of self-reported baseline health status and health-related quality of life (HRQL) is an essential component of any clinical trial or cost-effectiveness study, providing a yardstick for quantifying change. In clinical studies where the effects of treatment are being assessed, the prospective measurement of baseline states is rarely problematic. However, in situations where the effect of a health condition or injury is to be quantified, the prospective measurement of baseline health status and HRQL is rarely possible, as participants cannot usually be identified until after the event of interest (eg, the injury) has occurred.

Consequently, retrospective measurement, which relies on the recollection of pre-injury states, is often used to establish the baseline in injury outcomes research.1,2 This is considered justified if assessment is made soon after injury, especially if measuring physical and role function. However, its validity in measuring psychosocial and cognitive function is less certain. Alternatively, age-specific and sex-specific population norms have also been suggested to provide a reference point for comparison with post-injury outcomes.3

Although both approaches offer obvious practical advantages over the prospective measurement of baseline pre-injury health states, there has been little discussion in the literature regarding the relative merit, or the implications, of using either approach to evaluate post-injury losses. Research directed at establishing the validity of the retrospective measurement of self-reported health status or HRQL is also lacking. This is not surprising, as a definitive answer to this question can only be provided by a large prospective population-based cohort study (with injury as one of the outcomes), where baseline health can be assessed both prospectively (pre-injury) and retrospectively (post-injury). Owing to the difficulties inherent in designing a purposive study to examine this issue, our analysis applies a novel method to existing data to examine aspects of this problem.

OBJECTIVES

This study aimed to investigate the appropriateness of retrospective measurement of self-reported baseline (pre-injury) health status and HRQL, in an injured population, to evaluate post-injury losses. The main hypothesis is that, if the retrospective measurement of baseline health is valid, pre-injury values should approximate those measured at 12 months after injury in injured patients who report complete recovery.

METHODS

This analysis was based on longitudinal outcome measures from a prospective cohort study of injured patients. The general methods have been described elsewhere.4 Ethical approval for the study was obtained from the ethics committees responsible for each hospital and from the Monash University Standing Committee on Ethical Research in Humans.

Study population

Four major metropolitan teaching hospitals in Victoria, Australia, participated in the study, with participants being recruited between April and September 2002. Eligible participants were identified in the admissions register of participating hospitals as having sustained an injury, and being aged 18–74 years. Patients with a major head injury and neurological deficit were excluded because of possible difficulties with follow-up. Although this may have skewed the sample towards less severe injuries, some oversampling of patients who stayed longer was expected (because of more opportunity for recruitment), ensuring a broad range of injury types and severity.

Measures

The health status of participants was measured by using the 36-item Short Form Questionnaire (SF-36).5 The SF-36 generates a health profile of eight dimensions that can be combined into two summary component scores that measure physical and mental function. HRQL was measured by using two generic utility-based measures: the Short Form 6 Dimensions (SF-6D),6 derived from the SF-36, and the Assessment of Quality of Life (AQoL) instrument.7 Both instruments generate a total utility score for the health states defined, which are anchored on 0 (equivalent to death) and 1 (full health).

The SF-6D uses 11 of the 36 SF-36 items to form six subscales: physical function, role limitations, social function, bodily pain, mental health and vitality. Although the SF-6D is yet to be completely validated, several published studies examine the instrument’s performance relative to other utility-based measures, and it has been shown to produce higher values than the Health Utility Index V.2, the EuroQol-5 Dimensions and the AQoL.8,9,10,11 With scores compressed towards the upper half of the scale (range 0.3–1.0), it is more sensitive to smaller changes at the higher end of the scale than the EuroQol-5 Dimensions, the Health Utility Index V.2 and the AQoL,8,12 and does not seem to differentiate well between more severe health states.8,10,12 The validity of the SF-6D in this cohort will be examined in a forthcoming paper.

The AQoL7 comprises 15 items that are combined into four dimensions: independent living, social relationships, physical senses and psychological well-being. Despite its recent development, the AQoL has undergone extensive trials and validation. It has been used in >50 studies13 and seems to perform as well as other well-known preference-based measures of HRQL.8 Validation of the AQoL in this cohort is described in detail elsewhere.4

Timing of retrospective data collection

Retrospective measurement of pre-injury health status and HRQL was undertaken as soon as practicable after patient recruitment, ideally while participants were still hospitalized.

Statistical analysis

The validity of retrospectively measured baseline scores was examined by comparing these values with (1) 12-month post-injury scores of participants who completely recovered and (2) Australian norms. Non-parametric statistical tests were used where possible because of the non-normal distribution of outcome values.

Comparison with 12-month post-injury scores

If the retrospective measurement of health status and HRQL is valid, these scores should be similar to the 12-month post-injury scores of participants who completely recovered. Complete recovery was defined as a return to full health, with no residual functional limitations. This was operationalized using the health transition scale and the Functional Capacity Index (FCI).

The health transition scale (SF-36 Question 2) measures change in health compared with a prior state (eg, 12 months ago or pre-injury). It is a five-point ordinal scale with responses: 1, much better than pre-injury; 2, somewhat better than pre-injury; 3, about the same; 4, somewhat worse than pre-injury; 5, much worse than pre-injury. The scale was collapsed into a binary measure to distinguish participants who recovered (1–3, “same or better health” than pre-injury) and those who did not (4 and 5, “worse health” than pre-injury).

The FCI instrument14,15 was used to discriminate between participants with and without functional limitations, resulting from injury, at the end point of the study. The FCI was designed specifically to measure the effect of non-fatal injuries on everyday activities and has been validated in an injured population.15 It describes function in 10 dimensions of physical function, yielding dimension-specific and whole-body scores expressed in the range 0 (maximum limitation) to 1 (no limitations). FCI scores were collapsed into a binary measure representing no residual limitations (FCI = 1) or residual limitation (FCI<1).

For this analysis, two main groups were distinguished at 12 months after injury: (1) completely recovered (reporting same or better health than pre-injury and no residual impairments) and (2) not completely recovered (reporting worse health than pre-injury or with residual limitations).
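The grouping rule described above can be sketched in a few lines of Python. This is only an illustration of the stated definition, not the syntax used in the study; the names `FollowUp` and `completely_recovered` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    """12-month follow-up record (hypothetical structure for illustration)."""
    health_transition: int  # SF-36 Question 2: 1 (much better) to 5 (much worse)
    fci: float              # FCI whole-body score: 0 (maximum limitation) to 1 (none)


def completely_recovered(record: FollowUp) -> bool:
    """Apply both binary criteria: same or better health AND no residual limitation."""
    if not 1 <= record.health_transition <= 5:
        raise ValueError("health transition response must be between 1 and 5")
    if not 0.0 <= record.fci <= 1.0:
        raise ValueError("FCI score must be between 0 and 1")
    same_or_better = record.health_transition <= 3  # responses 1-3
    no_limitation = record.fci == 1.0               # FCI = 1
    return same_or_better and no_limitation
```

A participant reporting "about the same" health with FCI = 1 falls into the completely recovered group; failing either criterion places the participant in the not completely recovered group.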

If the retrospective baseline measures are valid, participants not recovered at 12 months after injury should have worse health status and HRQL scores than at baseline. Those fully recovered should show no major difference between scores. Wilcoxon’s signed ranks tests were conducted to identify any major difference between the retrospective baseline and 12-month post-injury scores for each group.
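For readers unfamiliar with the test, the following is a minimal sketch of a paired Wilcoxon signed ranks test using the large-sample normal approximation with a tie correction. It is an assumption that this convention (z computed from the smaller rank sum, zero differences dropped) resembles what the study's statistical package reported; the function name is hypothetical and the sketch is not the authors' code.

```python
import math


def wilcoxon_signed_rank(baseline, follow_up):
    """Return (z, two-sided p) for paired scores via the normal approximation."""
    if len(baseline) != len(follow_up):
        raise ValueError("paired samples must have equal length")
    # Paired differences; zero differences are dropped.
    diffs = [f - b for b, f in zip(baseline, follow_up) if f != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, assigning average ranks to ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the standard normal
    return z, p
```

Under the study hypothesis, the completely recovered group would be expected to yield a z close to zero and a large p, whereas the group that did not recover would yield a small p. For small samples an exact test is preferable to this approximation.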

Comparison with Australian norms

Australian norms for the SF-36 were available from the Australian Bureau of Statistics.16 Published norms for the AQoL17 were not commensurate with age groups commonly used in injury research; hence, recalculated norms were provided by the developers.18 Norms for the SF-6D were calculated from the Confidentialised Unit Record File of the 1995 National Health Survey provided by the Australian Bureau of Statistics.19 The SF-36 responses from the file were scored to the SF-6D, and means established for relevant age groups. Responses were scored for 13 791 respondents, aged ⩾18 years, with group numbers ranging from 517 (⩾75 years) to 3192 (35–44 years).

Single-sample t tests were used to compare the cohort scores with population norms, on the assumption that the population value was the real value (given the generally low standard errors for population norms) and that observations represent the variance of the study data from the population value.20,21
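The single-sample comparison treats the population norm as a fixed value and asks how far the cohort mean lies from it in standard-error units. A minimal sketch follows, assuming the usual one-sample t statistic; the function name is hypothetical, and the p value would be read from a t distribution with the returned degrees of freedom (not computed here, because the Python standard library has no t distribution).

```python
import math
import statistics


def one_sample_t(scores, norm_mean):
    """Return (t, degrees of freedom) for cohort scores against a fixed norm."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are required")
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("scores have zero variance")
    t = (mean - norm_mean) / (sd / math.sqrt(n))
    return t, n - 1
```

A positive t indicates that the cohort mean exceeds the norm, which is the direction reported for all measures in this study.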

RESULTS

Participants’ characteristics

Data were available for both the retrospective pre-injury and 12-month post-injury administrations for 186 of the 221 participants recruited. The continuing cohort group represented slightly >9% of the estimated equivalent population from the participating hospitals (injured patients admitted to hospital who met the inclusion criteria [footnote i] during the recruitment period, estimated from the Victorian Admitted Episodes Dataset). Demographic characteristics of the cohort did not differ substantially from the population from which it was drawn, showing a similar age distribution and a slightly higher proportion of men. However, there were differences in injury mechanism and nature of injury (table 1), but not in severity of injury as measured by using the International Classification of Diseases Injury Severity Score22 (single-sample Wilcoxon’s signed rank sum test z = 0.264, p = 0.795).

Table 1

 Injury characteristics of continuing study cohort by recovery status and equivalent unintentional injury admissions at participating hospitals (Victorian Admitted Episodes Dataset, April–September 2002)

Participants who completely recovered (n = 61) at 12 months after injury were more likely to be men (82%) compared with participants who did not recover (66%), and were also significantly younger (t = 4.42, p<0.001). Participants who completely recovered were also more likely to have been working before the injury (85%) compared with those who did not recover (75%), but less likely to have sustained a compensable injury (47% v 62%). Overall, there was no significant difference between the two groups in baseline health except on the SF-6D (table 2), possibly because of the greater sensitivity of the SF-6D at the upper (well) end of the scale. However, as we expected, there was a significant difference in severity of injury between the two groups on both the International Classification of Diseases Injury Severity Score and the New Injury Severity Score (table 2).

Table 2

 Baseline summary statistics and tests of difference for health-related quality of life, health status and severity of injury for recovery groups (n = 186)

Timing of retrospective data collection

Retrospective baseline interviews were conducted as soon as practicable after the participant’s injury to reduce the possibility of recall inaccuracies. The mean administration time was 1 week, and 50% of interviews were completed within 3 days after the injury (mode = 1 day, median = 4 days).

Although the group that did not recover had a longer interval between injury and administration of the retrospective baseline measures (8.9 v 5.5 days), there was no significant difference between these times (t = 1.929, p = 0.055). Most participants in both groups (83%) completed the pre-injury baseline questionnaires while still in hospital.

Comparison with 12-month post-injury scores

Overall, two thirds (n = 125) of participants had not completely recovered at 12 months after injury, reporting either worse health on the health transition scale (n = 66) compared with pre-injury or still experiencing a functional limitation (n = 59). As expected, participants who did not recover had significantly lower scores after injury, on all measures, compared with their pre-injury scores. There was no significant difference in most scores between these time points for the completely recovered group (table 3). The AQoL, however, showed a marginally significant difference in scores on the Wilcoxon’s signed ranks test (p = 0.049) between the retrospectively measured pre-injury baseline and end point of the study. This may reflect the large variance in AQoL scores at 12 months after injury in one age group (35–44 years), which is inconsistent with scores for other age groups (fig 1). That variance was primarily due to a significantly lower social relationship score, a domain not specifically covered by the other instruments.

Table 3

 Summary statistics and tests of difference for utility and 36-item Short Form Questionnaire scores at baseline versus 12 months after injury for participants grouped on recovery status (n = 186)

Figure 1

 Comparison of retrospective baseline and 12-month post-injury health-related quality of life (Assessment of Quality of Life (AQoL) and Short Form 6 Dimensions (SF-6D) utility) scores, with Australian norms for participants who recovered at the end point of the study (n = 61).

Comparison with Australian norms

Figures 1 and 2 show the distribution of scores by age group for the retrospective baseline and 12-month post-injury measurements for participants who completely recovered (n = 61) compared with Australian norms. Despite variations, overall, the retrospective baseline scores for each measure seem more consistent with scores measured prospectively at 12 months after injury, than with Australian norms.

Figure 2

 Comparison of retrospective baseline and 12-month post-injury health status (36-item Short Form Questionnaire (SF-36) physical component summary (PCS) and mental component summary (MCS) scores) with Australian norms for participants who recovered at the end point of the study (n = 61).

As with the completely recovered subset, the retrospectively measured baseline scores for the whole cohort (n = 186) were consistently higher for all age groups than the Australian norms, although the distribution of scores was similar. Single-sample t tests showed that the cohort scores on each measure were significantly higher than the norms for each age group (p<0.05).

DISCUSSION

There has been very little discussion in the literature or scientific examination of the validity of post-injury self-reported estimates of pre-injury health status for the purpose of evaluating post-injury losses. This can only be determined through a large prospective population-based cohort study (with injury as one of the outcomes), where baseline pre-injury health states and utility can be assessed both prospectively (pre-injury) and retrospectively (post-injury). This is a virtually insurmountable methodologic challenge because of the sample size required to ensure the necessary numbers of injuries.2 This study uses empirical data to examine this issue.

The results show that retrospectively measured pre-injury health status and HRQL scores of participants who completely recovered were similar to scores at 12 months after injury and that the mean baseline scores of the injured cohort, irrespective of outcome, were consistently higher than the Australian norms. This suggests that either bias was operating or this cohort was healthier and fitter than the general population. In this study, recall bias was not expected to be a significant problem, as participants were interviewed, on average, 1 week after injury, with half completing baseline interviews within 3 days after injury.

The findings suggest that the injured population may not be representative of the general population. This is supported by several studies.1,2,24,25 However, the evidence seems mixed as to whether the injured population is more physically healthy and fitter than the general population or less so. The results of the current study are consistent with the observation that rates of sports-related injury in the older population are increasing.24 Greenspan and Kellerman1 also found that pre-injury physical and general health subscores were better among adults injured by gunshot than established population norms.

However, these findings seem to contradict those reported recently in the only population-based research to compare pre-existing morbidity in a large cohort of injured people (n = 21 032) and a matched sample from the general population.25 In that study, the injured cohort had higher comorbidity scores and almost twice the hospital admissions and physician claims in the previous 12 months than the non-injured cohort. Although HRQL was not explicitly measured, this finding casts doubt on the explanation that higher baseline health status and HRQL scores in our study were because of better health in this group compared with the general population. However, it may simply reflect the fact that disease status (as measured by comorbidity and health service usage) may not correlate well with everyday functioning (as measured by self-reported health status and HRQL).26 In either case, the findings suggest that the injured population may not be representative of the general population.

An alternative explanation for these findings is response shift. Evidence exists that people may not maintain a consistent internal scale for their responses over time and this may be exaggerated by an intervening traumatic event,27 such as injury, leading an individual to re-evaluate their prior health state in light of their current experience. This change in internal standards, in values or in the conceptualization of quality of life is known as response shift.28 Response shift occurs in self-report assessments “whenever the standards for making judgments change between pretest and posttest, resulting in a change in the meaning and understanding of the construct under study”.29

In this case, retrospectively measuring pre-injury health during the acute post-injury phase may result in a higher relative assessment of the pre-injury state than if the measure was taken before the injury. This would explain the significantly higher scores in the cohort compared with the population norms. However, without prospective baseline pre-injury measures for comparison with retrospective baseline measures, it is not possible to assess the role of response shift.

In contrast with the view that recall bias could invalidate a retrospective measure of pre-injury health, there is a strong assumption, in the literature on response shift, that assessments of HRQL derived from a retrospective baseline administration are more valid than prospective assessments,30 for comparison with post-test (post-injury) measures. This is because they are presumed to be completed with the same internal standard of measurement,31 the assessment of the retrospective pre-injury baseline state, and that of subsequent post-injury states, being made with reference to the same information (ie, experience of injury). Response shift could explain why no significant difference was found between the retrospective pre-injury and 12-month post-injury scores on most measures for the completely recovered group and why baseline scores were generally higher than population norms.

Key points

  • The prospective measurement of pre-injury baseline health status and health-related quality of life (HRQL) is rarely an option in injury outcome studies.

  • Alternative methods such as the use of retrospective pre-injury measurement or population norms, as the baseline against which to assess post-injury losses, have received little methodologic attention.

  • Although retrospectively measured pre-injury health status and HRQL were consistently higher than population norms, in an injured cohort, these were consistent with 12-month post-injury scores for participants who made a full recovery.

  • Response shift may operate such that the retrospective measurement of baseline health status and HRQL may be more appropriate than the use of population norms for evaluating post-injury losses.

Although factors such as severity of injury, pre-injury health and compensable status have been shown to influence recovery from injury, the extent to which these factors influence response shift is unknown. Differences in injury severity, for example, may result in the differential effect of response shift between groups. However, because the retrospective pre-injury and post-injury measures are presumed to be completed with the same internal standard of measurement, this should not affect the evaluation of post-injury losses (ie, the difference between the two values) or the generalizability of our findings.

Furthermore, given the evidence that injured populations differ from the general population, retrospectively measured pre-injury health status and HRQL seem preferable to population norms in evaluating post-injury losses. In this study, using age-based population norms as baseline measures would have resulted in participants who completely recovered being assigned considerable health and HRQL gains as a result of their injury, and much reduced losses (and in some cases gains) for participants who did not recover.
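The arithmetic behind this point can be illustrated with a short sketch. The three utility values below are invented purely for illustration and are not taken from the study; the function name is hypothetical. Loss is taken as baseline minus the 12-month score, so a negative value represents an apparent gain.

```python
def post_injury_loss(baseline: float, follow_up: float) -> float:
    """Loss = baseline minus follow-up score; a negative value is an apparent gain."""
    return baseline - follow_up


# All three values are invented for illustration only (not study data).
retrospective_baseline = 0.85  # recalled pre-injury utility
population_norm = 0.80         # age-based norm, lower than the recalled baseline
score_at_12_months = 0.85      # fully recovered participant

loss_vs_retrospective = post_injury_loss(retrospective_baseline, score_at_12_months)
loss_vs_norm = post_injury_loss(population_norm, score_at_12_months)

print(f"Loss against retrospective baseline: {loss_vs_retrospective:.2f}")
print(f"Loss against population norm: {loss_vs_norm:.2f}")
```

With these assumed values, the fully recovered participant shows no loss against the retrospective baseline but a negative loss (an apparent gain attributable to the injury) against the lower population norm.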

CONCLUSION

The apparently paradoxical results of this study can be explained by the response shift theory. Although the effects of response shift may differ within subgroups of the injured population, retrospective baseline measurement of pre-injury health states may still be more appropriate than general population norms for the purpose of evaluating post-injury losses in this population.

Acknowledgments

This research was made possible by the John Lane Memorial Scholarship for PhD research in injury prevention funded by the Monash University Accident Research Foundation. Funding for the recruitment stage of the study was provided by the Monash University Accident Research Foundation and the Monash University Accident Research Centre. Barbara Fox, Belinda Clark and Angela Wallace assisted with recruitment. Angela Clapperton provided assistance with writing SPSS syntax and Stuart Newstead provided statistical advice. Three anonymous reviewers provided invaluable feedback.

REFERENCES

Footnotes

  • i For full details of the inclusion criteria, see Watson et al4

  • Competing interests: None.