Estimating bias from loss to follow-up in a prospective cohort study of bicycle crash injuries
  1. Sandar Tin Tin,
  2. Alistair Woodward,
  3. Shanthi Ameratunga
  1. Section of Epidemiology and Biostatistics, School of Population Health, University of Auckland, Auckland, New Zealand
  1. Correspondence to Dr Sandar Tin Tin, Section of Epidemiology and Biostatistics, School of Population Health, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand; s.tintin{at}auckland.ac.nz

Abstract

Background Loss to follow-up, if related to exposures, confounders and outcomes of interest, may bias association estimates. We estimated the magnitude and direction of such bias in a prospective cohort study of crash injury among cyclists.

Methods The Taupo Bicycle Study involved 2590 adult cyclists recruited from New Zealand's largest cycling event in 2006 and followed over a median period of 4.6 years through linkage to four administrative databases. We resurveyed the participants in 2009 and excluded three participants who died prior to the resurvey. We compared baseline characteristics and crash outcomes of the baseline (2006) and follow-up (those who responded in 2009) cohorts by ratios of relative frequencies and estimated potential bias from loss to follow-up on seven exposure-outcome associations of interest by ratios of HRs.

Results Of the 2587 cyclists in the baseline cohort, 1526 (60%) responded to the follow-up survey. The responders were older, more educated and more socioeconomically advantaged. They were more experienced cyclists who often rode in a bunch, off-road or in the dark, but were less likely to engage in other risky cycling behaviours. Additionally, they experienced bicycle crashes more frequently during follow-up. The selection bias ranged between −10% and +9% for selected associations.

Conclusions Loss to follow-up was differential by demographic, cycling and behavioural risk characteristics as well as crash outcomes, but did not substantially bias association estimates of primary research interest.

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/

Introduction

By explicitly incorporating the passage of time, prospective cohort studies overcome methodological limitations of many other observational designs, and have arguably provided more credible evidence for decision making.1 Nevertheless, non-response to baseline and follow-up surveys may occur and if related to exposures, confounders and outcomes of interest, may bias association estimates.2

Survey non-response rarely occurs at random and has been associated with sociodemographic factors as well as health outcomes.3–5 Reassuringly, several prospective studies have found bias on association estimates due to initial non-response to be minimal.6–10 This may be because exposure information is collected in advance of outcomes of interest. Drop out during follow-up appears more of a concern, but several studies suggest that its impact on associations is modest.11–19 That is, association estimates differ only slightly in the retained sample compared to the full cohort.

While the prospective cohort design has been employed frequently in injury research,20 little is known about the magnitude and direction of bias from loss to follow-up. The Taupo Bicycle Study is a prospective cohort study of cyclists designed to examine factors associated with regular cycling and injury risk. Research questions of primary interest include injury risks associated with cycle commuting, bunch riding, inconspicuity, distraction and previous crash experience. Outcome data were collected for all participants through linkage to four administrative databases. The participants were also resurveyed 3 years after the study commenced, but only three-fifths responded. Loss to follow-up will not be an issue in analyses using exposures measured at baseline in this study but may impact analyses using those measured in the resurvey. We therefore investigated the impact of (simulated) loss to follow-up on seven relevant exposure-outcome associations of primary research interest.

Materials and methods

Design, setting and participants

The sampling frame comprised cyclists, aged 16 years and over, who enrolled online in the Lake Taupo Cycle Challenge, New Zealand's largest mass cycling event held each November and attracting about 10 000 cyclists. Participants have varying degrees of cycling experience ranging from competitive sports cyclists and experienced social riders to relative novices of all ages.

We recruited the majority of participants at the time of the 2006 event, as described in detail elsewhere.21 Briefly, we sent email invitations containing a hyperlink to an information page describing the study, to 5653 contestants who provided their email addresses at registration for the event. Those who agreed to take part in the study were taken to the next page containing a web questionnaire and asked about demographic characteristics, general cycling activity and previous crash experience in the preceding year and habitual risk behaviours. A total of 2438 cyclists completed and submitted the questionnaire (43.1% response rate). We recruited another 190 cyclists from the 2008 event by including a short description about the study in the event newsletter. We resurveyed all participants in December 2009 using a web questionnaire containing questions similar to those in the baseline survey. A total of 1537 participants completed the questionnaire. We obtained ethical approval from the University of Auckland Human Participants’ Ethics Committee.

For this analysis, we restricted the study sample to 2590 participants who were resident in New Zealand at recruitment as the crash outcome data for the overseas participants were not available. We also excluded three deaths that occurred prior to the resurvey. As a result, there were 2587 participants in the baseline cohort, of whom 1526 responded to the second questionnaire administered in 2009. We termed the latter group the ‘follow-up cohort’ and its members the ‘responders’, assuming that outcome data were not available from those who did not complete the second questionnaire. Figure 1 presents the flow of study participation and losses.

Figure 1

Flowchart of recruitment and loss to follow-up in the Taupo Bicycle Study.

Crash outcome data

We collected crash outcome data through record linkage to insurance claims, hospital discharge and mortality data and police reports, covering the period from the date of recruitment to 30 June 2011. All participants consented to link their data to these databases.

Insurance claims

In New Zealand, the Accident Compensation Corporation (ACC) provides personal injury cover for all residents and temporary visitors to New Zealand no matter who is at fault. The claims database is a major source of information on relatively minor injuries with over 80% of the claims relating to primary care (eg, general practitioners, emergency room treatment) only.22 We obtained approval for record linkage from the ACC research ethics committee.

Hospital discharge and mortality data

The hospital discharge data contains information about inpatients and day patients discharged after a minimum stay of 3 h from all public hospitals and over 90% of private hospitals in New Zealand.23 The mortality data includes information about all deaths registered in the country.24 Diagnoses in each hospital visit and underlying causes of death are coded under ICD-10-AM. We identified bicycle crashes using the E codes V10-V19; those that occurred on public roads using the E codes V10-V18.3-9, V19.4-6, V19.9; and those that involved a collision with a motor vehicle using the E codes V12-V14, V19.0-2 and V19.4-6. We then identified readmissions as described previously25 and excluded them.
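The code ranges above can be read as a simple classification rule. The following is a minimal Python sketch of that rule, not the code used in the study. It assumes that "V10-V18.3-9" means categories V10 to V18 with a fourth character between 3 and 9, and the function name `classify_bicycle_code` is hypothetical.

```python
import re

def classify_bicycle_code(code: str) -> dict:
    """Classify one external cause code such as 'V13.4'.

    Assumption: 'V10-V18.3-9' is read as categories V10..V18 with a
    fourth character of 3..9. Codes without a fourth character are
    never counted as on-road.
    """
    flags = {"bicycle": False, "on_road": False, "motor_vehicle": False}
    match = re.fullmatch(r"V(\d{2})(?:\.(\d)\d*)?", code.strip().upper())
    if not match:
        return flags

    category = int(match.group(1))
    fourth = int(match.group(2)) if match.group(2) is not None else None

    # Any bicycle crash: V10-V19.
    bicycle = 10 <= category <= 19
    # On public roads: V10-V18 with .3-.9, or V19.4-6, or V19.9.
    on_road = bicycle and fourth is not None and (
        (category <= 18 and 3 <= fourth <= 9)
        or (category == 19 and fourth in (4, 5, 6, 9))
    )
    # Collision with a motor vehicle: V12-V14, V19.0-2, V19.4-6.
    motor_vehicle = bicycle and (
        12 <= category <= 14
        or (category == 19 and fourth in (0, 1, 2, 4, 5, 6))
    )
    return {"bicycle": bicycle, "on_road": on_road, "motor_vehicle": motor_vehicle}

print(classify_bicycle_code("V13.4"))
```

Under this reading, a code such as V18.0 counts as a bicycle crash but not as an on-road crash or a motor vehicle collision.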

Police reports

In New Zealand, it is mandatory that any fatal or injury crash involving a collision with a motor vehicle on a public road be reported to the police. This database, therefore, contains information on all police-reported bicycle collisions.

For each participant, we matched bicycle crashes identified across different databases based on the date of crash allowing for a two-day difference, so as to avoid double-counting of the same crash.
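The matching step can be illustrated with a short Python sketch. This is not the linkage code used in the study: the function name, the source labels and the example dates are hypothetical, and the sketch assumes that each crash is compared with the earliest date already kept for that participant.

```python
from datetime import date

def merge_crash_dates(dates_by_source: dict[str, list[date]],
                      tolerance_days: int = 2) -> list[date]:
    """Return one date per distinct crash for a single participant.

    Records from different databases that fall within `tolerance_days`
    of an already kept crash date are treated as the same crash.
    """
    all_dates = sorted(d for dates in dates_by_source.values() for d in dates)
    unique: list[date] = []
    for d in all_dates:
        if unique and (d - unique[-1]).days <= tolerance_days:
            continue  # same crash reported in another database
        unique.append(d)
    return unique

# Hypothetical records for one participant.
records = {
    "acc_claims": [date(2008, 3, 1), date(2009, 7, 10)],
    "hospital": [date(2008, 3, 2)],
    "police": [date(2009, 7, 12), date(2010, 1, 5)],
}
print(merge_crash_dates(records))
```

In this example five records collapse into three crashes. Comparing with the first date of each cluster, rather than chaining from record to record, prevents a run of closely spaced records from being merged into a single crash spanning more than two days.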

Analyses

We presented baseline characteristics and crash outcomes of the baseline and follow-up cohorts by relative frequencies (RFs) and made comparisons by ratios of relative frequencies (RRFs) which were calculated as: RF_follow-up/RF_baseline.
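As a worked illustration of this ratio, the Python sketch below computes an RRF for a single category. The cohort sizes (2587 and 1526) are those of the study, but the category counts (500 and 350) are invented for the example and the function names are hypothetical.

```python
def relative_frequency(n_in_category: int, n_total: int) -> float:
    """Proportion of a cohort falling in one category."""
    return n_in_category / n_total

def ratio_of_relative_frequencies(n_cat_followup: int, n_followup: int,
                                  n_cat_baseline: int, n_baseline: int) -> float:
    """RF in the follow-up cohort divided by RF in the baseline cohort."""
    return (relative_frequency(n_cat_followup, n_followup)
            / relative_frequency(n_cat_baseline, n_baseline))

# Cohort sizes are from the study; the category counts are invented.
rrf = ratio_of_relative_frequencies(350, 1526, 500, 2587)
print(f"RRF = {rrf:.2f}")
```

An RRF above 1 indicates that the category is over-represented among responders, and an RRF below 1 that it is under-represented.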

We investigated potential bias from loss to follow-up on seven exposure-outcome associations. These include: associations between previous crash experience, bunch riding, listening to music and using lights in the dark and the risk of all bicycle crashes, and associations between cycle commuting, using fluorescent colours and using reflective materials in the dark and the risk of on-road crashes. For each exposure-outcome association, we calculated individual cell follow-up response rates and cross-products, using the dichotomous outcome (one or more crashes vs no crash). The cross-products are equivalent to ratios of unadjusted ORs (RORs) which were calculated as OR_follow-up/OR_baseline.
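The equivalence between the cross-product of cell response rates and the ROR follows directly from the definition of the OR, and can be checked numerically. The Python sketch below uses invented 2×2 cell counts (only the cohort totals of 2587 and 1526 match the study); the variable names are hypothetical.

```python
import math

def odds_ratio(exp_crash: float, exp_nocrash: float,
               unexp_crash: float, unexp_nocrash: float) -> float:
    """Unadjusted OR from a 2x2 exposure-by-outcome table."""
    return (exp_crash * unexp_nocrash) / (exp_nocrash * unexp_crash)

# Invented cell counts; only the totals (2587 and 1526) are from the study.
baseline = {"exp_crash": 120, "exp_nocrash": 880,
            "unexp_crash": 150, "unexp_nocrash": 1437}
followup = {"exp_crash": 80, "exp_nocrash": 500,
            "unexp_crash": 96, "unexp_nocrash": 850}

# Follow-up response rate within each cell of the baseline table.
rates = {cell: followup[cell] / baseline[cell] for cell in baseline}

# Cross-product of the four cell response rates.
cross_product = ((rates["exp_crash"] * rates["unexp_nocrash"])
                 / (rates["exp_nocrash"] * rates["unexp_crash"]))

# Ratio of ORs: OR in the follow-up cohort over OR in the baseline cohort.
ror = odds_ratio(**followup) / odds_ratio(**baseline)

print(f"cross-product = {cross_product:.4f}, ROR = {ror:.4f}")
print("equal:", math.isclose(cross_product, ror))
```

If all four response rates were equal, the cross-product would be exactly 1 and the crude OR would be unbiased, whatever the overall level of attrition.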

As more than a single crash may be experienced during follow-up, we performed Cox proportional hazards regression modelling for repeated events using a counting process approach to examine the associations in the baseline and follow-up cohorts. We adjusted HRs for all demographic and cycling characteristics at baseline. All baseline data were complete for 2435 participants (94.0%). Assuming that the data were missing at random, we imputed missing values using multiple imputation (PROC MI) with 25 complete datasets created by the Markov chain Monte Carlo method26 incorporating all baseline covariates (ie, all demographic and cycling characteristics presented in tables 1 and 2, respectively) and the number of crash outcomes. We presented crude and adjusted HRs and estimated the magnitude and direction of bias by ratios of HRs (RHRs) which were calculated as HR_follow-up/HR_baseline.
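In a counting process formulation, each participant contributes one row per at-risk interval, with an indicator of whether the interval ended in a crash. The Python sketch below shows one way such rows could be built. It is an illustration only, not the SAS code used in the study: it assumes time is measured in days since recruitment, and the function name and example dates are hypothetical.

```python
from datetime import date

def counting_process_rows(recruited: date, crashes: list[date],
                          end: date) -> list[tuple[int, int, int]]:
    """Build (start, stop, event) rows for one participant.

    Time is in days since recruitment (an assumption). Each crash closes
    an interval with event=1; the final interval up to the end of
    follow-up is censored (event=0).
    """
    rows = []
    start = 0
    for crash in sorted(crashes):
        stop = (crash - recruited).days
        if stop <= start or crash > end:
            continue  # skip zero-length intervals and crashes after follow-up
        rows.append((start, stop, 1))
        start = stop
    total = (end - recruited).days
    if total > start:
        rows.append((start, total, 0))
    return rows

# Hypothetical participant; follow-up in the study ended on 30 June 2011.
rows = counting_process_rows(
    recruited=date(2006, 11, 25),
    crashes=[date(2008, 3, 1), date(2010, 1, 5)],
    end=date(2011, 6, 30),
)
for row in rows:
    print(row)
```

A participant with two crashes therefore contributes three rows, and a participant with none contributes a single censored row.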

Table 1

Demographic characteristics of the baseline and follow-up cohorts

Table 2

Cycling characteristics of the baseline and follow-up cohorts

Accounting for the interdependency between the two cohorts, we computed CIs using a non-parametric bootstrapping method with 2000 resamplings (with replacement) of the baseline cohort. We calculated the ln(RHR) in each replicate as β_follow-up − β_baseline and calculated bias-corrected ln(RHR) estimates as 2 × ln(RHR)_observed − mean(ln(RHR)_replicates) as described previously.17 Around each bias-corrected estimate, we constructed 95% confidence limits by using the SD of the ln(RHR)_replicates.27 We used SAS V.9.2 (SAS Institute, Cary, North Carolina, USA) for all analyses.
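The bootstrap procedure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the SAS code used in the study: `fit_ln_hr` is a hypothetical user-supplied function that returns the log HR (β) for a list of participant records, each record is assumed to carry a `responded` flag, and 1.96 is used as the normal quantile for the 95% limits.

```python
import math
import random
import statistics
from typing import Callable, Sequence

def bootstrap_ln_rhr(cohort: Sequence[dict],
                     fit_ln_hr: Callable[[Sequence[dict]], float],
                     n_resamples: int = 2000,
                     seed: int = 0) -> list[float]:
    """Resample the baseline cohort and return ln(RHR) for each replicate.

    Responders are selected within each resample, so the follow-up cohort
    stays nested in the baseline cohort, as in the original design.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_resamples):
        sample = rng.choices(cohort, k=len(cohort))  # with replacement
        responders = [p for p in sample if p["responded"]]
        # ln(RHR) = beta in the follow-up cohort minus beta in the baseline cohort
        replicates.append(fit_ln_hr(responders) - fit_ln_hr(sample))
    return replicates

def bias_corrected_rhr(ln_rhr_observed: float,
                       ln_rhr_replicates: Sequence[float],
                       z: float = 1.96) -> tuple[float, float, float]:
    """Return the bias-corrected RHR with lower and upper 95% limits."""
    corrected = 2 * ln_rhr_observed - statistics.fmean(ln_rhr_replicates)
    sd = statistics.stdev(ln_rhr_replicates)
    return (math.exp(corrected),
            math.exp(corrected - z * sd),
            math.exp(corrected + z * sd))

# Toy check with made-up replicate values.
estimate, lower, upper = bias_corrected_rhr(0.0, [-0.1, 0.0, 0.1])
print(f"RHR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Resampling the baseline cohort once per replicate and fitting both models to the same resample is what accounts for the overlap between the two cohorts; resampling them independently would overstate the variance of the ratio.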

Results

The follow-up cohort (n=1526) constituted 60.0% of the baseline cohort (n=2587).

Baseline characteristics

Groups that were over-represented in the follow-up cohort include cyclists aged over 50 years, non-Māori and university graduates (table 1). The responders were more likely to have a normal Body Mass Index, and more likely to reside in urban areas and in least deprived neighbourhoods.

Additionally, there were differences in some cycling characteristics (table 2). The responders were more often experienced cyclists and bunch riders. Protective behaviours, such as use of conspicuity aids were more prevalent in the follow-up cohort, whereas, distracting behaviours, such as listening to music, were less prevalent. Although not significant, the responders were more likely to ride off-road and in the dark.

Bicycle crash outcomes

The follow-up cohort consistently experienced more bicycle crashes throughout the follow-up period (figure 2 and table 3).

Figure 2

Incidence of bicycle crashes experienced during follow-up.

Table 3

Crash outcomes of the baseline and follow-up cohorts

Exposure-outcome associations

Individual cell follow-up response rates show how selection bias could occur in selected associations (table 4). Response was not differential in general, but the less balanced distribution of exposures in the non-crash group resulted in slightly biased estimates. This was evident when individually comparing bunch riding, listening to music and using lights with the risk of all crashes. The cross-products, that is, crude RORs, were similar to crude RHRs (where repeated crash events were taken into account) presented in table 5.

Table 4

Individual cell follow-up response rates for selected exposures and outcomes

Table 5

Crude and adjusted HRs in the baseline and follow-up cohorts

Crude RHR estimates ranged from 0.88 (95% CI 0.78 to 0.98) in the bunch riding-all crashes association to 1.14 (95% CI 1.00 to 1.29) in the listening to music-all crashes association (table 5). When we adjusted for all baseline covariates, the HRs changed markedly but in a similar direction in the two cohorts, resulting in modest changes in their relative sizes. Adjusted RHRs ranged between 0.90 and 1.09. We observed the largest positive bias in the listening to music-all crashes and using fluorescent colours-on-road crashes associations, and it was away from the null. We found the largest negative bias in the using lights-all crashes, and using reflective materials-on-road crashes associations, and it was away from the null. Similar results were observed in complete case analysis (ie, restricted to 2435 participants with complete data).

Discussion

Main findings

In this prospective cohort study involving 2587 cyclists, 60% responded to a questionnaire administered 3 years after establishment of the study. Failure to respond was associated with demographic, cycling and behavioural risk characteristics as well as crash outcomes. However, the selection bias relating to the seven associations of interest appeared to be small with the adjusted RHRs ranging between 0.90 and 1.09.

Strengths and limitations

This is one of the very few prospective cohort studies involving cyclists. Baseline data were collected in advance of the crash outcomes and were complete for almost all participants, as mandatory fields and validation checks were incorporated in the web questionnaire. Data on crash outcomes were collected from four administrative databases (including the government-funded universal no-fault injury compensation claims database) and, therefore, were available for all participants in the baseline cohort and covered injuries across the spectrum of severity. This provided us with a unique opportunity to estimate bias from (simulated) loss to follow-up in the injury field.

This analysis, however, excludes very minor crashes not requiring either medical or police attention, which represent approximately 70% of self-reported crashes in this study.28 Ascertainment of crash outcome data may also be affected by personal, social and health service factors29 as well as the quality of individual data sources and record linkage.28 Self-reported exposure data may not be accurate and may change over time. Nevertheless, potential misclassifications of crash outcomes and exposures are likely to be non-differential2 and resulted in underestimation of association estimates in our previous analysis.30 Our participants are not representative of all New Zealand cyclists; however, this may have minimal impact on the association estimates.31 Moreover, our participants represented a wide variation with regard to demographics, cycling exposure and experience. Finally, our analysis was limited to seven exposure-outcome associations of primary interest to this study.

Interpretation

In this study, the response rate to a follow-up survey was 60%. If attrition does not depend on exposures, confounders and outcomes (ie, missing completely at random) or depends on exposures and confounders but not on outcomes (ie, missing at random), there is no evidence of serious bias with up to 60% of attrition, according to one study.32 But if otherwise, even low levels of attrition (20% or less) can bias association estimates.

As with many other studies,3,5 loss to follow-up did not occur at random in this cohort. The follow-up responders were older, more likely to be university educated, and more likely to reside in urban areas and in least deprived neighbourhoods. This socioeconomic disparity may be attributed to the web-based data collection used in the study; however, similar findings have been reported from other cohorts, regardless of the method used to collect data.12–14,17–19,33–35 Other factors, such as participants’ IQ scores, cognitive functioning and personality characteristics, may also influence follow-up response as reported previously,35–37 but we were not able to assess these variables in our cohort.

Cycling characteristics also predicted follow-up response. The responders were more experienced cyclists and often rode in a bunch, off-road or in the dark. It is plausible that cycling enthusiasts are more willing to continue participation in research on cycling and its safety. Our findings showed that the responders were less likely to engage in risk behaviours such as not using conspicuity aids and listening to music while riding. This is consistent with previous research, although unrelated to cycling, showing that lifestyle risk factors, such as smoking,9,16,17,38,39 alcohol abuse,12,40 physical inactivity16,40,41 and poor diet,40 are more common in non-responders or late responders.

Additionally, bicycle crash outcomes differed by response status. The responders were more likely to experience crashes during follow-up. By contrast, other studies not concerned with injury have reported poorer health outcomes and higher mortality among non-responders11–15,17,33,42 although this was not always the case.39

Loss to follow-up, however, caused only modest bias in the selected associations. The magnitude of the bias may depend on the strength of associations of exposures and outcomes to attrition, and also on whether such associations are direct or via other common causes.43 In this study, the largest bias in crude estimates appeared to be due to unbalanced distribution of exposures among the follow-up participants who did not experience a crash, but the bias was attenuated after adjustment for baseline covariates. This may be because most of the covariates predicted follow-up response and were also related to exposure and/or outcome, and adjustment for common causes (or their proxies) will attenuate the bias.2,44,45 For example, in the case of M-bias where follow-up response is influenced by factors that also determine exposure and/or outcome, adjustment for those factors will block the back-door pathway opened up by conditioning on follow-up response. In the adjusted estimates, the bias was modest and ranged between −10% and +9%, suggesting that effect measures to be estimated in our future analyses based on exposures measured in the resurvey may not be substantially biased. This is also in accordance with the findings from various cohorts in fields other than injury.11–19

Conclusions

Loss to follow-up was systematic and differential by demographic, cycling and behavioural risk characteristics as well as crash outcomes. This overestimated the incidence of bicycle crashes but did not substantially bias association estimates. The findings are reassuring, but strictly speaking, apply only to selected associations in the Taupo Bicycle Study. Attempts should be made to minimise attrition and to estimate associated biases in any prospective study.

What is already known on the subject

  • Loss to follow-up, if related to exposures, confounders and outcomes of interest, may bias association estimates.

  • Several studies have attempted to estimate the magnitude and direction of such bias but rarely involved cohorts followed up for injury outcomes.

What this study adds

  • We investigated bias from loss to follow-up in a prospective cohort study involving 2590 New Zealand cyclists.

  • Loss to follow-up was differential by demographic, cycling and behavioural risk characteristics as well as crash outcomes.

  • Yet, loss to follow-up did not substantially bias seven association estimates of primary research interest.

Acknowledgments

We thank the participating cyclists and organisers of the Lake Taupo Cycle Challenge for their support, and Professor John Langley, Professor Anthony Rodgers and Dr Simon Thornley for their initial contribution to the study. Our thanks also go to the Accident Compensation Corporation, Ministry of Health and New Zealand Transport Agency for the provision of bicycle crash data, and to Mr Roy Miodini Nilsen from the University of Bergen, Norway and Ms Silvia Stringhini from the Institute of Social and Preventive Medicine, Lausanne University Hospital, Switzerland, for their kind assistance with bootstrap analyses.

References

Footnotes

  • Contributors STT contributed to the conception and design of the study, acquisition, analysis and interpretation of data and drafting of the manuscript. AW and SA contributed to the conception and design of the study, interpretation of data and revision of the manuscript. All authors read and approved the final manuscript.

  • Funding This work was supported by the Health Research Council of New Zealand (grant number 09/142).

  • Competing interests None.

  • Patient consent Obtained.

  • Ethics approval University of Auckland Human Participants’ Ethics Committee.

  • Provenance and peer review Not commissioned; externally peer reviewed.