Assessing the potential for bias in direct observation of adult commuter cycling and helmet use
  1. John D Kraemer,
  2. Heather N Zaccaro,
  3. Jason S Roffenbender,
  4. Sabeeh A Baig,
  5. Megan E Graves,
  6. Katherine J Hauler,
  7. Aamir N Hussain,
  8. Faith E Mulroy
  1. Department of Health Systems Administration, Georgetown University, Washington, DC, USA
  1. Correspondence to Professor John D Kraemer, Department of Health Systems Administration, Georgetown University, 3700 Reservoir Road, NW, 231 St Mary's Hall, Washington, DC 20007, USA; jdk32{at}


Objectives Bicycling and helmet surveillance, research, and programme evaluation depend on accurate measurement by direct observation, but it is unclear whether weather and other exogenous factors introduce bias into observed counts of cyclists and helmet use.

Methods To address this issue, a time series was created of cyclists observed at two observation points in Washington, DC, at peak commuting times and locations between September 2012 and February 2013. Using multiple linear regression with Newey-West SEs to account for possible serial correlation, the association between various factors and cyclist counts and helmet use was investigated.

Results The number of cyclists observed per 1 h session was significantly associated with predicted daily high temperature, chance of rain, and actual rain. Additionally, fewer cyclists were observed on Fridays. Helmet use was significantly lower during evening commutes than morning commutes and also lower on Fridays. Helmet use was not associated with weather variables. Controlling for observable cyclist characteristics weakened the association between helmet use and the time of day and day of the week, but it did not eliminate that association.

Conclusions Direct observation to measure commuter cycling trends or evaluate interventions should control for weather and day of week. Measurement of helmet use is unlikely to be meaningfully biased by weather factors, but time of day and day of week should be taken into account. Failing to control for these factors could lead to significant bias in assessments of the level of, and trends in, commuter cycling and helmet use.

Over the last decade, bicycle commuting has risen rapidly in many cities, and cities increasingly encourage cycling and other active transport modalities through enhanced infrastructure, bicycle sharing programmes (including the well-used Capital Bikeshare programme in Washington, DC, USA), and public communication campaigns.1 Shifting commuters to active transport can improve urban quality of life by reducing traffic congestion and air pollution.2,3 At the individual level, shifting daily commutes to cycling is associated with net positive health outcomes, wherein a slightly increased injury risk is outweighed by improved cardiovascular health.4,5 At the same time, commuters’ cumulative exposure to injury risk may be higher than that of recreational cyclists because of the frequency with which they ride in congested and unpredictable traffic environments. Helmet use reduces this risk.6 However, almost no data exist on cycling commuters, either with respect to helmets and safety or with respect to methodological considerations for improving and assessing interventions aimed at this population.

Direct observation of bicyclists is often considered to be the preferred means of assessing trends in cycling prevalence as well as helmet use because it is less susceptible to recall errors and social desirability biases than survey-based self-reported helmet use.7 For this reason, many well-designed assessments of the prevalence of helmet use and trends in cycling and helmet use rely on direct observation.8–17 However, direct observation is time-intensive and resource-intensive. Therefore, measurement to assess trends in cycling and helmet use, as well as measurement to evaluate helmet or physical activity interventions, often has to come from a relatively small number of observation sessions conducted periodically. This has the potential to leave estimates of cycling frequency and helmet use vulnerable to bias from unrecognised factors, potentially harming the validity of research, injury surveillance, and programme evaluation.

Little is known about the extent to which exogenous factors, such as seasonal changes in temperature, affect either cycling rates or helmet use. To the extent information exists, it tends to focus on childhood cycling, which has historically been the focus of interventions.7 However, limited survey data suggest that less pleasant weather (such as precipitation), shorter days, and other conditions deter cycling, thus causing commuters to instead choose alternative modes of transportation. For example, Lusk and colleagues found substantial differences in cycling frequency in Portland and Vancouver by month (highest in the summer and lowest in the winter), day (lowest on weekends), and hour (peaks during the morning and evening rush hour).18 Fuller and colleagues identified significant effects of temperature and precipitation on the likelihood of self-reported cycling in Montreal, including among non-recreational cyclists.19 Nankervis identified mild effects of inclement weather on bicycle commuting by college students in Melbourne.20

By contrast with the existing, albeit sparse, data on cycling rates, there are no published studies on whether similar factors influence helmet use among adult commuters. Customarily, helmet use is measured as the proportion of observed cyclists who are wearing a helmet. If the composition of cyclists changes in correlation with weather, day of the work week, or other factors—for example, if less pleasant weather reduces the number of casual cyclists while more avid riders persevere—there is a potential for observed helmet use also to change. In this instance, failing to control for such factors could bias estimates of helmet use, trends in observed helmet use over time, and inferences about interventions’ effectiveness when before-after evaluations are used. If, on the other hand, these factors only influence the number of observed cyclists without changing their composition, policymakers and researchers can be more confident in estimates of, and trends in, helmet use based on periodic observations as well as straightforward evaluations of interventions.

This study seeks to answer two questions. First, to what extent does the number of observed adult bicycle commuters change as a function of weather, work week, and other factors? Second, to what extent does helmet use change as a function of these same factors, and, if a meaningful change is detected, to what extent is that change mediated by observable changes in cycling populations?


This study used prospective direct observation at two sites in Washington, DC, to form a time series with observation sessions as the unit of analysis. Sites were chosen due to their geographic location as corridors within the city for daily bicycle commuting. Additionally, they were areas not closely connected to recreational riding routes (eg, bike trails) to maximise the likelihood that observed cyclists were commuters. Data were collected over 98 sessions from September 2012 through the end of February 2013. This time frame was chosen to correspond to the period for which school is in session (limiting youth and non-commuter cycling), and to exclude tourist season (which, beginning in March, substantially changes cycling in the city).

Observation sessions were 60 min in length and timed to correspond to rush hour based on District of Columbia Department of Transportation data: 7:45–8:45 and 17:15–18:15. The morning rush hour provided the study's primary data, but evening observations were included for a subset of days at one site to enable comparisons by time of day. Data were not collected on public holidays and in the period surrounding the winter holidays because they substantially alter bicycle commuting patterns. Additionally, while data were normally collected during inclement weather, two sessions were cancelled because of a hurricane.

Observation was conducted by two-person teams, who received training in observation methods prior to their first session. The teams were situated in the middle of the assigned block and had access to covered, heated locations to enable observation during inclement weather. To avoid duplication or missed observations, one observer counted cyclists travelling in one direction and the other counted cyclists travelling in the opposite direction. Data on all observed cyclists, whether riding in the road or on the sidewalk, were collected unless the observer was certain that a cyclist had previously been counted during that observation session. Cyclists under age 18 years were excluded from analysis because their decision making about helmet use is likely influenced by different factors than that of adult commuting cyclists (such as parental decision making, and the District of Columbia's youth helmet law).21,22

Georgetown University's institutional review board determined that human subjects committee oversight was not required for this study.


Two primary outcome variables were assessed: the number of cyclists observed, and the proportion wearing a helmet. Both were assessed by direct observation. Other cyclist characteristics previously found to be associated with helmet use were also assessed by direct observation. These characteristics were cyclists’ sex, age (dichotomised as 35 years and under or over 35 years based on physical characteristics), race (black, white, Asian, or other), whether the cyclist was using a Capital Bikeshare bicycle, and whether the cyclist was wearing compression pants or compression shorts (as a proxy for being a more avid cyclist). The data collection instrument was based on one previously found to be reliable under similar observation conditions,23 and revisions were piloted twice.

The following exogenous weather variables were collected: temperature, wind chill/heat index, daily high and low temperatures, whether it was raining at the beginning of the session, the daily and hourly rain chance, wind speed, sunrise and sunset times, and a topline weather condition (eg, ‘sunny’). All data were recorded based on Weather Channel online forecasts through automatically scheduled emails, with field observers recording the same information as a redundancy. The Weather Channel's online forecasts were used because they are the most commonly used forecast in the USA.24 For all observation periods, three sets of conditions were collected: actual conditions at the beginning of the observation session; conditions predicted as of 7:00 the same day, both for the day as a whole and for the hour of observation; and conditions predicted as of approximately 23:00 the night before. Additionally, observers recorded whether the road was wet, snowy or icy, and whether there was heavy fog at the beginning of every session. Finally, day of the week, and whether observations were taken during the morning or evening rush hour, were also recorded.

Statistical analysis

The approximate normality of variables was assessed visually with histograms. Variables measuring the percent chance of rain were categorised (no chance, less than 50%, and 50% or greater) because of strong non-normality. The per-session count of cyclists was log transformed to improve normality, reduce heteroskedasticity, and enable inclusion of both observation locations, which had different baseline numbers of cyclists observed per session, in the same model.
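The two transformations described above can be sketched as follows. This is a minimal illustration, not the authors' Stata code: the function names and category labels are hypothetical, and the natural logarithm is assumed because the percentage interpretations reported in the results are consistent with it.

```python
import math


def categorise_rain_chance(percent):
    """Map a forecast rain chance (0-100) to the three categories in the text."""
    if percent <= 0:
        return "no_chance"
    if percent < 50:
        return "under_50"
    return "50_or_more"


def log_count(cyclists):
    """Natural log of a per-session cyclist count (the count must be positive)."""
    return math.log(cyclists)
```

On the log scale a site dummy shifts the intercept, so a busy site and a quiet site can share the same proportional effects of weather even though their raw counts differ.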

Models were fit using multiple linear regression. Because there was no a priori reason to believe that certain explanatory variable combinations were more likely to affect cycling behaviour, Akaike's Information Criterion was used to select the combination of explanatory variables that best balanced explanatory power and simplicity. To select weather-related variables, the best combination of variables from each collection approach (eg, actual, night-before, and morning-of-collection) was identified. Then, combinations of the best actual and forecasted weather variables were tested to see if a mix of forecasted and actual weather conditions improved the overall model.
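As a rough illustration of how Akaike's Information Criterion trades fit against simplicity, the sketch below scores three candidate models and keeps the one with the lowest value. The residual sums of squares, coefficient counts, and model names are invented for illustration and are not results from this study. The formula assumes a linear model with normally distributed errors, and conventions differ on whether the error variance is counted as a parameter.

```python
import math


def gaussian_aic(rss, n_obs, n_params):
    """AIC = -2 * log-likelihood + 2 * k for a linear model with normal errors."""
    log_lik = -0.5 * n_obs * (math.log(2 * math.pi) + math.log(rss / n_obs) + 1)
    return -2 * log_lik + 2 * n_params


# Hypothetical candidates: name -> (residual sum of squares, number of coefficients).
candidates = {
    "temperature + rain chance": (25.0, 4),
    "temperature + rain chance + Friday": (20.0, 5),
    "temperature + rain chance + Friday + wind": (19.9, 6),
}
n_sessions = 97  # 98 sessions minus the one dropped as an outlier

scores = {name: gaussian_aic(rss, n_sessions, k) for name, (rss, k) in candidates.items()}
best = min(scores, key=scores.get)  # the lowest AIC is preferred
```

In this made-up example the fourth variable lowers the residual sum of squares only slightly, so the penalty of 2 per extra coefficient outweighs the gain and the smaller model is preferred.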

After identifying the best set of explanatory variables, three regression analyses were conducted: (1) the variables’ effect on the number of observed cyclists per session, (2) their effect on the proportion of cyclists wearing helmets and (3) their effect on the proportion of cyclists with each observable personal characteristic. Any personal characteristic significantly associated with one or more explanatory variables was added to the model to see if any changes in helmet use might be explainable by changes in the population of riders. Observation sessions with fewer than 15 cyclists were excluded from helmet use analyses.
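The session-level helmet outcome and the 15-cyclist exclusion rule could be computed along these lines. The record layout, field names, and example sessions are hypothetical; only the threshold of 15 comes from the text.

```python
MIN_CYCLISTS = 15  # sessions with fewer cyclists are left out of helmet analyses


def helmet_percentage(helmeted, cyclists):
    """Percentage of observed cyclists in a session who wore a helmet."""
    return 100.0 * helmeted / cyclists


def helmet_outcomes(sessions, minimum=MIN_CYCLISTS):
    """Return {session id: helmet percentage} for sessions meeting the minimum."""
    return {
        s["id"]: helmet_percentage(s["helmeted"], s["cyclists"])
        for s in sessions
        if s["cyclists"] >= minimum
    }


# Made-up sessions: session "B" has only 12 cyclists and is excluded.
sessions = [
    {"id": "A", "cyclists": 40, "helmeted": 30},
    {"id": "B", "cyclists": 12, "helmeted": 9},
    {"id": "C", "cyclists": 15, "helmeted": 9},
]
outcomes = helmet_outcomes(sessions)
```

A minimum count limits the influence of sessions in which one or two cyclists would swing the observed percentage by several points.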

Because it is possible that observations at each site are serially correlated, Newey-West SEs, which are robust to serial correlation and heteroskedasticity, were used to a maximum of three lag periods (which corresponds to 1 week of observation). Analyses used Stata V.13.1 (StataCorp, College Station, Texas, USA).
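To show what the Newey-West adjustment does, the sketch below applies it to the simplest possible case, the standard error of a mean (a regression on a constant only), using Bartlett weights up to three lags. It is a simplified illustration under that assumption, not the multivariable estimator used in the study, and it omits any small-sample correction that statistical packages may apply. The function names are hypothetical.

```python
import math


def bartlett_weights(max_lag):
    """Linearly declining weights 1 - l / (L + 1) for lags l = 1..L."""
    return [1.0 - lag / (max_lag + 1.0) for lag in range(1, max_lag + 1)]


def newey_west_se_of_mean(values, max_lag=3):
    """Newey-West standard error of the mean of an equally spaced series."""
    n = len(values)
    mean = sum(values) / n
    resid = [v - mean for v in values]

    def autocov(lag):
        # Sample autocovariance of the residuals at the given lag.
        return sum(resid[t] * resid[t - lag] for t in range(lag, n)) / n

    # Long-run variance: lag-0 variance plus weighted autocovariances.
    long_run_var = autocov(0)
    for lag, weight in zip(range(1, max_lag + 1), bartlett_weights(max_lag)):
        long_run_var += 2.0 * weight * autocov(lag)
    return math.sqrt(long_run_var / n)
```

With `max_lag=0` the function reduces to the usual standard error based on the lag-0 variance alone. When neighbouring sessions are positively correlated, the added autocovariance terms widen the standard error.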


Three thousand eight hundred and eighty-nine observations were made during 98 data collection sessions. One session (with 17 observations) was dropped from the final analysis as an outlier. An additional 16 sessions (with 167 observations) were excluded only from helmet analyses because they had fewer than 15 observations. Observation session characteristics are summarised in table 1.

Table 1

Characteristics of observation sessions

The best model for predicting the number of cyclists observed per session and helmet use included the daily predicted high temperature and percent chance of rain as of the morning of the observation session, whether it actually rained during the observation session, and whether observation occurred on a Friday, as well as a dummy variable for observation location. In the model, each degree increase in the predicted daily high temperature increased the number of cyclists by 2.2% (β=0.021; 95% CI=0.017 to 0.026). Compared with no chance of rain, less than a 50% chance did not significantly alter the number of cyclists (β=−0.087; 95% CI −0.220 to 0.046). However, a precipitation chance of 50% or greater reduced the number of cyclists observed by 40% (β=−0.506; 95% CI −0.821 to −0.191), and actual precipitation further decreased the number of observed cyclists by 28% (β=−0.326; 95% CI −0.646 to −0.006). On Fridays, cyclist counts were 21% lower (β=−0.231; 95% CI −0.412 to −0.050) (see table 2).

Table 2

Change in the observed hourly number of cyclists per unit change in explanatory variables
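Because the cyclist count was log transformed, each coefficient β corresponds to a percentage change of (exp(β) − 1) × 100 in the number of cyclists. The sketch below applies that conversion to the coefficients reported in the text. The function name and dictionary keys are illustrative, and a natural-log transformation is assumed.

```python
import math


def percent_change(beta, units=1.0):
    """Percentage change in the count implied by a log-scale coefficient."""
    return (math.exp(beta * units) - 1.0) * 100.0


# Coefficients as reported in the text.
reported = {
    "daily high temperature (per degree)": 0.021,
    "rain chance 50% or greater": -0.506,
    "actual rain": -0.326,
    "Friday": -0.231,
}
changes = {name: percent_change(beta) for name, beta in reported.items()}

# Effect of a 10-degree difference in the predicted daily high temperature.
ten_degrees = percent_change(0.021, units=10)
```

The rounded values match the reported reductions of 40%, 28%, and 21%. The temperature coefficient of 0.021 converts to roughly 2.1% per degree, close to the reported 2.2%; the small gap may reflect rounding of the published coefficient. A 10-degree difference compounds to a little over 23%.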

The percentage of cyclists wearing helmets was about 10 percentage points lower at evening rush hour than at morning rush hour (β=−10.6; 95% CI −14.3 to −6.9), holding observation site constant, and about 5 percentage points lower on Fridays (β=−5.20; 95% CI −9.24 to −1.17). However, the percentage of cyclists wearing helmets was not significantly associated with any of the weather variables. Among the non-significant weather variables, none would have meaningfully changed the observed percentage of helmet use even if statistically significant; for example, a 25° difference in the daily high temperature would be associated with a change of less than one-half of a percentage point in helmet use (see table 3).

Table 3

Change in the percentage of cyclists wearing helmets per unit change in explanatory variables

The percentages of cyclists who were over age 35 years, wearing compression pants, using a Bikeshare bicycle, of non-white race, and of female sex were all at least marginally significantly associated with one or more explanatory variables in the model (not shown in data tables). The percentage of cyclists over age 35 years decreased slightly with warmer temperatures (β=−0.182; 95% CI −0.330 to −0.033) and increased on Fridays (β=5.41; 95% CI 3.22 to 10.49). The percentage of cyclists wearing compression pants decreased during the evening rush hour (β=−6.55; 95% CI −9.33 to −3.77) and on Fridays (β=−5.87; 95% CI −9.03 to −2.71). Non-white cyclists were more common during evening rush hour (β=4.85; 95% CI 2.25 to 7.45). Bikeshare use was marginally significantly less common when it was raining (β=−6.65; 95% CI −13.6 to 0.28), and female cyclists were marginally significantly less common when there was a >50% chance of rain (β=−2.94; 95% CI −6.43 to 0.53). However, when personal characteristics were singly added to the model, only the percentage of cyclists using Bikeshare bicycles (β=−0.441; 95% CI −0.680 to −0.202) and the percentage wearing compression pants (β=0.324; 95% CI 0.015 to 0.633) were significantly associated with helmet use.

When personal characteristics were added, the best overall model was one that incorporated Bikeshare use and compression pant wearing but excluded the other personal characteristics. In this model (see table 3), the coefficients of all seasonal variables shift closer to the null, and the association between helmet use and Friday observation sessions is rendered non-significant. (The association with evening sessions is weakened by about one-quarter but remains significant.) This suggests that changes in the ridership population partially mediate the changes in observed helmet use.


This study has two major implications. First, the number of cyclists is strongly sensitive to external factors, including temperature, precipitation, and day of the week. This suggests that studies assessing the impact of interventions to increase cycling must control for these factors in order for inferences to be valid. While some factors are fairly obvious (such as the effect of rain), seemingly innocuous factors also have the potential to introduce bias. For example, a 10° change in the daily high temperature could be expected to change the number of cyclists by more than 20%, which is larger than the expected effect of many interventions. This finding is consistent with prior research that assessed cycling behaviour by self-report.19 ,20

The second implication is that the percentage of cyclists wearing a helmet is not significantly influenced by weather variables. This suggests that evaluations of helmet interventions that do not control for weather variables are unlikely to be seriously biased. This has practical significance for helmet programme evaluations because trying to schedule observation sessions around largely random, day-to-day variation in weather would greatly complicate study logistics, if it were necessary.

This study did, however, find that morning rush hour cyclists were approximately 5% less likely to wear helmets on Fridays than on other days, and evening rush hour cyclists were approximately 10% less likely to wear helmets than morning rush hour cyclists. The reasons for this are not certain, but it is likely that somewhat different underlying populations were observed in the evenings and on Fridays. Evening rush hour is likely less compressed than its morning equivalent, with some bike commuters who work longer hours likely travelling after the observation period. Alternative work schedules with every second Friday off are fairly common in the District of Columbia, and may partially explain Friday differences.25 Additionally, evening and Friday observation sessions observed a significantly lower percentage of cyclists wearing compression pants—a proxy for particularly avid cyclists, who are disproportionately likely to wear helmets. In any event, helmet use evaluation should not treat different days or times of day as equivalent.


As with all studies of this type, one possible limitation is that other changes may have occurred simultaneously with changes in measured explanatory variables and confound observed associations. For two reasons, however, this is not likely to be a substantial risk. First, observation continued through the heart of the winter into the period when temperatures began to warm. Second, there was significant day-to-day variation in precipitation risk and temperature, so those variables are partially independent of calendar time and, therefore, of secular trends. Because the weather-related variables are exogenous, a reasonably strong causal inference can be made for associations with cycling and helmet use.

This study was not able to assess the effect of very hot temperatures. Temperatures during observation periods ranged from 19° to 83° Fahrenheit. Washington, DC, receives an influx of tourists from March through the summer, which substantially changes the cycling population in the city. As a result, changes in the number of observed cyclists and helmet use during these periods would be subject to substantial bias and difficult to interpret. It seems likely that, above some threshold of discomfort, high temperatures discourage cycling in a manner similar to very cold temperatures. While this is a topic for further investigation, it does not undermine the general finding that temperature significantly influences the number of observed cyclists. Because helmets make riding warmer in the summer, however, further study of the association between temperature and helmet use in very hot weather would be valuable.

The extent to which these findings can be extrapolated to non-commuters is uncertain. This study focused on times and locations likely to capture adult bicycle commuters because they are a growing and poorly understood cycling population. This study does not directly provide information about recreational cyclists or children, though it is plausible that they would be affected by weather and other factors in a manner similar to commuters. On the other hand, these findings are likely generalisable to other cities’ adult commuter populations, though cyclists in very warm or cold settings may be more accustomed to different temperature ranges and the baseline level of helmet use may be lower in some European settings.

Finally, data quality is always a potential concern. In particular, accurate estimation of age (even when broadly categorised) and race can be difficult because of limited observation time for each cyclist. However, the accuracy of estimating these characteristics by direct observation is generally accepted, and direct observation has previously been found to be reliable for cycling using the same methods and categories employed in this study23 and for other road safety behaviours.26 Data entry accuracy was checked by visual inspection of every tenth record, and entry errors were rare (0.16%, which corresponds to approximately one error per 600 fields).


Understanding how exogenous factors influence the results of studies using direct observation of cyclists is important for understanding trends in cycling and helmet use over time and for evaluating particular interventions. This study suggests that interpretations of trends in observed numbers of commuter cyclists must be made with care unless weather and work-week factors are carefully controlled. On the other hand, the observed proportion of cyclists wearing helmets is less sensitive to these factors—particularly weather-related factors—so trends in helmet use from observations on fixed days at fixed times need not be as closely controlled. Ultimately, both findings reinforce a fundamental premise of public health surveillance and evaluation: relevant contextual factors must be understood in order for data and trends to be interpreted validly.

What is already known on the subject

  • Converting commuter trips from vehicles to active transport—including cycling—produces net health benefits but may increase injury risk.

  • Bicycle helmets reduce the risk of head and facial injury among those involved in crashes.

  • Direct observation is commonly used to measure helmet use and cycling behaviour, but there are few studies of factors that may bias observed trends in either cycling or helmets, particularly for commuters.

What this study adds

  • The number of observed commuter bicyclists is significantly associated with precipitation, temperature and day of the week, so these factors should be controlled for in studies and evaluations.

  • Commuter cyclists’ helmet use is not significantly associated with weather factors in this study, but it is associated with time of day and day of the week.

  • Assessments of helmet use should take these factors into account to reduce the risk of bias in inferences.


  • Contributors JDK, HNZ, and JSR were responsible for the conceptualisation and design of the study. All authors were responsible for data collection. JDK was responsible for data analysis. JDK drafted the manuscript, and all authors participated in the revision process. All authors approved the final version.

  • Competing interests None.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Original data are available from the corresponding author and will be provided upon request.
