455 Using machine learning to predict child active transportation prevalence and injury rates
  1. Tate HubkaRao (1,2)
  2. Marie-Soleil Cloutier (3)
  3. Alberto Nettel-Aguirre (4,2)
  4. Brent Hagel (1,2,5,6,7)
  1. Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Canada
  2. Department of Pediatrics, Cumming School of Medicine, University of Calgary, Calgary, Canada
  3. Institut national de la recherche scientifique, Centre Urbanisation Culture Société, Montreal, Canada
  4. Centre for Health and Social Analytics, National Institute for Applied Statistics Research Australia, University of Wollongong, Wollongong, Australia
  5. Sport Injury Prevention Research Centre, Faculty of Kinesiology, University of Calgary, Calgary, Canada
  6. Alberta Children’s Research Institute, University of Calgary, Calgary, Canada
  7. O’Brien Institute for Public Health, University of Calgary, Calgary, Canada


Background Motor-vehicle collisions (MVCs) are a leading cause of child active transportation (AT) injuries in Canada. Efforts to improve built environment (BE) safety are often reactive, following an incident or citizen complaints. Machine learning (ML) models may be able to address the complexity of the road system. Developing an ML algorithm that can predict injury rates and AT prevalence could therefore give municipalities an important tool to help prevent child injuries and increase child AT.

Methods The Canadian CHASE spatial database, which includes population demographic, BE, school, transportation, and MVC data, will be used. Data were collected in five Canadian municipalities and include over 8,000 police-reported MVC injuries to child (aged 1–17) bicyclists and pedestrians. Injury and AT data aggregated by Dissemination Area will be further aggregated to the community level. The dataset will be split into training (80%) and validation (20%) sets, with models trained using 10-fold cross-validation. Prediction accuracy measures will be estimated for both regression models and regression trees.
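
As an illustration only, the split-and-validate design described above can be sketched in a few lines of Python. Everything in this sketch is an assumption rather than the study's implementation: the data are synthetic (200 hypothetical communities with a single made-up BE score and a made-up injury rate), the regression model is a one-predictor least-squares fit, the regression tree is reduced to a single split, and root mean squared error (RMSE) stands in for the unspecified prediction accuracy measures.

```python
import math
import random
import statistics


def train_validation_split(rows, validation_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction of them for validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def k_folds(rows, k=10):
    """Yield (fit_rows, held_out_rows) pairs for k-fold cross-validation."""
    for i in range(k):
        held_out = rows[i::k]
        fit_rows = [r for j, r in enumerate(rows) if j % k != i]
        yield fit_rows, held_out


def fit_linear(rows):
    """Least-squares line y = a + b * x for a single predictor."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in rows) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def fit_stump(rows):
    """Depth-1 regression tree: the single split minimising squared error."""
    ordered = sorted(rows)
    best = None
    for i in range(1, len(ordered)):
        if ordered[i][0] == ordered[i - 1][0]:
            continue  # cannot split between identical predictor values
        left = [y for _, y in ordered[:i]]
        right = [y for _, y in ordered[i:]]
        ml, mr = statistics.fmean(left), statistics.fmean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            threshold = (ordered[i - 1][0] + ordered[i][0]) / 2
            best = (sse, threshold, ml, mr)
    if best is None:  # no valid split: fall back to predicting the mean
        mean_y = statistics.fmean([y for _, y in rows])
        return lambda x: mean_y
    _, threshold, ml, mr = best
    return lambda x: ml if x <= threshold else mr


def rmse(model, rows):
    """Root mean squared prediction error of a fitted model on rows."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in rows) / len(rows))


def cross_validated_rmse(fit, rows, k=10):
    """Average held-out RMSE across the k folds."""
    return statistics.fmean([rmse(fit(f), h) for f, h in k_folds(rows, k)])


# Synthetic stand-in data: (hypothetical BE score, hypothetical injury rate).
rng = random.Random(42)
communities = []
for _ in range(200):
    be_score = rng.uniform(0, 10)
    communities.append((be_score, 2.0 + 0.5 * be_score + rng.gauss(0, 0.3)))

# 80% training, 20% validation; 10-fold cross-validation within training.
train, validation = train_validation_split(communities, validation_fraction=0.2)
linear_cv = cross_validated_rmse(fit_linear, train, k=10)
stump_cv = cross_validated_rmse(fit_stump, train, k=10)

# Refit the better candidate on all training rows, then score it once
# on the held-out validation rows.
best_name, best_fit = min(
    [("linear regression", fit_linear), ("regression stump", fit_stump)],
    key=lambda pair: cross_validated_rmse(pair[1], train, k=10),
)
validation_rmse = rmse(best_fit(train), validation)
print(best_name, round(linear_cv, 3), round(stump_cv, 3), round(validation_rmse, 3))
```

Because the synthetic outcome is linear in the predictor, the straight-line model is favoured here; that says nothing about which model family would perform better on the CHASE data. A real analysis would likely use many predictors and an established library rather than these hand-written fits.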

Results The models with the highest prediction accuracy on the validation data will be used to predict the prevalence of child AT and the rate of child MVCs per community per year, based on each community's BE characteristics.

Conclusion This study aims to develop ML models that can predict child AT prevalence and MVC injury rates in communities across Canadian municipalities, providing additional tools for allocating road safety resources efficiently. Improving the effectiveness of BE interventions may reduce MVC injury rates while increasing AT prevalence among Canadian children.
