Article Text
Abstract
Background Motor-vehicle collisions (MVCs) are a leading cause of child active transportation (AT) injuries in Canada. Efforts to improve built environment (BE) safety are often reactive, occurring after an incident or in response to citizen complaints. Machine learning (ML) models may be well suited to capturing the complexity of the road system. Developing an ML algorithm that can predict injury rates and AT prevalence could therefore give municipalities an important tool for preventing child injuries and improving child AT.
Methods The Canadian CHASE spatial database, which includes population demographic, BE, school, transportation, and MVC data, will be used. Data were collected in five Canadian municipalities and include over 8,000 police-reported MVC injuries to child (age 1–17) bicyclists and pedestrians. Injury and AT data aggregated by Dissemination Area will be further aggregated to the community level. The dataset will be split into training (80%) and validation (20%) sets, with models trained using 10-fold cross-validation. Prediction accuracy measures will be estimated for both regression models and regression trees.
Results The models with the highest prediction accuracy on the validation data will be used to predict the prevalence of child AT and the rate of child MVCs per community per year, based on the BE characteristics of each community.
Conclusion This study aims to develop ML models for predicting child AT prevalence and MVC injury rates in communities across Canadian municipalities, providing additional tools for allocating road safety resources efficiently. More effective BE interventions may reduce MVC injury rates while increasing AT prevalence among Canadian children.
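The evaluation design outlined in the Methods (an 80/20 split, 10-fold cross-validation on the training portion, and a comparison of a regression model with a regression tree on the held-out data) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the study's implementation: the data are synthetic, the single predictor stands in for a hypothetical BE feature, the tree is restricted to depth one, RMSE is an assumed accuracy measure, and all function and variable names are invented for the example.

```python
import math
import random


def fit_linear(xs, ys):
    """Ordinary least squares with one predictor; returns a predict function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_stump(xs, ys):
    """Depth-one regression tree: a single split minimising squared error."""
    pairs = sorted(zip(xs, ys))
    best = (float("inf"), None, None, None)  # (sse, threshold, left mean, right mean)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, lm, rm)
    _, threshold, lm, rm = best
    if threshold is None:  # no valid split: fall back to the overall mean
        mean = sum(ys) / len(ys)
        return lambda x: mean
    return lambda x: lm if x <= threshold else rm


def rmse(model, xs, ys):
    """Root-mean-square error of a fitted model on (xs, ys)."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def cross_val_rmse(fit, xs, ys, k=10):
    """Mean RMSE over k folds; every observation is held out exactly once."""
    n = len(xs)
    scores = []
    for fold in range(k):
        held_out = set(range(fold, n, k))  # data are pre-shuffled, so striding is fine
        x_tr = [xs[i] for i in range(n) if i not in held_out]
        y_tr = [ys[i] for i in range(n) if i not in held_out]
        x_te = [xs[i] for i in sorted(held_out)]
        y_te = [ys[i] for i in sorted(held_out)]
        scores.append(rmse(fit(x_tr, y_tr), x_te, y_te))
    return sum(scores) / k


# Synthetic stand-in for community-level data (NOT CHASE data): one hypothetical
# BE feature per community and a simulated injury rate.
rng = random.Random(0)
x_all = [rng.uniform(0, 10) for _ in range(200)]
y_all = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in x_all]

# 80% training / 20% validation split on shuffled indices.
indices = list(range(len(x_all)))
rng.shuffle(indices)
cut = int(0.8 * len(indices))
x_train = [x_all[i] for i in indices[:cut]]
y_train = [y_all[i] for i in indices[:cut]]
x_valid = [x_all[i] for i in indices[cut:]]
y_valid = [y_all[i] for i in indices[cut:]]

candidates = {"linear regression": fit_linear, "regression stump": fit_stump}

# 10-fold cross-validated error on the training portion only.
cv_scores = {name: cross_val_rmse(fit, x_train, y_train) for name, fit in candidates.items()}

# Refit each candidate on the full training set and score it on the validation set.
valid_scores = {name: rmse(fit(x_train, y_train), x_valid, y_valid)
                for name, fit in candidates.items()}
best_name = min(valid_scores, key=valid_scores.get)

for name in candidates:
    print(f"{name}: CV RMSE={cv_scores[name]:.3f}, validation RMSE={valid_scores[name]:.3f}")
print("selected:", best_name)
```

In this sketch the validation set is never used during fitting or cross-validation, so the validation RMSE serves as an estimate of out-of-sample error for choosing between the two model families. A real analysis would use many BE predictors and deeper trees, most likely through an established ML library.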