Abstract
This research explores crash severity in the population of crashes resulting in injuries and/or fatalities. Real-world crash data were collected from the records of the Portuguese Police Republican National Guard for the Porto metropolitan area, covering the period 2006–2011.
The goal of this study is to develop a crash-severity prediction model with application to crash analysis and prevention. The target is a binary variable, FatalSIK (=1 for severe crashes, 0 otherwise). In this paper, the effects of vehicle characteristics such as weight, engine size, wheelbase and registration year (age of vehicle) were analysed with data mining methodology in order to extract patterns from the predictors and relate them to the occurrence of injuries and fatalities in a crash.
A predictive approach using Classification and Regression Trees (CART) was adopted. To handle the highly unbalanced data, oversampling was applied, followed by correction of the prior probabilities to those of the original crash population. For all crashes, the weight of the vehicle was the most important variable for tree splitting, and crashes involving vehicles with weight≥1743.5 kg were the most severe (χ² p value<0.1382). For two-vehicle crashes, weight was also the most important variable, followed by speed. The highest percentage of severe crashes occurred in collisions involving heavier vehicles and on routes with higher speed (χ² p value<0.0017). For single-vehicle crashes, the highest percentage of severe crashes involved vehicles with a larger engine size (ccV1≥1588 cm³) and those that were older (AgeV1≥4.5) (χ² p value<0.056).
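The prior-probability correction mentioned above can be illustrated with a short sketch. The abstract does not state the exact formula used, so the code below assumes the standard adjustment that reweights each class by the ratio of its original-population prior to its prior in the oversampled training set. The function name `correct_prior` and the numeric values are hypothetical and are not taken from the study.

```python
def correct_prior(p_over, sample_prior, true_prior):
    """Map P(severe) estimated on oversampled data back to the original population.

    p_over       -- probability of the severe class predicted by a model
                    trained on the oversampled data
    sample_prior -- share of severe crashes in the oversampled training set
    true_prior   -- share of severe crashes in the original crash population
    (Assumed standard reweighting formula; not confirmed by the source.)
    """
    if not 0.0 < sample_prior < 1.0 or not 0.0 < true_prior < 1.0:
        raise ValueError("priors must lie strictly between 0 and 1")
    # Weight each class by (original prior) / (oversampled prior).
    w_severe = true_prior / sample_prior
    w_other = (1.0 - true_prior) / (1.0 - sample_prior)
    numerator = p_over * w_severe
    return numerator / (numerator + (1.0 - p_over) * w_other)


# Hypothetical illustration: classes balanced to 50/50 by oversampling,
# while severe crashes are assumed to make up 2% of the original population.
corrected = correct_prior(0.5, sample_prior=0.5, true_prior=0.02)
print(round(corrected, 4))
```

Under these assumed values, a leaf that looks evenly split in the oversampled data corresponds to a severe-crash probability equal to the original prior, which is the behaviour the correction is meant to restore.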