This study combined crash reports, land use, real-time traffic, and weather data into an integrated database to analyze the severity of crashes on rural highways. Because traffic cameras are placed at fixed locations, the distance between a crash and the nearest camera used to extract traffic variables varies widely, which may change the significance of those variables. For the first time, spacing was introduced as the distance around a detector within which its traffic characteristics are attributed to crashes. Classification and Regression Tree (CART) was employed as an interpretable tool to explore how spacing affects model performance and the significance of traffic variables. Twelve spacing scenarios from 250 to 3000 m were evaluated. Except at short spacings, which suffer from small sample sizes, the models show good predictive performance in terms of overall accuracy and F2 score across the 1000-3000 m spacings. In this range, three dominant rules emerged: (1) high speed deviation on roads surrounded by wastelands is associated with severe crashes; (2) faded markings in residential zones increase the likelihood of severe outcomes; (3) the installation of barriers decreases the probability of severe crashes. Comparing the Variable Importance Measure (VIM) across scenarios reveals that the total importance of traffic variables decreases as spacing increases. The results also indicate that average speed is significant up to 1750 m, whereas speed deviation, traffic flow, and the percentage of heavy vehicles remain more stable at larger spacings. In conclusion, spacing scenarios were evaluated systematically for the first time and were shown to have a remarkable impact on the significance of variables. This novel research provides guidance not only on spacing but also on which real-time traffic variables have a greater impact on crash severity, alongside design, land use, and environmental variables.
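The spacing concept and the F2 score can be illustrated with a minimal sketch. It is not the study's implementation. It assumes crashes and cameras are located by a one-dimensional position along the road in metres, and that the twelve scenarios are evenly spaced at 250 m steps, which the text implies but does not state. The names `Crash`, `crashes_within_spacing`, and `f_beta` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Crash:
    position_m: float  # assumed: location along the road, in metres
    severe: bool       # True for a severe crash


def nearest_camera_distance(position_m, camera_positions_m):
    """Distance from a crash to the closest fixed camera."""
    return min(abs(position_m - c) for c in camera_positions_m)


def crashes_within_spacing(crashes, camera_positions_m, spacing_m):
    """Keep only crashes whose nearest camera lies within the spacing."""
    return [
        c for c in crashes
        if nearest_camera_distance(c.position_m, camera_positions_m) <= spacing_m
    ]


def f_beta(y_true, y_pred, beta=2.0):
    """F-beta score; beta=2 weights recall more heavily than precision."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return (1 + beta**2) * tp / denom if denom else 0.0


# Assumed scenario grid: 250, 500, ..., 3000 m (twelve values).
SPACINGS_M = list(range(250, 3001, 250))

# Toy data for illustration only, not taken from the study.
cameras = [0.0, 5000.0]
crashes = [Crash(200, True), Crash(1200, False),
           Crash(2500, True), Crash(4900, False)]

# Sample size per scenario: larger spacings retain more crashes.
sample_sizes = {s: len(crashes_within_spacing(crashes, cameras, s))
                for s in SPACINGS_M}
```

In this sketch the sample available for modelling grows with the spacing, which mirrors the small-sample problem reported at short spacings. A tree model would then be fitted per scenario and scored with `f_beta`.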
© 2022. The Author(s).