A novel framework for crash frequency prediction: Geographic support vector regression based on agent-based activity models in Greater Melbourne

Accid Anal Prev. 2024 Nov:207:107747. doi: 10.1016/j.aap.2024.107747. Epub 2024 Aug 19.

Abstract

Spatial analysis can often improve predictive performance in traffic crash studies by addressing the spatial dependence and heterogeneity inherent in crash data. This research introduces the Geographical Support Vector Regression (GSVR) framework, which incorporates generated distance matrices to assess spatial variations and evaluate the influence of a wide range of factors, including traffic, infrastructure, socio-demographic, travel demand, and land use factors, on the incidence of total and fatal-or-serious injury (FSI) crashes across Greater Melbourne's zones. Using data from the Melbourne Activity-Based Model (MABM), the study examines 50 indicators related to peak-hour traffic and various commuting modes, offering a detailed analysis of the multifaceted factors affecting road safety. Active transportation modes such as walking and cycling emerge as significant indicators, reflecting a disparity in safety that heightens the vulnerability of these road users. In contrast, car commuting, while a consistent factor in crash risk, has a comparatively lower impact, pointing to an inherent imbalance in the road environment. This could be interpreted as an unequal distribution of risk and safety measures among road users, in which infrastructure and policies may not address the needs and vulnerabilities of pedestrians and cyclists as adequately as those of car drivers. Public transportation generally offers safer travel, yet the associated risks near train stations and tram stops in city center areas cannot be overlooked. Tram stops profoundly affect total crashes in these areas, while intersection counts more significantly impact FSI crashes in the broader metropolitan area. The study also uncovers the contrasting roles of land use mix in influencing FSI versus total crashes.
The proposed framework presents an approach for dynamically extracting distance matrices of varying sizes tailored to the specific dataset, providing a new way to incorporate spatial effects into the development of machine learning models. Additionally, the framework extends a feature selection technique to enhance machine learning models that typically lack comprehensive feature selection capabilities.
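
As an illustration only, the distance-matrix idea can be sketched in a few lines of Python. The abstract does not specify how GSVR builds or uses its matrices, so everything here is an assumption: the zone centroids, the Gaussian distance-decay kernel, the bandwidth value, and the function names `distance_matrix` and `gaussian_weights` are all hypothetical.

```python
import math

def distance_matrix(centroids):
    """Pairwise Euclidean distances between zone centroids given as (x, y)."""
    n = len(centroids)
    return [[math.dist(centroids[i], centroids[j]) for j in range(n)]
            for i in range(n)]

def gaussian_weights(distances, bandwidth):
    """Distance-decay weights w = exp(-0.5 * (d / bandwidth)^2).

    The Gaussian kernel is an assumed choice, common in geographically
    weighted models; the paper may use a different scheme.
    """
    return [[math.exp(-0.5 * (d / bandwidth) ** 2) for d in row]
            for row in distances]

# Hypothetical zone centroids in kilometres (not data from the study).
centroids = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

D = distance_matrix(centroids)           # 3 x 3 symmetric matrix
W = gaussian_weights(D, bandwidth=5.0)   # row i: weights of all zones as seen from zone i

for row in W:
    print([round(w, 4) for w in row])
```

One plausible use, again an assumption rather than the authors' method, is to fit one local support vector regression per zone and pass row i of `W` as per-sample weights (for example through the `sample_weight` argument of scikit-learn's `SVR.fit`), so that nearby zones influence each local model more than distant ones.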

Keywords: Activity-based modeling; Crash frequency prediction; Feature selection; Geographical support vector regression; Machine learning; Variable importance.

MeSH terms

  • Accidents, Traffic* / prevention & control
  • Accidents, Traffic* / statistics & numerical data
  • Automobile Driving / statistics & numerical data
  • Bicycling* / injuries
  • Bicycling* / statistics & numerical data
  • Humans
  • Pedestrians / statistics & numerical data
  • Safety
  • Spatial Analysis
  • Support Vector Machine
  • Systems Analysis
  • Transportation / statistics & numerical data
  • Victoria / epidemiology
  • Walking* / injuries
  • Walking* / statistics & numerical data