Because traffic flow data and conflict data are difficult to obtain simultaneously, conflict-based analysis using macroscopic traffic features has received comparatively little study. This research analyzes real-time safety through a disaggregate study and explores the benefit of connected vehicles (CVs) for real-time safety evaluation. To avoid the endogeneity problem between conflicts and traffic features in regression models, machine learning is employed to obtain a reliable and practical real-time safety model. The results show that Random Forest outperforms the eXtreme Gradient Boosting, Support Vector Machine, and Adaptive Boosting models, achieving the highest AUC of 0.827. To gain a deeper understanding of conflict mechanisms, the explainable machine learning method SHAP (SHapley Additive exPlanations) is introduced to improve model interpretability and provide insights into the impacts of traffic flow features. The difference in average speed between lanes is found to have the greatest impact on real-time safety. Speed variation, the proportion of trucks, and traffic volume are also associated with conflict occurrence. Further analysis highlights that the impacts of traffic features are heterogeneous and that specific patterns of paired features may affect real-time safety. Encouragingly, SHAP appears able to complement traditional models with random components in revealing heterogeneity. Explainable machine learning can also provide a solid basis for discretizing continuous variables, whereas previous studies have performed discretization mainly on the basis of prior knowledge and experience. The experiment on CV Market Penetration Rate (CV-MPR) demonstrates that model performance improves gradually as the penetration rate increases. The initial stage of the CV market (20%, 40% CV-MPR) yields the most significant gains in real-time safety evaluation. These findings can be used to support active traffic management.
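To illustrate how Shapley-value attribution can expose both per-feature impacts and the effect of paired features, the following is a minimal sketch that computes exact Shapley values by enumerating feature coalitions. It is not the study's model or the SHAP library: the scoring function `toy_risk`, its coefficients, the feature names (`speed_diff`, `truck_share`, `volume`), and the zero baseline are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def toy_risk(features):
    """Hypothetical conflict-risk score; NOT the fitted model from the study."""
    s = features["speed_diff"]   # assumed: between-lane average speed difference
    t = features["truck_share"]  # assumed: proportion of trucks
    v = features["volume"]       # assumed: traffic volume
    # Additive terms plus one paired (interaction) term between s and t.
    return 0.04 * s + 0.5 * t + 0.001 * v + 0.1 * s * t


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `instance` relative to a single `baseline`."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Features in the coalition take the instance value; others the baseline.
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1 drawn from the other features
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                # Marginal contribution of `name` when added to this coalition.
                total += weight * (value(members | {name}) - value(members))
        phi[name] = total
    return phi


baseline = {"speed_diff": 0.0, "truck_share": 0.0, "volume": 0.0}
instance = {"speed_diff": 10.0, "truck_share": 0.2, "volume": 100.0}
phi = shapley_values(toy_risk, instance, baseline)

for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
```

In this toy setup the attributions sum to the difference between the score of the instance and the score of the baseline, and the interaction term is split equally between the two features involved. Because that split depends on the instance's own values, the same feature can receive different attributions in different observations, which is the kind of heterogeneity the abstract refers to. Practical SHAP implementations rely on background data and approximations rather than full enumeration, which grows exponentially with the number of features.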
Keywords: Connected vehicles; Machine learning; Market penetration rate; Real-time safety; SHAP.
Copyright © 2022 Elsevier Ltd. All rights reserved.