Application of Machine Learning Techniques to Predict the Occurrence of Distraction-affected Crashes with Phone-Use Data
Distraction occurs when a driver’s attention is diverted from driving to a secondary task. The number of distraction-affected crashes has been increasing in recent years. Accurately predicting these crashes is critical for roadway agencies seeking to reduce distracted driving behaviors and the crashes that result from them. Phone-use data and machine learning techniques have recently become increasingly available to safety researchers and can potentially improve the prediction of distraction-affected crashes. This study therefore first examines whether phone-use events provide essential information for predicting distraction-affected crashes. The authors apply a machine learning technique, XGBoost, under two scenarios (with and without phone-use events) and compare the resulting performance with that of two conventional statistical models: a logistic regression model and a mixed-effects logistic regression model. The comparison demonstrates the superiority of XGBoost over logistic regression on a high-dimensional, unbalanced dataset. Further, this study implements SHAP (SHapley Additive exPlanations) to interpret the results and analyze the importance of individual features related to distraction-affected crashes, and tests its ability to improve prediction accuracy. The trained XGBoost model achieves a sensitivity of 91.59%, a specificity of 85.92%, and an accuracy of 88.72%. The XGBoost and SHAP results suggest that (1) phone-use information is an important factor associated with the occurrence of distraction-affected crashes; and (2) distraction-affected crashes are more likely to occur on roadway segments with higher exposure (i.e., length and traffic volume), uneven traffic flow conditions, or medium truck volume.
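For readers less familiar with the three reported metrics, the following minimal Python sketch shows how sensitivity, specificity, and accuracy are derived from the four cells of a binary confusion matrix. The class name `ConfusionCounts`, the helper functions, and the counts themselves are hypothetical and chosen only for illustration; they are not the study's data and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # crash segments correctly predicted as crash
    fn: int  # crash segments predicted as non-crash
    tn: int  # non-crash segments correctly predicted as non-crash
    fp: int  # non-crash segments predicted as crash


def sensitivity(c: ConfusionCounts) -> float:
    """Share of actual positives that the model identifies (true positive rate)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of actual negatives that the model identifies (true negative rate)."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all observations that the model classifies correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Illustrative counts only (an assumption, not taken from the study).
counts = ConfusionCounts(tp=90, fn=10, tn=170, fp=30)

print(f"sensitivity = {sensitivity(counts):.4f}")
print(f"specificity = {specificity(counts):.4f}")
print(f"accuracy    = {accuracy(counts):.4f}")
```

On an unbalanced dataset, accuracy alone can look high even when the minority class is poorly detected, which is one reason sensitivity and specificity are usually reported alongside it.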
Availability:
- Find a library where document is available. Order URL: http://worldcat.org/issn/03611981
Supplemental Notes:
- Chaolun Ma https://orcid.org/0000-0002-8773-2831 © National Academy of Sciences: Transportation Research Board 2021.
Authors:
- Ma, Chaolun (ORCID: 0000-0002-8773-2831)
- Peng, Yongxin (ORCID: 0000-0002-9212-5366)
- Wu, Lingtao (ORCID: 0000-0003-2337-7145)
- Guo, Xiaoyu (ORCID: 0000-0002-0401-5723)
- Wang, Xiubin
- Kong, Xiaoqiang (ORCID: 0000-0002-8120-0754)
- Publication Date: 2022-2
Language
- English
Media Info
- Media Type: Web
- Features: References
- Pagination: pp 692-705
Serial:
- Transportation Research Record: Journal of the Transportation Research Board
- Volume: 2676
- Issue Number: 2
- Publisher: Sage Publications, Incorporated
- ISSN: 0361-1981
- EISSN: 2169-4052
- Serial URL: http://journals.sagepub.com/home/trr
Subject/Index Terms
- TRT Terms: Cellular telephones; Distraction; Logistic regression analysis; Machine learning; Traffic crashes
- Identifier Terms: eXtreme Gradient Boosting (XGB) algorithm
- Subject Areas: Highways; Safety and Human Factors
Filing Info
- Accession Number: 01784951
- Record Type: Publication
- Files: TRIS, TRB, ATRI
- Created Date: Oct 18 2021 11:23PM