A Feature Learning Approach Based on XGBoost for Driving Assessment and Risk Prediction

This study designs a feature extraction and selection framework for assessing vehicle driving and predicting risk levels. The framework integrates learning-based feature selection, unsupervised risk rating, and resampling of imbalanced data. For each vehicle, about 1,300 driving behavior features are extracted from trajectory data, yielding in-depth, multi-view measures of behavior. To estimate each vehicle's risk potential, an unsupervised data labelling method is proposed: vehicles are clustered on extracted risk indicator features into groups labelled with graded risk levels. The safe group is then under-sampled to reduce the imbalance between the risky and safe classes. Next, XGBoost is used to link behavior features to the corresponding risk levels, and key features are identified through feature importance ranking and recursive elimination. The risk levels of vehicles are predicted from the selected key features. In a case study on NGSIM trajectory data, Fuzzy C-means clustering yields four risk levels, 64 key behavior features are identified, and behavior-based risk prediction reaches an overall accuracy of 89%. The findings indicate that the approach identifies important features for driving assessment effectively and reliably, and predicts risk levels accurately.
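The unsupervised labelling and under-sampling steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the single synthetic risk indicator (assumed to grow with risk), the three clusters, the evenly spaced initial centers, and the rule of shrinking the safe group to the size of the largest other group are all assumptions made for illustration. The study itself reports four risk levels on NGSIM data.

```python
import random
from collections import Counter


def memberships_for(points, centers, m):
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    rows = []
    for p in points:
        # Squared distances to each center, floored to avoid division by zero.
        d2 = [max(sum((a - b) ** 2 for a, b in zip(p, c)), 1e-12) for c in centers]
        # u_ik is proportional to d_ik^(-2/(m-1)), i.e. (d_ik^2)^(-1/(m-1)).
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Plain fuzzy C-means on a list of equal-length numeric tuples."""
    dims = range(len(points[0]))
    lo = [min(p[j] for p in points) for j in dims]
    hi = [max(p[j] for p in points) for j in dims]
    # Assumption: deterministic start, centers spaced evenly from min to max.
    centers = [[lo[j] + (hi[j] - lo[j]) * k / (n_clusters - 1) for j in dims]
               for k in range(n_clusters)]
    for _ in range(n_iter):
        u = memberships_for(points, centers, m)
        for k in range(n_clusters):
            # Center update: mean of the points weighted by u_ik^m.
            w = [row[k] ** m for row in u]
            w_sum = sum(w)
            centers[k] = [sum(wi * p[j] for wi, p in zip(w, points)) / w_sum
                          for j in dims]
    return centers, memberships_for(points, centers, m)


def risk_levels(centers, u):
    """Hard-assign each point, then grade clusters by their first coordinate."""
    # Assumption: a larger indicator value means higher risk; level 0 is safest.
    order = sorted(range(len(centers)), key=lambda k: centers[k][0])
    level_of = {k: level for level, k in enumerate(order)}
    return [level_of[max(range(len(row)), key=row.__getitem__)] for row in u]


def undersample_safe(levels, safe_level=0, ratio=1.0, seed=0):
    """Randomly keep at most ratio * (largest other class) safe samples."""
    rng = random.Random(seed)
    safe = [i for i, lv in enumerate(levels) if lv == safe_level]
    other = [i for i, lv in enumerate(levels) if lv != safe_level]
    counts = Counter(levels[i] for i in other)
    target = int(ratio * max(counts.values())) if counts else len(safe)
    kept_safe = rng.sample(safe, min(len(safe), target))
    return sorted(kept_safe + other)


# Synthetic one-dimensional risk indicator: a large safe group and two small risky ones.
indicator = ([(0.05 + 0.002 * i,) for i in range(60)]
             + [(0.45 + 0.01 * i,) for i in range(10)]
             + [(0.85 + 0.02 * i,) for i in range(6)])

centers, u = fuzzy_c_means(indicator, n_clusters=3)
levels = risk_levels(centers, u)
kept = undersample_safe(levels, safe_level=0)
print(Counter(levels), Counter(levels[i] for i in kept))
```

The indices in `kept` would then select the rows used to train a classifier on the behavior features. That classifier step (XGBoost with importance ranking and recursive elimination in the study) is left out here to keep the sketch free of third-party dependencies.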

Language

  • English

Filing Info

  • Accession Number: 01711131
  • Record Type: Publication
  • Files: TRIS
  • Created Date: Jun 4 2019 3:05PM