A Data-Driven Method for Trip Ends Identification Using Large-Scale Smartphone-Based GPS Tracking Data

A data-driven machine learning method is proposed to identify trip ends from tracking data collected through a smartphone and Internet survey. In previous literature, trip ends are usually identified using predefined rules, which have been confirmed to be valid. Nonetheless, these rule-based methods depend largely on researchers’ own knowledge, which is inevitably subjective and arbitrary. Moreover, they are not effective enough to process the huge volumes of data available in the era of big data. This paper targets millions of smartphone-based GPS tracking points. A group of attributes, such as travel speed, distance, and heading, is derived to characterize the smartphone holders’ travel status. Each tracking point can then be classified as traveling or non-traveling, and the trip ends are detected from these states. In contrast to rule-based methods, a random forest is used as the classification model, with no subjective rules predefined for classification; the model is built automatically from the data. The results show that, after the random forest is trained on 1393 days of GPS tracking data and the prompted recall (PR) survey data, the accuracy of trip-end identification on 697 days of tracking data is 96.17%. Because the analysis does not rely on personal experience, it is expected to be useful for smartphone-based survey data in the era of big data.
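
The two data-handling steps named in the abstract, deriving per-point attributes and reading trip ends off a sequence of traveling/non-traveling states, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(t_seconds, lat, lon)` input layout, the haversine distance, and the 0/1 label encoding are all assumptions made for illustration. The classifier itself is left out; the state labels are taken as given, as if produced by a model such as the random forest described above.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres (assumed constant)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial heading from point 1 to point 2, in degrees clockwise from north."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0

def point_features(points):
    """Derive distance, speed, and heading for each consecutive pair of points.

    points: list of (t_seconds, lat, lon) tuples sorted by time (hypothetical layout).
    """
    feats = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(points, points[1:]):
        dist = haversine_m(la0, lo0, la1, lo1)
        dt = t1 - t0
        feats.append({
            "distance_m": dist,
            "speed_mps": dist / dt if dt > 0 else 0.0,  # guard against repeated timestamps
            "heading_deg": bearing_deg(la0, lo0, la1, lo1),
        })
    return feats

def trip_ends(labels):
    """Locate trip ends as transitions in a per-point state sequence.

    labels: 1 = traveling, 0 = non-traveling (assumed encoding).
    Returns (index, kind) pairs, where kind is "start" or "end".
    """
    ends = []
    for i in range(1, len(labels)):
        if labels[i - 1] == 0 and labels[i] == 1:
            ends.append((i, "start"))
        elif labels[i - 1] == 1 and labels[i] == 0:
            ends.append((i, "end"))
    return ends
```

In this sketch, the feature dictionaries would be the classifier inputs, and `trip_ends` would be applied to the predicted states in time order.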

Language

  • English

Filing Info

  • Accession Number: 01644791
  • Record Type: Publication
  • Files: TLIB, TRIS
  • Created Date: Aug 3 2017 11:59AM