Understanding Ridesplitting Behavior with Interpretable Machine Learning Models: Comparing Trip-level and Community-level Characteristics using Chicago’s Ridesourcing Trips

As congestion levels increase in cities, it is important to analyze how people choose among the services provided by Transportation Network Companies (TNCs). Using machine learning techniques in conjunction with large-scale TNC data, this paper focuses specifically on uncovering the complex relationships underlying ridesplitting market share. A real-world dataset provided by TNCs in Chicago is used to analyze ridesourcing trips from November 2018 to December 2019 and to understand trends in the city. Aggregated origin-destination trip-level characteristics, such as mean cost, mean time, and travel time reliability, are extracted and combined with origin-destination community-level characteristics. Three tree-based algorithms are then used to model the market share of ridesplitting trips. The most significant factors, along with their marginal effects on ridesplitting behavior, are extracted using partial dependence plots. The results suggest that, overall, community-level factors are as important as, or more important than, trip-level characteristics. Additionally, the percentage of White people, the percentage of bachelor’s degree holders, and the percentage of two-person households strongly affect ridesplitting market share. Finally, the potential roles of taxes, crime, cultural differences, and comfort in driving market share are discussed, and suggestions are presented for future research and data collection.


Language

  • English

Media Info

  • Media Type: Digital/other
  • Features: Figures; Maps; References; Tables
  • Pagination: 21p

Filing Info

  • Accession Number: 01763757
  • Record Type: Publication
  • Report/Paper Numbers: TRBAM-21-02248
  • Files: TRIS, TRB, ATRI
  • Created Date: Dec 23 2020 11:10AM