Improving Transportation Performance Measurement via Open "Big Data" Systems – Phase 1 Transit

Within public transportation, data-driven metrics are fundamental to a transit agency’s ability to plan and manage the resources in its network. Past efforts to analyze the performance of these systems have been hampered by a lack of centralized real-time data archives. Difficult data acquisition, inconsistent data formats, and limited transferability and generalization have prevented tools developed for one transit agency from being deployed at others. This project’s goal is to lay the foundation for a “big data” centralized repository supporting the dynamic, ongoing archival of real-time multimodal information to assist practitioners, researchers, and students in better understanding the transportation network across multiple regional geographic areas. The authors introduce a novel system for archiving, retrieving, and using real-time and scheduled public transit data, which can serve as a foundation for performance assessment, big-data analysis, and machine learning applications. By leveraging standardized data formats and new software technologies, these tasks can be performed across multiple agencies concurrently. Archiving data from multiple transit agencies simultaneously allows researchers to compare results across agencies without tailoring their methods to a specific dataset. Deployment metrics after eight months of data archiving, along with estimated long-term deployment costs, are presented for the seven initial test datasets. The system is made available as an open-source software project to encourage active collaboration with other practitioners and operations personnel. Further collaboration between researchers and analysts is supported via centralized infrastructure where researchers can retrieve archived data and corroborate techniques across several datasets. Doing so will directly contribute to measuring the efficiency and quality of public transportation.
In addition, the authors demonstrate how existing tools for on-time performance measurement and accessibility analysis can be modified to use archived real-time data from the system. Results from these analyses could be leveraged by transit agencies, researchers, and metropolitan planning organizations to better understand the difference between scheduled and actual transit service and how that difference affects riders and the public transportation system.
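As an illustration of what an on-time performance measurement over archived data might look like, the following minimal Python sketch compares scheduled and observed arrival times. It is not the report's tooling: the `StopEvent` record, its field names, the sample values, and the tolerance window (up to one minute early, up to five minutes late) are all assumptions chosen for the example, and agencies define "on time" differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class StopEvent:
    """Hypothetical record pairing a scheduled arrival with the observed one."""
    trip_id: str
    stop_id: str
    scheduled: datetime
    actual: datetime


def delay_seconds(event):
    """Observed minus scheduled arrival, in seconds (negative means early)."""
    return (event.actual - event.scheduled).total_seconds()


def on_time_share(events,
                  early_tolerance=timedelta(minutes=1),
                  late_tolerance=timedelta(minutes=5)):
    """Fraction of events inside the tolerance window; None if there are no events.

    The default window is an assumption made for this sketch only.
    """
    events = list(events)
    if not events:
        return None
    earliest = -early_tolerance.total_seconds()
    latest = late_tolerance.total_seconds()
    on_time = sum(1 for e in events if earliest <= delay_seconds(e) <= latest)
    return on_time / len(events)


# Made-up sample data: delays of 0 s, +120 s, +400 s, and -90 s.
_base = datetime(2019, 6, 3, 8, 0, 0)
SAMPLE_EVENTS = [
    StopEvent("trip-1", "stop-A", _base, _base),
    StopEvent("trip-1", "stop-B", _base, _base + timedelta(seconds=120)),
    StopEvent("trip-2", "stop-A", _base, _base + timedelta(seconds=400)),
    StopEvent("trip-2", "stop-B", _base, _base - timedelta(seconds=90)),
]

if __name__ == "__main__":
    print(on_time_share(SAMPLE_EVENTS))
```

In the sample, two of the four arrivals fall inside the window, so the computed share is 0.5. Because the calculation depends only on paired scheduled and observed times, the same function could in principle be applied unchanged to data from any agency archived in a common format.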


Language

  • English

Media Info

  • Media Type: Digital/other
  • Edition: Final Report
  • Features: Figures; References; Tables
  • Pagination: 37p

Filing Info

  • Accession Number: 01727573
  • Record Type: Publication
  • Contract Numbers: CTEDD 017-07
  • Created Date: Jan 9 2020 11:26AM