Identifying Themes in Railroad Equipment Accidents Using Text Mining and Text Visualization

Developments in text mining now allow useful information to be extracted automatically from text. The Federal Railroad Administration (FRA) publishes a database of railroad equipment accidents. Each accident record contains numeric data describing the accident and a text description of it. This paper discusses how latent Dirichlet allocation (LDA), a text-mining algorithm, can be used to identify major recurring accident topics from the text in the FRA reports. Equipment accident reports from 2005 to 2015 were studied. The analysis identified railroad grade crossing accidents with large trucks, shoving accidents, and hump yard accidents as major topics in the accident reports. An alternative method of analyzing the text, text clustering, was also applied to the FRA data. Visualizations of the text also provide useful information about the major types of railroad accidents.
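
To illustrate the kind of topic extraction the abstract describes, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not the authors' implementation: the narratives are invented for illustration and are not FRA records, and the function names (`tokenize`, `fit_lda`, `top_words`), the stopword list, and the hyperparameters `alpha`, `beta`, and `n_topics` are all assumptions.

```python
import random
from collections import Counter

# Hypothetical accident narratives, invented for illustration (not FRA data).
NARRATIVES = [
    "truck struck by train at grade crossing",
    "train struck tractor trailer stopped on crossing",
    "shoving movement struck standing cars in yard",
    "crew shoving cars derailed at switch",
    "car rolled over hump and struck cars in bowl track",
    "hump yard retarder failed and cars collided",
]
STOPWORDS = {"at", "by", "in", "on", "and", "over"}  # assumed, minimal list


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def fit_lda(docs, n_topics=3, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-document and per-topic counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]        # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                       # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=4):
    """Return the n most frequent words currently assigned to each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


if __name__ == "__main__":
    docs = [tokenize(t) for t in NARRATIVES]
    doc_topic, topic_word = fit_lda(docs)
    for k, words in enumerate(top_words(topic_word)):
        print(f"topic {k}: {' '.join(words)}")
```

On a corpus this small the topics are unstable and depend on the seed. With thousands of real narratives, the most frequent words in each topic are what an analyst would read to label themes such as grade crossing or hump yard accidents.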


Language

  • English

Media Info

  • Media Type: Web
  • Features: References
  • Pagination: pp 531-537
  • Monograph Title: International Conference on Transportation and Development 2016: Projects and Practices for Prosperity

Filing Info

  • Accession Number: 01604031
  • Record Type: Publication
  • ISBN: 9780784479926
  • Files: TRIS, ASCE
  • Created Date: Jun 20 2016 3:03PM