A Comparison of Statistical and Machine Learning Algorithms for Predicting Rents in the San Francisco Bay Area

Urban transportation and land use models have used theory and statistical modeling methods to develop model systems that are useful in planning applications. Machine learning methods have been considered too ‘black box’, lacking interpretability, and their use has been limited within the land use and transportation modeling literature. The authors present a use case in which predictive accuracy is of primary importance, and compare the use of random forest regression to multiple regression using ordinary least squares, to predict rents per square foot in the San Francisco Bay Area using a large volume of rental listings scraped from the Craigslist website. The authors find that they are able to obtain useful predictions from both models using almost exclusively local accessibility variables, though the predictive accuracy of the random forest model is substantially higher.
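
The kind of comparison the abstract describes can be sketched as follows: fit an ordinary least squares model and a random forest on the same training listings, then compare their errors on held-out listings. This is only an illustration under stated assumptions, not the authors' code. The data are synthetic, the three accessibility indices (jobs, transit, parks) and every function name are hypothetical, and the forest is a deliberately small pure-Python stand-in for a library implementation.

```python
import math
import random

def make_listings(n, rng):
    """Synthetic listings (NOT the Craigslist data): three hypothetical
    accessibility indices in [0, 1] and a made-up rent per square foot."""
    rows = []
    for _ in range(n):
        jobs, transit, parks = rng.random(), rng.random(), rng.random()
        premium = 1.0 if (jobs > 0.5 and transit > 0.6) else 0.0  # assumed interaction
        rent = 2.0 + 1.5 * jobs ** 2 + premium + 0.2 * parks + rng.gauss(0, 0.1)
        rows.append(([jobs, transit, parks], rent))
    return rows

def fit_ols(X, y):
    """Ordinary least squares: solve the normal equations (X'X) b = X'y."""
    A = [[1.0] + list(x) for x in X]  # prepend an intercept column
    p = len(A[0])
    M = [[sum(a[i] * a[j] for a in A) for j in range(p)]
         + [sum(a[i] * t for a, t in zip(A, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]

def predict_ols(beta, x):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def sse(vals):
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals)

def build_tree(X, y, rng, depth=0, max_depth=6, min_leaf=5, mtry=2):
    """Regression tree; each split considers a random subset of mtry features."""
    leaf = sum(y) / len(y)
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return leaf
    best = None
    for f in rng.sample(range(len(X[0])), mtry):
        values = sorted(x[f] for x in X)
        for q in range(1, 10):  # candidate thresholds at roughly the deciles
            thr = values[q * len(values) // 10]
            left = [t for x, t in zip(X, y) if x[f] < thr]
            right = [t for x, t in zip(X, y) if x[f] >= thr]
            if len(left) >= min_leaf and len(right) >= min_leaf:
                score = sse(left) + sse(right)
                if best is None or score < best[0]:
                    best = (score, f, thr)
    if best is None:
        return leaf
    _, f, thr = best
    def grow(keep):
        rows = [(x, t) for x, t in zip(X, y) if keep(x)]
        return build_tree([x for x, _ in rows], [t for _, t in rows],
                          rng, depth + 1, max_depth, min_leaf, mtry)
    return (f, thr, grow(lambda x: x[f] < thr), grow(lambda x: x[f] >= thr))

def predict_tree(node, x):
    while isinstance(node, tuple):  # internal nodes: (feature, threshold, left, right)
        f, thr, left, right = node
        node = left if x[f] < thr else right
    return node  # leaves are plain floats (mean rent in the leaf)

def fit_forest(X, y, rng, n_trees=30):
    """Bagging: each tree is grown on a bootstrap resample of the training set."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, x):
    return sum(predict_tree(t, x) for t in forest) / len(forest)

def rmse(pred, actual):
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_listings(400, rng)
train, test = data[:300], data[300:]
X_tr, y_tr = [x for x, _ in train], [t for _, t in train]
X_te, y_te = [x for x, _ in test], [t for _, t in test]

beta = fit_ols(X_tr, y_tr)
forest = fit_forest(X_tr, y_tr, rng)
rmse_ols = rmse([predict_ols(beta, x) for x in X_te], y_te)
rmse_rf = rmse([predict_forest(forest, x) for x in X_te], y_te)
print(f"holdout RMSE  OLS: {rmse_ols:.3f}  random forest: {rmse_rf:.3f}")
```

The synthetic rents include a threshold-style interaction between two indices, which is the sort of structure a tree ensemble can pick up and an additive linear specification cannot. Whether the forest actually wins, and by how much, depends on the data, so the printed errors illustrate the procedure rather than reproduce the paper's results.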

  • Supplemental Notes:
    • This paper was sponsored by TRB committee ADD30 Standing Committee on Transportation and Land Development.
  • Corporate Authors:
    • Transportation Research Board

  • Authors:
    • Waddell, Paul
    • Besharati-Zadeh, Arezoo
  • Conference:
  • Date: 2019
  • Language: English

Media Info

  • Media Type: Digital/other
  • Features: Figures; References; Tables;
  • Pagination: 15p

Filing Info

  • Accession Number: 01697824
  • Record Type: Publication
  • Report/Paper Numbers: 19-05881
  • Files: TRIS, TRB, ATRI
  • Created Date: Dec 7 2018 9:38AM