Multilevel Dirichlet Process Mixture Analysis of Railway Grade Crossing Crash Data

This article introduces a flexible Bayesian semiparametric approach to analyzing crash data that are hierarchical or multilevel in nature. The authors extend the traditional varying intercept (random effects) multilevel model by relaxing its standard parametric distributional assumption. While accounting for unobserved cross-group heterogeneity in the data through the intercept, the proposed method identifies latent subpopulations (and consequently outliers) in the data based on a Dirichlet process mixture. It also estimates the number of latent subpopulations using an elegant mathematical structure, instead of prespecifying this number arbitrarily as in conventional latent class or finite mixture models. The authors evaluate their method on two recent railway grade crossing crash datasets from Canada, at the province and municipality levels, for the years 2008–2013. They use cross-validation predictive densities and the pseudo-Bayes factor for Bayesian model selection. While confirming the need for a multilevel modeling approach for both datasets, the results reveal the inadequacy of the standard parametric assumption in the varying intercept model for the municipality-level dataset. In fact, the authors' proposed method is shown to improve model fit significantly for the latter dataset. In a fully probabilistic framework, the authors also identify the expected number of latent clusters that share similar unidentified features among Canadian provinces and municipalities. It is thus possible to further investigate the reasons for such similarities and dissimilarities, which can have important policy implications for various safety management programs.
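
To illustrate how a Dirichlet process lets the number of latent subpopulations be random rather than fixed in advance, the sketch below simulates the Chinese restaurant process, a standard representation of the Dirichlet process prior over partitions. It is not the authors' model or inference procedure: the function names, the concentration parameter alpha, and the group count used in the usage note are all assumptions made for illustration, and the code draws from the prior only, with no crash data or likelihood involved.

```python
import random


def crp_partition(n_groups, alpha, rng):
    """Draw one partition of n_groups items from a Chinese restaurant process.

    alpha is the (hypothetical) concentration parameter: larger values
    favor more clusters. Returns a list of cluster labels, one per group.
    """
    cluster_sizes = []
    labels = []
    for _ in range(n_groups):
        # An existing cluster is chosen in proportion to its size;
        # a new cluster is opened in proportion to alpha.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)
        else:
            cluster_sizes[k] += 1
        labels.append(k)
    return labels


def expected_clusters(n_groups, alpha):
    """Prior mean number of clusters: sum over i of alpha / (alpha + i)."""
    return sum(alpha / (alpha + i) for i in range(n_groups))
```

For example, crp_partition(10, 1.0, random.Random(0)) returns one random grouping of ten hypothetical groups, and expected_clusters(10, 1.0) gives the prior mean number of clusters for that setting. In a full Dirichlet process mixture analysis, the posterior over partitions would also depend on the observed data, which this sketch omits.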

Language

  • English

Filing Info

  • Accession Number: 01595580
  • Record Type: Publication
  • Files: TRIS
  • Created Date: Mar 18 2016 4:06PM