Evaluating the Predictive Power of an SPF for Two-Lane Rural Roads with Random Parameters on Out-of-Sample Observations

Negative binomial (NB) regression is among the most common statistical methods for modeling crash frequencies because of its simple functional form and its ability to handle the over-dispersion commonly found in crash data. However, a drawback of this approach is that regression parameters are assumed to be the same across observations, which could contribute to biased parameter estimates. To alleviate this concern, the random parameters negative binomial (RPNB) model was recently proposed, which allows regression parameters to differ across observations following some known distribution. The resulting coefficients should be less biased, and thus the RPNB approach is believed to provide a more accurate relationship between independent variables and expected crash frequency. However, the prediction accuracy of the RPNB model relative to the standard NB model has not been thoroughly evaluated, particularly with respect to out-of-sample observations for which unique random parameters cannot be estimated. In this paper, the predictive power of the RPNB and NB models is examined using two-lane rural highway data from three engineering Districts in Pennsylvania. Multiple evaluation metrics are applied to assess each model type: root-mean-square error (RMSE), mean absolute error (MAE), coefficients from calibration functions, and cumulative residual (CURE) plots. The results show that the RPNB model outperforms the NB model when applied to within-sample observations (i.e., those used to estimate the model) by making use of the observation-specific coefficients. However, the predictive power of the RPNB model appears to be similar to, or slightly lower than, that of the traditional NB model when applied to out-of-sample observations. Since the RPNB model is estimated using a simulation-based approach, sensitivity tests were also performed to see how the parameter estimates change with the number of Halton draws used to perform the simulation. For the sample sizes used in this paper, the estimates were fairly insensitive to the number of draws when more than 50 Halton draws were used. The findings suggest that the RPNB model is more reliable when applied to the same set of sites that were used to estimate the model but might not be as robust as the traditional NB model when applied to other sites.
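
As a rough illustration of the quantities named in the abstract, the following Python sketch computes RMSE, MAE, and the cumulative residuals that underlie a CURE plot, and generates elements of a Halton sequence. It is not the authors' code: the function names, the choice of covariate for ordering sites, and the example numbers in the usage note are all assumptions made for illustration.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted crash counts."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between observed and predicted crash counts."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def cumulative_residuals(observed, predicted, covariate):
    """Running sum of residuals (observed - predicted), with sites sorted by
    a covariate such as traffic volume. Plotting the second element of each
    pair against the first gives the curve of a CURE plot."""
    order = sorted(range(len(covariate)), key=lambda i: covariate[i])
    total = 0.0
    curve = []
    for i in order:
        total += observed[i] - predicted[i]
        curve.append((covariate[i], total))
    return curve


def halton(index, base):
    """Return the index-th element (1-based) of the Halton sequence for a
    given prime base. Such quasi-random points in (0, 1) are commonly used
    as draws in simulation-based estimation."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result
```

With made-up inputs, `rmse([1, 2, 3], [1, 2, 5])` and `mae([1, 2, 3], [1, 2, 5])` summarize prediction error for three hypothetical sites, and `[halton(i, 2) for i in range(1, 4)]` yields the first three base-2 Halton points. In a simulation-based estimator, these uniform points would typically be transformed to the assumed parameter distribution before use.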


  • Language: English

Filing Info

  • Accession Number: 01717000
  • Record Type: Publication
  • Files: TRIS
  • Created Date: Aug 28 2019 3:04PM