A reinforcement learning framework for the adaptive routing problem in stochastic time-dependent network

Most previous work on the adaptive routing problem in stochastic and time-dependent (STD) networks has focused on developing parametric models that reflect the network dynamics and on designing efficient algorithms to solve these models. However, these models require strong assumptions, and some of the algorithms suffer from the curse of dimensionality. In this paper, the authors examine the application of Reinforcement Learning as a non-parametric, model-free method for solving the problem. Both the online Q-learning method for discrete state spaces and the offline fitted Q iteration algorithm for continuous state spaces are discussed. With a small case study on a mid-sized network, the authors demonstrate significant advantages of using Reinforcement Learning to solve for the optimal routing policy over the traditional stochastic dynamic programming method. The fitted Q iteration algorithm combined with tree-based function approximation is shown to outperform the other methods, especially during peak demand periods.
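As an illustration of the online Q-learning idea mentioned in the abstract, the following is a minimal sketch of tabular Q-learning on a discrete (node, time slot) state space. It is not the authors' implementation: the four-node network, the `travel_time` function, the number of time slots, and the learning parameters are all assumptions made up for this example.

```python
import random

# Hypothetical toy network (not from the paper): node -> successor nodes.
GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
DEST = "D"
N_SLOTS = 4  # assumed number of discrete time slots


def travel_time(u, v, t, rng):
    """Assumed stochastic, time-dependent link travel time, in time slots."""
    if (u, v) == ("A", "B"):
        # Link A-B is slow and random in early slots, fast afterwards.
        return rng.choice([3, 4]) if t < 2 else 1
    if (u, v) == ("A", "C"):
        return 2
    return 1


def q_learning(episodes=5000, alpha=0.1, epsilon=0.2, seed=0):
    """Learn Q[(node, slot)][next_node], the expected remaining travel time."""
    rng = random.Random(seed)
    Q = {(u, t): {v: 0.0 for v in succ}
         for u, succ in GRAPH.items() for t in range(N_SLOTS)}
    for _ in range(episodes):
        node, t = "A", rng.randrange(N_SLOTS)
        while node != DEST:
            actions = Q[(node, t)]
            # Epsilon-greedy choice; costs are minimized, so greedy = argmin.
            if rng.random() < epsilon:
                nxt = rng.choice(list(actions))
            else:
                nxt = min(actions, key=actions.get)
            cost = travel_time(node, nxt, t, rng)
            t_next = min(t + cost, N_SLOTS - 1)  # clamp to the last slot
            future = 0.0 if nxt == DEST else min(Q[(nxt, t_next)].values())
            # Standard Q-learning update toward the sampled cost-to-go.
            actions[nxt] += alpha * (cost + future - actions[nxt])
            node, t = nxt, t_next
    return Q


def best_next(Q, node, t):
    """Greedy routing decision at a node for a given arrival time slot."""
    return min(Q[(node, t)], key=Q[(node, t)].get)
```

Under these assumed travel times, the learned policy is adaptive in the sense the abstract describes: calling `best_next(Q, "A", 0)` on the result of `q_learning()` picks the detour through C while link A-B is congested, and `best_next(Q, "A", 3)` picks B once it has cleared.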

Language

  • English

Filing Info

  • Accession Number: 01679730
  • Record Type: Publication
  • Files: TRIS
  • Created Date: Aug 30 2018 9:43AM