Enhancing Car-Following Performance in Traffic Oscillations Using Expert Demonstration Reinforcement Learning

Deep reinforcement learning (DRL) algorithms often struggle to achieve stable and efficient training because of high policy-gradient variance and inaccurate reward-function estimation in complex scenarios. This study addresses these issues in the context of multi-objective car-following control tasks with time lag in traffic oscillations. The authors propose an expert demonstration reinforcement learning (EDRL) approach that aims to stabilize training, accelerate learning, and enhance car-following performance. The key idea is to leverage expert demonstrations, which encode superior car-following control experience, to improve the DRL policy. The method proceeds in two sequential steps. In the first step, expert demonstrations are obtained during offline pretraining from prior traffic knowledge, including car-following trajectories drawn from an empirical database and classic car-following models. In the second step, expert demonstrations are generated during online training, where the agent interacts with the car-following environment. The EDRL agents are trained through supervised regression on the expert demonstrations using behavioral cloning. Experiments conducted in various traffic oscillation scenarios demonstrate that the proposed method significantly improves training stability, learning speed, and rewards compared with baseline algorithms.
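As a rough illustration of the offline pretraining step described above, the Python sketch below pretrains a small policy network by behavioral cloning (supervised regression) on synthetic expert demonstrations. Here the Intelligent Driver Model (IDM) stands in for the "classic car-following models" the abstract mentions; the state features, parameter values, and network sizes are illustrative assumptions, not details taken from the paper.

# Minimal behavioral-cloning sketch, assuming IDM as the expert policy.
# All parameters and architecture choices below are illustrative.
import numpy as np
import torch
import torch.nn as nn

def idm_acceleration(gap, v_ego, v_lead,
                     v0=30.0, T=1.5, a_max=1.5, b=2.0, s0=2.0, delta=4.0):
    """Expert action: IDM acceleration for a (gap, ego speed, lead speed) state."""
    dv = v_ego - v_lead  # closing speed
    s_star = s0 + v_ego * T + v_ego * dv / (2.0 * np.sqrt(a_max * b))
    return a_max * (1.0 - (v_ego / v0) ** delta
                    - (s_star / np.maximum(gap, 0.1)) ** 2)

# Synthetic expert demonstrations: states are (gap, ego speed, relative speed),
# actions are the IDM accelerations for those states.
rng = np.random.default_rng(0)
gap = rng.uniform(5.0, 60.0, 10_000)
v_ego = rng.uniform(0.0, 30.0, 10_000)
v_lead = rng.uniform(0.0, 30.0, 10_000)
states = np.stack([gap, v_ego, v_ego - v_lead], axis=1).astype(np.float32)
actions = idm_acceleration(gap, v_ego, v_lead).astype(np.float32)[:, None]

# Behavioral cloning: regress demonstrated accelerations from demonstrated
# states with a mean-squared-error loss.
policy = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                       nn.Linear(64, 64), nn.ReLU(),
                       nn.Linear(64, 1))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
X, y = torch.from_numpy(states), torch.from_numpy(actions)

for epoch in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(policy(X), y)
    loss.backward()
    opt.step()

In the full EDRL pipeline, a policy pretrained this way would initialize the DRL agent before the online-training step, in which further demonstrations are collected as the agent interacts with the car-following environment.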

Language

  • English

Filing Info

  • Accession Number: 01936084
  • Record Type: Publication
  • Files: TRIS
  • Created Date: Nov 7 2024 9:21AM