Safe Model-Based Off-Policy Reinforcement Learning for Eco-Driving in Connected and Automated Hybrid Electric Vehicles

Deep Reinforcement Learning (DRL) has recently been applied to eco-driving to intelligently reduce fuel consumption and travel time. While previous studies combine simulators with model-free DRL (MFDRL), this work proposes a Safe Off-policy Model-Based Reinforcement Learning (SMORL) algorithm for eco-driving. SMORL integrates three key components: a computationally efficient model-based trajectory optimizer, a value function learned off-policy, and a learned safe set. The advantages over the existing literature are threefold. First, the combination of off-policy learning and a physics-based model improves sample efficiency. Second, training does not require any extrinsic reward mechanism for constraint satisfaction. Third, the feasibility of the planned trajectory is guaranteed by a safe set approximated with deep generative models. The performance of SMORL is benchmarked over 100 trips against a baseline controller representing human drivers, a non-learning-based optimal controller, a previously designed MFDRL strategy, and the wait-and-see optimal solution. In simulation, SMORL reduces fuel consumption by more than 21% while maintaining a comparable average speed relative to the baseline controller, and it achieves better fuel economy at a higher average speed than both the MFDRL agent and the non-learning-based optimal controller.
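To make the control loop described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation: a short-horizon, sampling-based trajectory optimizer whose terminal cost is a learned value function, with candidate plans filtered through a safe set before selection. The toy longitudinal vehicle model, cost weights, random-shooting optimizer, and the placeholder stand-ins for the learned value function and safe set are all assumptions made here for readability.

```python
# Illustrative sketch of a model-based planner with a learned terminal value
# and a safe-set filter (placeholder components; not the paper's code).
import numpy as np

DT, HORIZON, N_CANDIDATES = 1.0, 10, 256   # step [s], planning steps, sampled plans

def rollout(v0, accels):
    """Propagate a trivial longitudinal model: speed under an acceleration sequence."""
    speeds = [v0]
    for a in accels:
        speeds.append(max(0.0, speeds[-1] + a * DT))
    return np.array(speeds)

def stage_cost(speeds, accels):
    """Surrogate for fuel use plus a travel-time penalty (placeholder coefficients)."""
    fuel = np.sum(np.maximum(accels, 0.0) * speeds[:-1]) * 1e-2
    time_penalty = np.sum(1.0 / (speeds[1:] + 1.0)) * 1e-1
    return fuel + time_penalty

def terminal_value(v_end):
    """Stand-in for the value function learned off-policy, evaluated at the horizon."""
    return 1e-2 * v_end  # favors carrying speed; a trained network would replace this

def in_safe_set(speeds, v_limit):
    """Stand-in for the learned safe set (e.g., a generative-model feasibility check)."""
    return np.all(speeds <= v_limit) and np.all(speeds >= 0.0)

def plan(v0, v_limit, rng):
    """Random-shooting optimizer: sample plans, keep feasible ones, pick the cheapest."""
    best_cost, best_plan = np.inf, None
    for _ in range(N_CANDIDATES):
        accels = rng.uniform(-2.0, 2.0, size=HORIZON)
        speeds = rollout(v0, accels)
        if not in_safe_set(speeds, v_limit):
            continue  # constraints enforced by the safe set, not by a reward penalty
        cost = stage_cost(speeds, accels) - terminal_value(speeds[-1])
        if cost < best_cost:
            best_cost, best_plan = cost, accels
    return best_plan  # only the first action is applied, receding-horizon style

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    plan_out = plan(v0=12.0, v_limit=15.0, rng=rng)
    print("first commanded acceleration:",
          None if plan_out is None else plan_out[0])
```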

Language

  • English

Filing Info

  • Accession Number: 01856158
  • Record Type: Publication
  • Files: TRIS
  • Created Date: Aug 26 2022 2:55PM