A Hybrid of Deep Reinforcement Learning and Local Search for the Vehicle Routing Problems

Different variants of the Vehicle Routing Problem (VRP) have been studied for decades. State-of-the-art methods for VRPs are based on local search, but they still suffer from slow running times and poor solution quality when the problem size is large. To overcome these problems, the authors first propose a novel deep reinforcement learning (DRL) model composed of an actor, an adaptive critic, and a routing simulator. The actor, based on the attention mechanism, is designed to generate routing strategies. The adaptive critic changes its network structure adaptively to accelerate convergence and improve solution quality during training. The routing simulator provides graph information and rewards to the actor and the adaptive critic. The authors then combine this DRL model with a local search method to further improve solution quality: the output of the DRL model serves as the initial solution for the subsequent local search, which produces the final VRP solution. Tested on three datasets with 20, 50, and 100 customer points, respectively, experimental results demonstrate that the DRL model alone finds better solutions than construction algorithms and previous DRL approaches, while running 5 to 40 times faster. The authors also observe that combining the DRL model with various local search methods yields excellent solutions at a higher generation speed than that achieved with other initial solutions.
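The abstract describes a "construct, then improve" pipeline: a learned policy builds an initial tour, and a local search refines it. The sketch below is not the authors' code; it substitutes a greedy nearest-neighbour constructor for the trained attention actor, uses a plain 2-opt pass as the local search, and simplifies to a single-route (TSP-like) instance. All function names and parameters are illustrative assumptions.

```python
# Hypothetical sketch of the construct-then-improve pipeline (not the authors' code).
import numpy as np


def construct_route(coords: np.ndarray) -> list[int]:
    """Stand-in for the DRL actor: greedily build an initial visiting order."""
    n = len(coords)
    unvisited = set(range(1, n))
    route = [0]  # start at the depot (node 0)
    while unvisited:
        last = coords[route[-1]]
        nxt = min(unvisited, key=lambda j: np.linalg.norm(coords[j] - last))
        route.append(nxt)
        unvisited.remove(nxt)
    return route


def route_length(route: list[int], coords: np.ndarray) -> float:
    """Length of the closed tour (returning to the depot)."""
    tour = route + [route[0]]
    return sum(np.linalg.norm(coords[tour[i]] - coords[tour[i + 1]])
               for i in range(len(route)))


def two_opt(route: list[int], coords: np.ndarray) -> list[int]:
    """Local search: repeatedly reverse segments while the tour improves."""
    best, improved = route[:], True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for k in range(i + 1, len(best)):
                cand = best[:i] + best[i:k + 1][::-1] + best[k + 1:]
                if route_length(cand, coords) < route_length(best, coords):
                    best, improved = cand, True
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((20, 2))          # 20 customer points in the unit square
    init = construct_route(pts)        # initial solution (actor's role in the paper)
    final = two_opt(init, pts)         # refined by local search
    print(f"initial: {route_length(init, pts):.3f}  final: {route_length(final, pts):.3f}")
```

In the paper, the constructor is a trained attention-based policy rather than a greedy heuristic, and the local search is interchangeable; the point illustrated here is only that the learned model's output seeds the improvement phase.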

Language

  • English


Filing Info

  • Accession Number: 01787293
  • Record Type: Publication
  • Files: TRIS
  • Created Date: Nov 5 2021 11:54AM