Motion control of unmanned underwater vehicles via deep imitation reinforcement learning algorithm

This study proposes a motion control algorithm for unmanned underwater vehicles (UUVs) based on deep imitation reinforcement learning. The algorithm, called IL-TD3, combines imitation learning (IL) with the twin delayed deep deterministic policy gradient (TD3) algorithm. To accelerate reinforcement learning training, IL uses supervised learning to perform behaviour cloning from closed-loop control data. The deep reinforcement learning component employs an actor-critic architecture: the actor executes the control strategy and the critic evaluates the current control strategy. The training efficiency of IL-TD3 is compared with that of deep deterministic policy gradient (DDPG) and TD3. The simulation results show that IL-TD3 converges faster and trains more stably than both, with a convergence rate during training about double that of DDPG and TD3. In UUV motion control tasks, the control performance of IL-TD3 is superior to that of PID control: its average tracking error is 70% lower. The average tracking error under thruster fault is almost the same as under normal conditions.
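
The behaviour cloning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical PD law (`pd_controller`, with made-up gains) as the source of closed-loop control data, and a linear actor fitted by gradient descent on the mean squared error, whereas the study itself uses a deep network. The fitted weights would then serve as the starting point for the actor before reinforcement learning begins.

```python
import random


def pd_controller(error, error_rate, kp=2.0, kd=0.5):
    """Hypothetical expert: a PD law standing in for the closed-loop controller."""
    return kp * error + kd * error_rate


def collect_demonstrations(n_samples=500, seed=0):
    """Sample (state, action) pairs from the expert controller."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        error = rng.uniform(-1.0, 1.0)
        error_rate = rng.uniform(-1.0, 1.0)
        data.append(((error, error_rate), pd_controller(error, error_rate)))
    return data


def behaviour_clone(data, lr=0.5, epochs=100):
    """Fit a linear actor a = w[0]*error + w[1]*error_rate by supervised learning."""
    w = [0.0, 0.0]
    n = len(data)
    for _ in range(epochs):
        grad = [0.0, 0.0]
        for (error, error_rate), action in data:
            residual = w[0] * error + w[1] * error_rate - action
            grad[0] += 2.0 * residual * error / n
            grad[1] += 2.0 * residual * error_rate / n
        # Plain gradient descent step on the mean squared imitation error.
        w[0] -= lr * grad[0]
        w[1] -= lr * grad[1]
    return w


demos = collect_demonstrations()
actor_weights = behaviour_clone(demos)  # would initialise the actor before RL training
```

Because the expert here is itself linear, the cloned actor recovers the expert gains almost exactly; with a nonlinear expert or a neural network actor, the same supervised loss applies but the fit is only approximate.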


  • Language: English

Filing Info

  • Accession Number: 01748219
  • Record Type: Publication
  • Files: TRIS
  • Created Date: Aug 5 2020 4:19PM