Safe Reinforcement Learning-based Driving Policy Design for Autonomous Vehicles on Highways

Hung Duy Nguyen, Kyoungseok Han

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

A safe decision-making strategy for autonomous vehicles (AVs) plays a critical role in avoiding accidents. This study develops a safe reinforcement learning (safe-RL)-based driving policy for AVs on highways. The proposed safe-RL uses a hierarchical framework: an upper layer performs safe exploration-exploitation by modifying the exploration process of the ε-greedy algorithm, and a lower layer uses a finite state machine (FSM) to establish the safe conditions for state transitions. The proposed safe-RL-based driving policy improves the vehicle's safe driving ability using a Q-table that stores the value of each state-action pair. Moreover, owing to the trade-off between the ε-greedy value and the safe-distance threshold, the simulation results demonstrate that the proposed approach outperforms other RL approaches, such as ε-greedy Q-learning (GQL) and decaying ε-greedy Q-learning (DGQL), in an uncertain traffic environment. This study's contributions are twofold: it improves the autonomous vehicle's exploration-exploitation and safe driving ability while exploiting the advantages of the FSM when surrounding cars are inside safe-driving zones, and it analyzes the impact of the safe-RL parameters on exploring the environment safely.
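The abstract does not spell out how the ε-greedy exploration is modified, so the following is only a minimal sketch of one common way to combine a Q-table with a safety check: exploration and exploitation are both restricted to actions whose target-lane gap meets a safe-distance threshold. The action names, the `gaps` dictionary, the fallback to lane keeping, and the values of `alpha` and `gamma` are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict

# Hypothetical discrete action set for a highway driving agent.
ACTIONS = ("keep_lane", "change_left", "change_right")


def safe_actions(gaps, safe_distance):
    """Return the actions whose target-lane gap (m) meets the threshold."""
    allowed = [a for a in ACTIONS if gaps[a] >= safe_distance]
    # Assumption: fall back to lane keeping when no action passes the check.
    return allowed or ["keep_lane"]


def select_action(q_table, state, gaps, epsilon, safe_distance, rng=random):
    """Epsilon-greedy selection restricted to the safe action subset."""
    allowed = safe_actions(gaps, safe_distance)
    if rng.random() < epsilon:
        return rng.choice(allowed)  # explore, but only among safe actions
    # Exploit: highest Q-value among the safe actions.
    return max(allowed, key=lambda a: q_table[(state, a)])


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on a (state, action) table."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    td_error = reward + gamma * best_next - q_table[(state, action)]
    q_table[(state, action)] += alpha * td_error


# Usage: unseen (state, action) pairs default to a Q-value of 0.0.
q_table = defaultdict(float)
gaps = {"keep_lane": 50.0, "change_left": 5.0, "change_right": 40.0}
action = select_action(q_table, "s0", gaps, epsilon=0.2, safe_distance=20.0)
q_update(q_table, "s0", action, reward=1.0, next_state="s1")
```

In this sketch a larger `safe_distance` shrinks the set of actions the agent may try, while a larger `epsilon` makes it try them more often, which is one way to picture the trade-off between the two parameters that the abstract mentions.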

Original language: English
Pages (from-to): 4098-4110
Number of pages: 13
Journal: International Journal of Control, Automation and Systems
Volume: 21
Issue number: 12
State: Published - Dec 2023

Keywords

  • Autonomous vehicles
  • collision avoidance
  • decision-making
  • finite state machine
  • safe reinforcement learning
