Modified Triplet-Average Deep Deterministic Policy Gradient for interpretable neuro-fuzzy deep reinforcement learning

Research output: Contribution to journal › Article › peer-review

Abstract

To extract the control rules of a nonlinear system from learned data, the policy learned by Deep Reinforcement Learning (DRL) must be interpretable. This paper presents a novel interpretable Neuro-Fuzzy (NF) inference system based on a Modified Triplet-Average Deep Deterministic Policy Gradient (MTADD) reinforcement learning algorithm with a two-phase training method. The first phase explores the system and initializes the Takagi-Sugeno (T-S) fuzzy rules and premise parameters. The second phase performs deep reinforcement learning of the NF policy network using the MTADD algorithm. Experimental results demonstrate that the proposed approach reduces training time, enhances control performance, and increases the interpretability of NF DRL.
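To make the T-S fuzzy structure concrete, the following is a minimal sketch of a zero-order T-S fuzzy inference step of the kind such an NF policy network builds on. All names, the rule count, and the parameter values are illustrative assumptions; the paper's actual NF architecture and MTADD training procedure are not shown here.

```python
import numpy as np

# Illustrative zero-order T-S fuzzy policy (an assumption for this sketch;
# not the paper's network). Each rule i reads:
#   IF x is near center c_i (Gaussian membership) THEN action = w_i.
centers = np.array([[-1.0], [0.0], [1.0]])   # premise parameters: rule centers
sigmas  = np.array([[0.5], [0.5], [0.5]])    # premise parameters: rule widths
weights = np.array([1.0, 0.0, -1.0])         # consequent parameters

def ts_fuzzy_action(x):
    """Compute Gaussian firing strengths, normalize them, and return the
    weighted-sum (defuzzified) action for state x."""
    x = np.asarray(x, dtype=float)
    firing = np.exp(-np.sum(((x - centers) / sigmas) ** 2, axis=1))
    norm = firing / firing.sum()
    return float(norm @ weights)

print(ts_fuzzy_action([0.0]))   # symmetric rules cancel at the origin
```

In the two-phase scheme described in the abstract, the premise parameters (`centers`, `sigmas`) would be initialized in phase one, while phase two would tune the parameters with the MTADD gradient updates.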

Original language: English
Article number: 107653
Journal: Journal of the Franklin Institute
Volume: 362
Issue number: 7
DOIs
State: Published - 1 May 2025

Keywords

  • Interpretable neuro-fuzzy controller
  • Inverted pendulum
  • Reinforcement learning
  • Twin-delay
  • Two-phase training

