TY - JOUR
T1 - Prediction-guided multi-objective reinforcement learning with corner solution search
AU - Ajani, Oladayo S.
AU - Fenyom, Ivan
AU - Darlan, Daison
AU - Mallipeddi, Rammohan
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2025/3
Y1 - 2025/3
N2 - Reinforcement Learning (RL) tasks that feature conflicting objectives are increasingly posed as multi-objective problems and solved using dedicated Multi-Objective RL (MORL) algorithms. In MORL, the aim is to find a set of trade-off policies (the Pareto optimal set) that optimize the featured objectives. To achieve this, several Evolutionary Multi-Objective Optimization (EMO) schemes have been employed in the literature. Although it is well established in the EMO community that the most important sub-tasks required to efficiently approximate the Pareto front (the Pareto set in objective space) are those associated with the corner direction vectors, most MORL schemes do not prioritize these sub-tasks. Therefore, in this paper, we propose a mechanism that prioritizes the sub-tasks resulting from the corner direction weight vectors. Specifically, the sub-tasks are prioritized through a dynamic budget allocation scheme in which larger budgets are assigned to the important sub-tasks in the initial stage of the evolution process. In this way, the Pareto corner solutions can be approximated and contribute towards the effective realization of the optimal Pareto front. The proposed scheme is incorporated into the Prediction-Guided MORL (PGMORL) algorithm, a high-performing evolutionary MORL framework. The resulting algorithm, termed PGMORL with Corner Solution Search (csPGMORL), compares favorably with the baseline PGMORL algorithm on five continuous robot locomotion control problems.
AB - Reinforcement Learning (RL) tasks that feature conflicting objectives are increasingly posed as multi-objective problems and solved using dedicated Multi-Objective RL (MORL) algorithms. In MORL, the aim is to find a set of trade-off policies (the Pareto optimal set) that optimize the featured objectives. To achieve this, several Evolutionary Multi-Objective Optimization (EMO) schemes have been employed in the literature. Although it is well established in the EMO community that the most important sub-tasks required to efficiently approximate the Pareto front (the Pareto set in objective space) are those associated with the corner direction vectors, most MORL schemes do not prioritize these sub-tasks. Therefore, in this paper, we propose a mechanism that prioritizes the sub-tasks resulting from the corner direction weight vectors. Specifically, the sub-tasks are prioritized through a dynamic budget allocation scheme in which larger budgets are assigned to the important sub-tasks in the initial stage of the evolution process. In this way, the Pareto corner solutions can be approximated and contribute towards the effective realization of the optimal Pareto front. The proposed scheme is incorporated into the Prediction-Guided MORL (PGMORL) algorithm, a high-performing evolutionary MORL framework. The resulting algorithm, termed PGMORL with Corner Solution Search (csPGMORL), compares favorably with the baseline PGMORL algorithm on five continuous robot locomotion control problems.
KW - Indicator-based evolutionary algorithm
KW - Multi-objective reinforcement learning
UR - https://www.scopus.com/pages/publications/85211461400
U2 - 10.1016/j.compeleceng.2024.109964
DO - 10.1016/j.compeleceng.2024.109964
M3 - Article
AN - SCOPUS:85211461400
SN - 0045-7906
VL - 122
JO - Computers and Electrical Engineering
JF - Computers and Electrical Engineering
M1 - 109964
ER -