Joint MECHATRONICS 2025, ROBOTICS 2025 Paper Abstract

Paper WeCT4.2

Hamed, Oussama (Aix Marseille University), Labbadi, Moussa (Aix-Marseille University), Zerrougui, Mohamed (Aix Marseille University)

Risk-Aware Decentralized Learning and Control in Multi-Robot Systems

Scheduled for presentation during the Regular Session "Cooperative Multi-Robot Control" (WeCT4), Wednesday, July 16, 2025, 16:50−17:10, Room 108

Joint 10th IFAC Symposium on Mechatronic Systems and 14th Symposium on Robotics, July 15-18, 2025, Paris, France

This information is tentative and subject to change. Compiled on July 16, 2025

Keywords: Multi cooperative robot control, Learning robot control, Modeling and identification

Abstract

Multi-robot systems have the potential to perform a wide variety of tasks and improve the efficiency of task execution. However, a significant challenge arises when multiple robots navigate in shared environments containing unknown zones and static or dynamic obstacles, which increases the risk of collisions. Trajectory planning for such complex environments requires sophisticated methods and high-cost robots, and the costs rise with the number of robots, the environmental complexity, and the demands of the task. To address these challenges, this paper proposes an online decentralized receding horizon approach using an improved Q-learning algorithm for multi-robot systems with integrated risk management. The method uses Q-learning for single-agent path planning to determine collision-free optimal trajectories from initial to final positions. To accelerate the slow convergence of traditional Q-learning, avoid negative rewards from random exploratory actions, and improve learning efficiency, the Artificial Potential Field (APF) method is integrated with Q-learning. Each robot in the swarm applies the improved Q-learning algorithm and maintains its own navigation policy. To ensure decentralized online trajectory planning, robots within the same restricted area exchange their policies through communication. This shared information allows collision-free optimized trajectories to be dynamically replanned. The algorithm follows the receding horizon principle, providing adaptability to changing environments. Numerical simulations validate the proposed method, demonstrating its effectiveness and feasibility in multi-robot trajectory planning under complex and uncertain conditions.
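
The abstract does not give implementation details, so the following is only a minimal single-robot sketch of one way an APF could be combined with tabular Q-learning: the Q-table is seeded from an APF-style potential instead of zeros, so early greedy actions already lean toward the goal and away from obstacles. The grid size, obstacle layout, rewards, hyperparameters, and all function names are illustrative assumptions, not the authors' method. The policy exchange between robots and the receding horizon replanning are not modeled here.

```python
import random

# Hypothetical 5x5 grid world; every constant below is an illustrative assumption.
SIZE = 5
START, GOAL = (0, 0), (4, 4)
OBSTACLES = {(1, 1), (2, 3), (3, 1)}
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1


def in_bounds(cell):
    return 0 <= cell[0] < SIZE and 0 <= cell[1] < SIZE


def potential(cell):
    """APF-style potential (lower is better): attraction to the goal plus
    a repulsive term for each obstacle directly adjacent to the cell."""
    if not in_bounds(cell) or cell in OBSTACLES:
        return 100.0  # forbidden cells get a large potential
    attract = abs(cell[0] - GOAL[0]) + abs(cell[1] - GOAL[1])
    repulse = sum(1 for o in OBSTACLES
                  if abs(cell[0] - o[0]) + abs(cell[1] - o[1]) == 1)
    return attract + 0.5 * repulse


def step(state, action):
    """Return (next_state, reward, done). Blocked moves leave the robot in place."""
    nxt = (state[0] + action[0], state[1] + action[1])
    if not in_bounds(nxt) or nxt in OBSTACLES:
        return state, -5.0, False
    if nxt == GOAL:
        return nxt, 10.0, True
    return nxt, -1.0, False


def train(episodes=500, max_steps=100, seed=0):
    rng = random.Random(seed)
    # Seed each Q-value with the negative potential of the cell the action leads to.
    q = {((r, c), i): -potential((r + a[0], c + a[1]))
         for r in range(SIZE) for c in range(SIZE)
         for i, a in enumerate(ACTIONS)}
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            if rng.random() < EPSILON:  # occasional random exploration
                a_idx = rng.randrange(len(ACTIONS))
            else:  # otherwise act greedily on the current Q-table
                a_idx = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
            nxt, reward, done = step(state, ACTIONS[a_idx])
            best_next = 0.0 if done else max(q[(nxt, i)] for i in range(len(ACTIONS)))
            # Standard Q-learning temporal-difference update.
            q[(state, a_idx)] += ALPHA * (reward + GAMMA * best_next - q[(state, a_idx)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START and return the visited cells."""
    path, state = [START], START
    for _ in range(max_steps):
        a_idx = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        state, _, done = step(state, ACTIONS[a_idx])
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

In this sketch the potential only affects the initial Q-values; the update rule itself is plain Q-learning. Other integrations, such as using the potential to bias action selection or to shape the reward, are equally plausible readings of the abstract.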