Authors :
Rajarshi Tarafdar
Volume/Issue :
Volume 10 - 2025, Issue 3 - March
Google Scholar :
https://tinyurl.com/mpcvx9hd
Scribd :
https://tinyurl.com/3m7zk8nh
DOI :
https://doi.org/10.38124/ijisrt/25mar1376
Abstract :
Embodied and Multi-Agent Reinforcement Learning (MARL) lies at the intersection of artificial intelligence,
robotics, and complex systems theory, enabling multiple agents, whether physical or virtual, to learn coordinated behaviors
through direct interactions with their environments. By leveraging advances in deep reinforcement learning, decentralized
decision-making, and communication protocols, MARL has shown promise in a range of applications such as cooperative
robotics, swarm intelligence, autonomous driving, and large-scale simulations. Unlike single-agent reinforcement learning,
the multi-agent paradigm introduces new layers of complexity: each agent must learn to navigate both the environment and
the dynamic behavior of peers or competitors, often under conditions of partial observability and limited communication.
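For reference, the partially observable multi-agent setting just described is conventionally formalized as a decentralized partially observable Markov decision process (Dec-POMDP). The tuple below uses standard notation; it is a reference sketch in common notation, not a definition quoted from this paper.

```latex
% Dec-POMDP tuple in standard notation (a reference sketch, not quoted from the paper)
\[
\mathcal{M} = \langle \mathcal{N},\ \mathcal{S},\ \{\mathcal{A}_i\}_{i \in \mathcal{N}},\ T,\ R,\ \{\Omega_i\}_{i \in \mathcal{N}},\ O,\ \gamma \rangle
\]
% N         : finite set of agents
% S         : state space
% A_i       : action space of agent i
% T(s'|s,a) : transition kernel over joint actions a = (a_1, ..., a_n)
% R(s,a)    : shared team reward
% Omega_i   : observation space of agent i
% O(o|s',a) : observation kernel
% gamma     : discount factor in [0, 1)
```

Each agent must act on its own local observation history rather than the full state, which is precisely the source of the non-stationarity and coordination difficulties noted above.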
This paper offers a comprehensive review and analysis of key topics driving progress in MARL. We begin by exploring
social learning and emergent communication, focusing on how agents learn to share information or signals that enhance
teamwork. We then delve into Sim2Real transfer approaches, critical for bridging the gap between simulation-based
training and real-world deployments, particularly in safety-critical domains. Hierarchical reinforcement learning serves as
a powerful framework to handle tasks at varying levels of complexity and abstraction, improving interpretability and sample
efficiency (brief illustrative sketches of both ideas follow the abstract). Lastly, we examine safety and robustness challenges, including adversarial interactions, non-stationarity, and
explicit constraints that must be integrated into multi-agent systems. By highlighting the underlying mathematical
formalisms, empirical methods, and open research questions, this paper aims to map out current trends and future directions
in Embodied and Multi-Agent Reinforcement Learning.
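To make the Sim2Real discussion concrete, the sketch below illustrates domain randomization, a widely used transfer technique in which simulator parameters are resampled every episode so a policy cannot overfit to a single simulated world. All names here (SimEnv, randomize_dynamics, the parameter ranges) are illustrative assumptions, not an API or configuration taken from the paper.

```python
import random

# Minimal domain-randomization sketch for Sim2Real transfer.
# SimEnv is a hypothetical stand-in for a physics simulator; only the
# per-episode resampling pattern is the point.

class SimEnv:
    """Toy simulated environment with tunable physical parameters."""

    def __init__(self, friction=0.5, mass=1.0, sensor_noise=0.01):
        self.friction = friction
        self.mass = mass
        self.sensor_noise = sensor_noise

    def reset(self):
        # Return an initial observation corrupted by sensor noise.
        return [random.gauss(0.0, self.sensor_noise) for _ in range(4)]


def randomize_dynamics(env):
    """Resample simulator parameters so each episode sees a different 'world'."""
    env.friction = random.uniform(0.2, 1.0)       # assumed plausible range
    env.mass = random.uniform(0.5, 2.0)           # assumed plausible range
    env.sensor_noise = random.uniform(0.0, 0.05)  # assumed noise level


def train(num_episodes=1000):
    env = SimEnv()
    for _ in range(num_episodes):
        randomize_dynamics(env)  # new dynamics before every episode
        obs = env.reset()
        # ... roll out the current policy and apply the RL update here ...


if __name__ == "__main__":
    train(num_episodes=10)
```

A policy that performs well across the whole randomized family of simulators is more likely to treat the real world as just one more sample from that family, which is the intuition behind the technique.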
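Similarly, the hierarchical decomposition mentioned above can be pictured as a two-timescale control loop: a high-level policy selects abstract subgoals at a coarse interval, and a low-level policy emits primitive actions conditioned on the current subgoal. The sketch below is a hypothetical illustration of that structure only; the policies are random placeholders, and none of the names come from the paper.

```python
import random

# Two-level hierarchical control loop (illustrative placeholders, not the
# paper's method). The manager acts every `subgoal_horizon` steps; the worker
# acts every step, conditioned on the manager's latest subgoal.

SUBGOALS = ["explore", "regroup", "deliver"]
ACTIONS = ["left", "right", "forward", "signal"]


def high_level_policy(state):
    """Pick an abstract subgoal; stands in for a learned manager policy."""
    return random.choice(SUBGOALS)


def low_level_policy(state, subgoal):
    """Pick a primitive action given the subgoal; stands in for a learned worker."""
    return random.choice(ACTIONS)


def episode(steps=20, subgoal_horizon=5):
    state, subgoal = 0, None
    for t in range(steps):
        if t % subgoal_horizon == 0:  # manager acts on a coarser timescale
            subgoal = high_level_policy(state)
        action = low_level_policy(state, subgoal)
        state += 1  # placeholder environment transition
        print(f"t={t:2d}  subgoal={subgoal:8s}  action={action}")


if __name__ == "__main__":
    episode()
```

Because the worker only has to solve short-horizon problems within each subgoal, this decomposition is one route to the sample-efficiency and interpretability gains the abstract attributes to hierarchical RL.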
Keywords :
Adversarial Interactions, Embodied AI, Hierarchical RL, Multi-Agent Coordination, Reinforcement Learning, Sim2Real Transfer.