Comparative Analysis of Multi-Agent System Methodologies for Accountability using LLM-Based Agents
Keywords:
Multi-Agent Systems, Accountability, Theory of Mind, Reinforcement Learning, LLM Agents, Explainable AI, AutoGen, Supply Chain Management
Abstract
This pilot study presents a comparative analysis of three multi-agent system (MAS) methodologies, namely Theory of Mind (ToM), Multi-Agent Reinforcement Learning (MARL), and Hierarchical Supervision, evaluated on accountability metrics in a supply chain disruption scenario. Using LLM-based agents powered by Groq's free API (Llama 3.1-8B), we conducted three experimental trials per methodology with two agents each. Results indicate that Theory of Mind achieves the highest overall accountability score (82.3%), excelling in decision attribution (100%) and explainability (90.6%), while Hierarchical Supervision demonstrates superior responsibility accuracy (94.5%). Notably, MARL shows critical limitations (40.4%), including a 0% decision attribution rate, confirming its black-box nature. This study establishes metrics, methods, and preliminary findings using a reproducible framework built on free-tier APIs to enable community replication. Because the study is limited to three trials per methodology, these findings warrant larger-scale investigation to establish statistical significance and generalizability across multiple scenarios and LLM architectures.
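As a rough illustration of how per-trial accountability metrics might be aggregated across methodologies, the following minimal sketch averages each metric over trials and computes an overall score. The unweighted mean, the metric keys, the function names, and all numeric values are assumptions made for illustration; the abstract does not state how the overall accountability score is computed, and the placeholder numbers are not results from the study.

```python
from statistics import mean

# Placeholder per-trial scores in [0, 1]. These are NOT results from the study;
# the method names and values are hypothetical and chosen only for illustration.
PLACEHOLDER_TRIALS = {
    "method_a": [
        {"decision_attribution": 1.0, "explainability": 0.5, "responsibility_accuracy": 0.0},
        {"decision_attribution": 1.0, "explainability": 0.5, "responsibility_accuracy": 0.0},
        {"decision_attribution": 1.0, "explainability": 0.5, "responsibility_accuracy": 0.0},
    ],
    "method_b": [
        {"decision_attribution": 1.0, "explainability": 1.0, "responsibility_accuracy": 1.0},
        {"decision_attribution": 1.0, "explainability": 1.0, "responsibility_accuracy": 1.0},
        {"decision_attribution": 0.25, "explainability": 0.25, "responsibility_accuracy": 0.25},
    ],
}


def overall_accountability(trial_scores):
    """Unweighted mean of one trial's metrics (an assumed aggregation rule)."""
    return mean(trial_scores.values())


def summarize(trials_by_method):
    """Average every metric, plus the overall score, across trials per methodology."""
    summary = {}
    for method, runs in trials_by_method.items():
        metric_names = runs[0].keys()
        per_metric = {name: mean(run[name] for run in runs) for name in metric_names}
        per_metric["overall"] = mean(overall_accountability(run) for run in runs)
        summary[method] = per_metric
    return summary


if __name__ == "__main__":
    for method, scores in summarize(PLACEHOLDER_TRIALS).items():
        print(method, {name: round(value, 3) for name, value in scores.items()})
```

With only three trials per methodology, means of this kind describe the sample rather than support significance claims, which matches the pilot framing above.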
License
Copyright (c) 2026 DMPedia Lecture Notes in Multidisciplinary Research

This work is licensed under a Creative Commons Attribution 4.0 International License.