SOMAS
A Trusted Human-Multi-Agent Reinforcement Learning Interaction Framework
Safety-Oriented Multi-agent System
📖 Introduction
This repository implements a Multi-Agent System (MAS) framework for human-machine collaborative crisis response. It combines vision-language (VL) models with reinforcement learning (RL) to improve safety and reliability. The framework features:
- Real-Time Task Execution: Modular task chains with built-in safety rules and human oversight.
- Simulation Training: Experience replay library for risk prediction and optimization.
- Dynamic Trust Mechanism: Balances task utility and safety constraints through RL.
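One way to picture the utility-versus-safety trade-off in the Dynamic Trust Mechanism is a penalized reward whose penalty weight adapts to observed risk. The sketch below uses a Lagrangian-style penalty, which is an assumption chosen for illustration and not necessarily the method used in SOMAS. The class `TrustWeightedReward` and its parameters `risk_budget`, `step_size`, and `penalty` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class TrustWeightedReward:
    """Hypothetical reward shaping: utility minus an adaptive safety penalty."""

    risk_budget: float = 0.1  # tolerated risk per step (assumed value)
    step_size: float = 0.5    # how quickly the penalty weight adapts (assumed value)
    penalty: float = 1.0      # current weight on risk; kept non-negative

    def reward(self, utility: float, risk: float) -> float:
        # Scalar reward that would be handed to the RL learner.
        return utility - self.penalty * risk

    def update(self, observed_risk: float) -> None:
        # Raise the penalty when risk exceeds the budget, relax it otherwise.
        delta = self.step_size * (observed_risk - self.risk_budget)
        self.penalty = max(0.0, self.penalty + delta)


shaper = TrustWeightedReward()
r_before = shaper.reward(utility=1.0, risk=0.5)
shaper.update(observed_risk=0.5)  # risk above budget, so the penalty grows
r_after = shaper.reward(utility=1.0, risk=0.5)
print(r_before, shaper.penalty, r_after)
```

With this shaping, the same risky action earns less reward after the system has observed risk above its budget, which pushes the policy toward safer behavior without hard-coding a fixed trade-off.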
Key Contributions:
- Dual-mode architecture (online execution + offline simulation).
- First fine-tuned safe LLM and training dataset for emergency scenarios.
- 15% improvement in helpfulness and 40% reduction in risk response rate compared to the baseline.
🚀 Quick Start
Installation
git clone https://github.com/erwinmsmith/SOMAS.git
cd SOMAS
pip install -r requirements.txt
Usage
- Real-Time Task Execution:
python main.py --online
- Simulation Training:
python main.py --offline
🧠 Framework Architecture

Core Components
- Online Execution System
  - Planning-Execution Pipeline: Modular task chains drive tool operations.
  - Safety Guardrails: Predefined rules and GPT-4-based risk assessment.
- Offline Simulation System
  - Task Generation: Synthetic tasks from manual records and prior knowledge.
  - Experience Replay: Optimizes RL policies for dynamic environments.
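The two core components can be sketched together in a few lines: a task chain whose steps are checked against predefined rules, with rule hits escalated to a human reviewer, and a bounded buffer that records each decision for later replay. This is a minimal illustration under stated assumptions, not the repository's actual code. `Step`, `forbid_tool`, `run_chain`, and the `approve` callback are hypothetical names, the GPT-4-based risk assessment is omitted, and feeding online decisions into the replay buffer is an assumption made for the example.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Step:
    tool: str
    argument: str


# A rule returns a reason string when the step violates it, else None.
Rule = Callable[[Step], Optional[str]]


def forbid_tool(name: str) -> Rule:
    def rule(step: Step) -> Optional[str]:
        return f"tool '{name}' is not allowed" if step.tool == name else None
    return rule


def run_chain(steps, rules, approve, replay):
    """Run steps in order; any rule hit is escalated to a human reviewer."""
    executed = []
    for step in steps:
        reasons = [msg for rule in rules if (msg := rule(step)) is not None]
        allowed = not reasons or approve(step, reasons)
        # Record the decision as an experience for offline training.
        replay.append((step, tuple(reasons), allowed))
        if not allowed:
            break  # stop the chain at the first rejected step
        executed.append(step)
    return executed


replay = deque(maxlen=1000)  # bounded buffer: oldest experiences drop out first
chain = [
    Step("camera", "scan corridor"),
    Step("door_lock", "open all"),
    Step("radio", "report status"),
]
done = run_chain(
    chain,
    rules=[forbid_tool("door_lock")],
    approve=lambda step, reasons: False,  # stand-in for a human who rejects
    replay=replay,
)
batch = random.sample(list(replay), k=min(2, len(replay)))
print([s.tool for s in done], len(replay), len(batch))
```

Here the chain halts at the `door_lock` step because the stand-in reviewer rejects it, so the `radio` step never runs. Sampling from the buffer with `random.sample` mirrors how a replay buffer would supply minibatches to an RL update.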
📊 Experimental Results
Key Metrics
| Domain | Model | Safety (↑) | Helpfulness (↑) | Risk Response Rate (↓) |
|-------------|--------------------|------------|-----------------|-------------------------|
| Safety-CV | Qwen2-7B-VL | 4.5 | 4.7 | 40% |
Highlights
- VL models reduced operational risks by 30% via image semantic parsing.
- Dynamic safety validation improved helpfulness by 15% over ToolEmu.
Others
If you need the detailed Safety data (train or test), contact me at duanzhenke@sscapewh.com or duanzhenke03@gmail.com.
