CoBRA
[CHI26 Best Paper] CoBRA: Reproducible control of LLM agent behavior via classic social science experiments
<video src="https://github.com/user-attachments/assets/028ca8f4-6edd-426e-b436-2c3b796d81a0" controls width="700"></video>
💡 What is Cognitive Bias?
Cognitive biases are systematic deviations from rational judgment in human cognition and decision-making. For example, in the Framing Effect, "90% survival rate" and "10% mortality rate" are logically identical, yet people make different choices depending on how the information is framed.
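
As a small illustration of why the two framings carry the same information, the sketch below derives both descriptions from a single survival probability. The function name `framed_descriptions` is hypothetical and is not part of CoBRA.

```python
def framed_descriptions(survival_rate: float) -> tuple[str, str]:
    """Return a gain-framed and a loss-framed description of the same outcome."""
    if not 0.0 <= survival_rate <= 1.0:
        raise ValueError("survival_rate must be between 0 and 1")
    survival_pct = round(survival_rate * 100)
    mortality_pct = 100 - survival_pct  # the complementary event, same information
    return (f"{survival_pct}% survival rate", f"{mortality_pct}% mortality rate")


gain_frame, loss_frame = framed_descriptions(0.9)
print(gain_frame)
print(loss_frame)
```

Both strings describe one underlying probability, so any systematic difference in how an agent responds to them reflects the framing rather than the facts.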
Reproducibility and controllability are fundamental to scientific research. Yet implicit natural language descriptions, the dominant approach for specifying social agent behaviors in nearly all LLM-based social simulations, often fail to yield consistent behavior across models or capture the nuances of the descriptions.
CoBRA (Cognitive Bias Regulator for Social Agents) is a novel toolkit that lets researchers explicitly specify desired nuances in LLM-based agents and obtain consistent behavior across models.
Through CoBRA, we show how to operationalize validated social science knowledge as reusable "gym" environments for AI, an approach that generalizes to richer social and affective simulations.
<p align="center"> <img src="figures/fig1.png" alt="CoBRA Overview" width="800"/> <br> <em>The problem and our solution: from inconsistent agent behaviors under implicit specifications to explicit, quantitative control.</em> </p>

At the heart of CoBRA is a novel closed-loop system with two core components:
- Cognitive Bias Index: measures the cognitive bias of a social agent by quantifying its reactions in validated classic social science experiments
- Behavioral Regulation Engine: aligns the agent's behavior to exhibit controlled cognitive bias, via three control methods:
  - Prompt Engineering (input space control)
  - Representation Engineering (activation space control)
  - Fine-tuning (parameter space control)
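
The closed loop formed by these two components can be pictured as a simple feedback controller: measure the bias, compare it with a target, adjust a control knob, and repeat. The sketch below is only an illustration under stated assumptions, not CoBRA's implementation. `ToyAgent`, `measure_bias_index`, `regulate`, the linear toy response, and the proportional update rule are all hypothetical. The 0-4 range is taken from the control range listed for the supported biases.

```python
from dataclasses import dataclass


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an LLM agent with one scalar control knob."""
    control_strength: float = 0.0  # e.g. a prompt intensity or steering coefficient

    def respond(self, scenario_id: int) -> float:
        # Toy behavior: the scored reaction (0-4) grows linearly with the knob.
        # A real agent would answer an experiment scenario and then be scored.
        return max(0.0, min(4.0, 1.0 + 2.0 * self.control_strength))


def measure_bias_index(agent: ToyAgent, n_scenarios: int = 5) -> float:
    """Average the scored reactions across scenarios (an assumed aggregation)."""
    scores = [agent.respond(i) for i in range(n_scenarios)]
    return sum(scores) / len(scores)


def regulate(agent: ToyAgent, target: float, gain: float = 0.25,
             tolerance: float = 0.05, max_steps: int = 50) -> float:
    """Adjust the control knob until the measured index is near the target."""
    for _ in range(max_steps):
        error = target - measure_bias_index(agent)
        if abs(error) <= tolerance:
            break
        agent.control_strength += gain * error  # proportional correction
    return measure_bias_index(agent)


agent = ToyAgent()
final_index = regulate(agent, target=3.0)
print(f"final bias index: {final_index:.2f}")
```

In this picture the three control methods differ only in what the knob is: prompt wording, an activation-space steering coefficient, or fine-tuned parameters.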
Quick Start (3 Steps)
# 1. Install dependencies
pip install -r requirements.txt
# 2. Navigate to the unified bias control module
cd examples/unified_bias
# 3. Run a bias experiment
python pipelines.py --bias authority --method repe-linear --model Mistral-7B
That's it. The system will measure and control the agent's Authority Effect bias.
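
To repeat the same run for several biases, one option is to build the command line programmatically. The sketch below only prints the commands and does not execute anything. The flags `--bias`, `--method`, and `--model` and the values `authority`, `repe-linear`, and `Mistral-7B` come from the command above. The other bias names are assumptions based on the directory names under `examples/` and should be checked against examples/unified_bias/README.md. `build_command` is a hypothetical helper.

```python
import shlex

# Only "authority" appears in the Quick Start command; the rest are assumed
# from the example directory names and may not match the real CLI values.
BIASES = ["authority", "bandwagon", "confirmation", "framing"]


def build_command(bias: str, method: str = "repe-linear",
                  model: str = "Mistral-7B") -> list[str]:
    """Return the pipeline invocation as an argument list."""
    return ["python", "pipelines.py",
            "--bias", bias, "--method", method, "--model", model]


for bias in BIASES:
    # shlex.join quotes arguments safely for copy-pasting into a shell.
    print(shlex.join(build_command(bias)))
```

To actually launch a run, the argument list can be passed to `subprocess.run(cmd, check=True, cwd="examples/unified_bias")`.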
Repository Structure
CoBRA/
├── control/                  # Core bias control engine
├── examples/
│   ├── unified_bias/         # Main entry point (START HERE)
│   │   ├── pipelines.py      # Unified experiment runner
│   │   ├── run_pipelines.py  # CLI interface
│   │   ├── ablation/         # Ablation studies
│   │   └── README.md         # Full usage guide
│   ├── authority/            # Authority Effect utils
│   ├── bandwagon/            # Bandwagon Effect utils
│   ├── confirmation/         # Confirmation Bias utils
│   └── framing/              # Framing Effect utils
├── generator/                # Data generation utilities
├── data_generated/           # Generated experimental data
├── webdemo/                  # Web demonstration interface
└── requirements.txt          # Python dependencies
Key Components
| Component | Description | Documentation |
|-----------|-------------|---------------|
| Cognitive Bias Index | Measures bias strength via classic experiments | data/data_README.md |
| Behavioral Regulation Engine | Three control methods (Prompt/RepE/Finetune) | control/control_README.md |
| Unified Pipeline | Run full experiments with one command | examples/unified_bias/README.md |
| Ablation Studies | Test model/persona/temperature sensitivity | examples/unified_bias/ablation/README.md |
| Data Generator | Create custom bias scenarios and responses | generator/README.md |
Supported Biases & Experiments
| Bias Type | Paradigms | Data Directory | Control Range |
|-----------|-----------|----------------|---------------|
| Authority Effect | Milgram Obedience, Stanford Prison | data/authority/ | 0-4 scale |
| Bandwagon Effect | Asch's Line, Hotel Towel | data/bandwagon/ | 0-4 scale |
| Confirmation Bias | Wason Selection, Biased Information | data/confirmation/ | 0-4 scale |
| Framing Effect | Asian Disease, Investment/Insurance | data/framing/ | 0-4 scale |
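
When scripting experiments, it can help to encode the table above as data and reject out-of-range targets before launching a run. The following is a minimal sketch, not part of CoBRA. `BiasSpec`, `SUPPORTED`, `validate_target`, and the dictionary keys are hypothetical names. The paradigms, data directories, and 0-4 range are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiasSpec:
    """One row of the supported-biases table."""
    name: str
    paradigms: tuple[str, ...]
    data_dir: str
    min_level: float = 0.0  # lower end of the 0-4 control range
    max_level: float = 4.0  # upper end of the 0-4 control range


# Keys are illustrative identifiers, not confirmed CLI values.
SUPPORTED = {
    "authority": BiasSpec("Authority Effect",
                          ("Milgram Obedience", "Stanford Prison"),
                          "data/authority/"),
    "bandwagon": BiasSpec("Bandwagon Effect",
                          ("Asch's Line", "Hotel Towel"),
                          "data/bandwagon/"),
    "confirmation": BiasSpec("Confirmation Bias",
                             ("Wason Selection", "Biased Information"),
                             "data/confirmation/"),
    "framing": BiasSpec("Framing Effect",
                        ("Asian Disease", "Investment/Insurance"),
                        "data/framing/"),
}


def validate_target(bias_key: str, level: float) -> BiasSpec:
    """Return the spec for a bias, or raise if the target level is out of range."""
    spec = SUPPORTED[bias_key]  # raises KeyError for an unsupported bias
    if not spec.min_level <= level <= spec.max_level:
        raise ValueError(
            f"{spec.name}: target {level} is outside "
            f"{spec.min_level}-{spec.max_level}")
    return spec


print(validate_target("framing", 2.5).data_dir)
```

Keeping the range next to each bias makes it straightforward to adjust if a future bias uses a different scale.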
Citation
If you use CoBRA in your research, please cite our paper:
@article{liu2025cobra,
title={CoBRA: Programming Cognitive Bias in Social Agents Using Classic Social Science Experiments},
author={Liu, Xuan and Shang, Haoyang and Jin, Haojian},
journal={arXiv preprint arXiv:2509.13588},
year={2025}
}
Paper Link: https://arxiv.org/abs/2509.13588
License
MIT License - see LICENSE for details
Contact
For questions, please contact the corresponding author Xuan Liu at xul049@ucsd.edu, or file a GitHub Issue to report bugs and request features.
Need help? Check examples/unified_bias/README.md for detailed walkthroughs. The fine-tuning code is in the finetuning branch.
