Optimas
(ICLR 2026) Optimas: Optimizing Compound AI Systems
NEWS
- [Jul 2025] We release Optimas!
What is Optimas?
Optimas is a unified framework for end-to-end optimization of compound AI systems. While traditional optimization methods focus on single configuration types—such as prompts or hyperparameters—modern compound AI systems require coordinated optimization across multiple heterogeneous configuration types that work well together.
Optimas addresses this fundamental challenge through its core innovation: Globally Aligned Local Reward Functions (LRFs) that align each component's optimization with global system performance. This enables efficient, decentralized optimization while ensuring that local improvements contribute meaningfully to global rewards, backed by formal theoretical guarantees.
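The alignment idea can be sketched with a toy two-component system (everything below is made up for illustration and is not the Optimas API): a local reward for one component is "globally aligned" when improving it, with the other components held fixed, also improves the end-to-end metric.

```python
# Toy illustration of globally aligned local rewards (not the Optimas API).
def component_a(x, a):           # first component; `a` is its tunable variable
    return x * a

def component_b(y, b):           # second component; `b` is its tunable variable
    return y + b

def global_metric(out, target):  # end-to-end system score
    return -abs(out - target)

# A local reward for component A that proxies the global metric with the
# other component's variable held fixed -- the "alignment" property.
def local_reward_a(a, x=2, b=1, target=11):
    return global_metric(component_b(component_a(x, a), b), target)

# Decentralized step: search only A's variable against its local reward.
best_a = max([1, 2, 3, 4, 5], key=local_reward_a)
```

Because the local reward tracks the global metric, optimizing component A in isolation still moves the full system toward better end-to-end performance.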
🔥 Check out our website for a full overview!
0. Set up API keys
export OPENAI_API_KEY=[YOUR KEY]
export ANTHROPIC_API_KEY=[YOUR KEY]
1. Generate Preference Data (used for reward model and optimization)
python -m scripts.generate_reward_dataset scripts/configs/generate/{dataset}.yaml
This runs reward-data generation over a given dataset and system. Output: a HuggingFace-style reward dataset saved locally.
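The saved dataset consists of per-component preference pairs; a single record might look roughly like the following (the field names here are illustrative, not the exact Optimas schema):

```python
# Hypothetical preference record: a shared input plus a chosen/rejected
# output pair for one component (field names are illustrative only).
record = {
    "component": "question_rewriter",
    "context": {"question": "Who founded SpaceX?"},
    "chosen": "output that led to a higher global score",
    "rejected": "output that led to a lower global score",
}
```

Records like this are what the reward-model training step in the next section consumes.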
2. Train Initial Reward Model (Local Reward Functions)
CUDA_VISIBLE_DEVICES=2,3,4,5 torchrun --master_port=56781 --nnodes=1 --nproc_per_node=4 -m scripts.train_reward_model scripts/configs/train/{dataset}.yaml
where nnodes is the number of nodes, and nproc_per_node is the number of GPUs per node.
Trains a reward model using preference data. You need to include WANDB_ENTITY and WANDB_PROJECT in the .env file or export them in your shell:
export WANDB_ENTITY=your_wandb_entity
export WANDB_PROJECT=your_wandb_project
3. Run Optimization (Prompts, PPO LoRA, Hyperparameters)
CUDA_VISIBLE_DEVICES=6 torchrun --master_port=56790 --nnodes=1 --nproc_per_node=1 -m scripts.optimize_system scripts/configs/optimize/{dataset}.yaml
Uses Globally Aligned Local Reward Functions (LRFs) to optimize component variables. Supports:
- prompt tuning (opro, mipro, copro)
- hyperparameter search
- PPO for local models via LoRA (works with vLLM + the OpenAI API)

Each component can be optimized independently or jointly.
Remember to include WANDB_ENTITY and WANDB_PROJECT in the .env file or export them in your shell.
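As a concrete picture of the hyperparameter-search mode, a minimal grid search against a stand-in reward might look like this (the search space and scoring function below are made up; in Optimas the scorer would be a trained LRF):

```python
import itertools

# Toy search space for a hypothetical model-config component.
search_space = {"temperature": [0.0, 0.7], "model": ["gpt-4o-mini", "gpt-4o"]}

def local_reward(cfg):
    # Stand-in for a trained local reward function scoring a configuration.
    return (cfg["temperature"] == 0.0) + (cfg["model"] == "gpt-4o")

# Enumerate the full grid and keep the configuration with the best reward.
grid = [dict(zip(search_space, values))
        for values in itertools.product(*search_space.values())]
best_cfg = max(grid, key=local_reward)
```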
4. Evaluate Final System
python scripts/eval_system.py scripts/configs/eval/{dataset}.yaml
Evaluates a saved system state dict on the val/test sets. Supports repeated test runs for randomized components.
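For randomized components, a single evaluation run is a noisy estimate, so repeated test runs are averaged; roughly:

```python
import random
import statistics

def noisy_accuracy(seed):
    # Stand-in for one evaluation run of a system with randomized components.
    rng = random.Random(seed)
    return 0.80 + rng.uniform(-0.05, 0.05)

# "Test repeat": evaluate several times and report the mean and spread
# instead of trusting a single run.
scores = [noisy_accuracy(seed) for seed in range(5)]
mean, stdev = statistics.mean(scores), statistics.stdev(scores)
```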
Component Types Supported
- Prompt templates (as strings)
- Model config (e.g., model name, temperature)
- Hyperparameters (grid search)
- Local LLM weights (LoRA + PPO finetuning)
Each component declares:
- input_fields
- output_fields
- variable (what to optimize)
- variable_search_space (optional)
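Put together, one component declaration might look like this plain-dict sketch (the concrete class and constructor in Optimas may differ):

```python
# Illustrative declaration of a prompt-template component.
qa_component = {
    "input_fields": ["question", "context"],
    "output_fields": ["answer"],
    # The variable is what gets optimized -- here, a prompt template string.
    "variable": "Answer the question using only the given context.",
    # Optional: a discrete search space, e.g. for hyperparameter components.
    "variable_search_space": None,
}
```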
Adding Your Own System
- Define your pipeline in examples/systems/<your_system>.py as system_engine()
- Register it in examples/systems/__init__.py
- Add your dataset to examples/datasets/
Example:
def system_engine():
    return CompoundAISystem(
        components={...},
        final_output_fields=[...],
        ground_fields=[...],
        eval_func=...,
    )
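The eval_func compares the system's final output fields against the ground fields; a minimal exact-match scorer might look like this (the exact signature CompoundAISystem expects may differ):

```python
def exact_match(pred: str, label: str) -> float:
    # Hypothetical eval_func: 1.0 if the normalized strings match, else 0.0.
    return float(pred.strip().lower() == label.strip().lower())
```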
Reference
@inproceedings{optimas,
title = {Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards},
author = {
Shirley Wu and Parth Sarthi and Shiyu Zhao and
Aaron Lee and Herumb Shandilya and
Adrian Mladenic Grobelnik and Nurendra Choudhary and
Eddie Huang and Karthik Subbian and
Linjun Zhang and Diyi Yang and
James Zou and Jure Leskovec
},
year = {2026},
booktitle = {ICLR},
}