VenusFactory2
An AI agent platform with skills for protein engineering: a beginner-friendly AI tutorial tool for life science professionals.
🤖 Agent-Driven Protein Engineering Platform

One platform, three interfaces, infinite possibilities
</div>

Recent News
- [2026-01-23] Added VenusX to VenusFactory2
- [2025-08-10] Free website released at venusfactory.cn/playground
- [2025-04-19] VenusREM ranked #1 on ProteinGym and VenusMutHub!
🎯 What is VenusFactory2?
VenusFactory2 is an agent-driven protein engineering platform that combines 40+ AI models with 11 biological databases. It is designed for everyone, from biologists to AI researchers.
<p align="center"> <img src="https://img.shields.io/badge/🤖_Agent_Driven-Core-FF6B6B?style=for-the-badge"> <img src="https://img.shields.io/badge/Models-40+-4ECDC4?style=for-the-badge"> <img src="https://img.shields.io/badge/Databases-11+-95E1D3?style=for-the-badge"> <img src="https://img.shields.io/badge/Tools-9_Categories-F38181?style=for-the-badge"> </p>

Why VenusFactory2?
| 🤖 Agent-First | 🎯 Three Interfaces | ⚡ Zero to Results |
|:------------------:|:----------------------:|:---------------------:|
| Natural language → Multi-step automation | Web UI / REST API / CLI | Upload → Predict in seconds |
| 40+ models + 11 databases | Same power, different styles | Or train custom models in minutes |
Easy to Learn: Designed for life science professionals; no programming background is required. An intuitive Web UI, comprehensive bilingual documentation, rich examples, and video tutorials help you grow quickly from beginner to protein AI expert.
💡 Capabilities at a Glance

| Task | Solution | Time |
|:-----|:---------|:-----|
| 🧬 Mutation effects | ESM-2, ProSST, ProtSSN (zero-shot) | <1 min |
| 🎯 Protein function | 30+ fine-tuned models | <30 sec |
| 🔬 Custom training | 7 PEFT methods (LoRA, QLoRA, etc.) | 10-60 min |
| 💾 Data download | AlphaFold, UniProt, RCSB, KEGG, etc. | Real-time |
| Literature | AI-powered search & analysis | <2 min |
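To give a rough sense of why PEFT methods such as LoRA keep custom training short, the sketch below counts trainable parameters for a single weight matrix. It relies on the standard LoRA formulation (a frozen `d_out x d_in` matrix plus a trainable rank-`r` update `B @ A`); the hidden size and rank used here are illustrative assumptions, not VenusFactory2 defaults.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is fine-tuned."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA update B @ A.

    A has shape (rank, d_in) and B has shape (d_out, rank);
    the original weight matrix stays frozen.
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    hidden = 1280  # assumed hidden size, for illustration only
    rank = 8       # assumed LoRA rank, for illustration only
    full = full_params(hidden, hidden)
    lora = lora_params(hidden, hidden, rank)
    print(f"full fine-tuning: {full:,} parameters per matrix")
    print(f"LoRA (r={rank}): {lora:,} parameters per matrix ({lora / full:.2%})")
```

The saving grows with the matrix size, since the LoRA count scales linearly with the dimensions while the full count scales with their product.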
⚡ Quick Start
1. Install
git clone https://github.com/AI4Protein/VenusFactory2.git && cd VenusFactory2
conda create -n venus python=3.12 && conda activate venus
pip install -r requirements.txt # Detailed guide below ↓
2. Launch
# Web UI (Recommended)
python src/webui.py --mode web # → http://localhost:7860
# REST API
python src/api_server.py # → http://localhost:5000/docs
# CLI
bash script/train/train_plm_lora.sh
3. Get Results
<details> <summary><b>🤖 Try Agent-0.1 | ⚡ Quick Tools | 🔬 Train Models</b> (Click to expand examples)</summary>

Agent-0.1 (Natural Language)
Q: "Predict stability for sequence MKTAYIAKQRQISFV..."
→ Agent auto-selects model → Runs prediction → Returns results + explanations
Quick Mutation Scoring
Upload: PDB/FASTA → Mutations: A23V, K45R → Get: Stability scores
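Mutation labels such as `A23V` follow the common convention of wild-type residue, 1-based position, and mutant residue. The sketch below shows one way to parse and sanity-check such labels against a sequence before submitting them. The function names are hypothetical and this is not VenusFactory2's own parser.

```python
import re
from typing import NamedTuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_LABEL = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


class Mutation(NamedTuple):
    wild_type: str
    position: int  # 1-based, as in "A23V"
    mutant: str


def parse_mutation(label: str) -> Mutation:
    """Parse a single-point mutation label such as 'A23V'."""
    match = _LABEL.match(label.strip().upper())
    if match is None:
        raise ValueError(f"Not a single-point mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return Mutation(wild_type, int(position), mutant)


def apply_mutations(sequence: str, labels: list[str]) -> str:
    """Return the mutated sequence, checking each label against the wild type."""
    residues = list(sequence)
    for label in labels:
        m = parse_mutation(label)
        if not 1 <= m.position <= len(sequence):
            raise ValueError(f"{label}: position is outside the sequence")
        found = sequence[m.position - 1]
        if found != m.wild_type:
            raise ValueError(
                f"{label}: expected {m.wild_type} at {m.position}, found {found}"
            )
        residues[m.position - 1] = m.mutant
    return "".join(residues)


if __name__ == "__main__":
    seq = "MKTAYIAKQR"  # short illustrative sequence
    print(apply_mutations(seq, ["A4V", "K8R"]))
```

Checking the wild-type residue up front catches off-by-one numbering mistakes, which are easy to make when a PDB file and a FASTA sequence number residues differently.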
Train Your Model
Model: ESM2-650M → Dataset: DeepSol → Method: LoRA → 15 min → Trained model ✅
</details>
<p align="center">
<video width="70%" controls>
<source src="./img/venusfactory.mp4" type="video/mp4">
</video>
</p>
🤖 Agent-0.1: The Brain
Agent-0.1 orchestrates all tools via natural language. Powered by LangGraph + LangChain.
You: "Design thermostable mutations for PDB:1ABC"
↓
🤖 Agent Planning
↓
📥 Download (RCSB PDB) → 🧬 Predict (ESM-2 scan) → 🎯 Score (Stability) → Report (Ranked list)
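The "scan" and "ranked list" steps in this flow can be pictured as enumerating every single-point mutant and sorting the candidates by a score. The sketch below illustrates that idea only. The `score_fn` argument is a placeholder for whatever model does the scoring, and none of these names come from VenusFactory2.

```python
from typing import Callable, Iterator

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_point_mutants(sequence: str) -> Iterator[tuple[str, str]]:
    """Yield (label, mutated_sequence) for every single-residue substitution."""
    for index, wild_type in enumerate(sequence):
        for mutant in AMINO_ACIDS:
            if mutant == wild_type:
                continue  # skip the unchanged residue
            label = f"{wild_type}{index + 1}{mutant}"  # 1-based position
            yield label, sequence[:index] + mutant + sequence[index + 1:]


def rank_mutants(
    sequence: str,
    score_fn: Callable[[str], float],
    top_k: int = 3,
) -> list[tuple[str, float]]:
    """Score every single-point mutant and return the top_k by descending score."""
    scored = [(label, score_fn(seq)) for label, seq in single_point_mutants(sequence)]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


if __name__ == "__main__":
    # Toy scorer for demonstration only; a real run would call a trained model.
    toy_score = lambda seq: float(seq.count("W"))
    for label, score in rank_mutants("MKTAYIAKQR", toy_score):
        print(label, score)
```

A sequence of length L yields 19 x L candidates, so a scan stays cheap for a single protein but grows quickly if double mutants are added.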
<details>
<summary><b>✨ Agent Capabilities</b></summary>

| Category | Features |
|:---------|:---------|
| 🔬 Analysis | Mutation prediction • Function/stability scoring • Structure analysis |
| 💾 Data | Multi-database search • Format conversion • Batch processing |
| 🧠 Planning | Multi-step automation • Tool orchestration • Error handling |
| Research | Literature mining • Family analysis • Report generation |
</details> <details> <summary><b>💬 Example Conversations</b></summary>

Mutation Design:
You: "Improve thermostability of MKTAYIAKQR..."
Agent: → ESM-2 scanning... → Stability scoring...
→ Top 3: A5V (+2.8 kcal/mol), K9R (+1.9), T2S (+1.5)
Database Search:
You: "Find lysozyme structures <2.0 Å resolution"
Agent: → Searching RCSB... → Found 47 structures
→ Downloaded to: temp_outputs/lysozyme_structures/
</details>
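For readers curious what the database search in the example conversation might look like as a direct request, here is a minimal sketch against the public RCSB Search API. The endpoint, the `full_text` and `text` services, and the `rcsb_entry_info.resolution_combined` attribute reflect my understanding of that API and should be checked against the official RCSB documentation. This is not necessarily how Agent-0.1 performs the search.

```python
import json
import urllib.request

# Assumed endpoint of the RCSB Search API (v2); verify against the RCSB docs.
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def build_query(keyword: str, max_resolution: float, rows: int = 25) -> dict:
    """Build a search payload: keyword match AND resolution below a cutoff (in Å)."""
    return {
        "query": {
            "type": "group",
            "logical_operator": "and",
            "nodes": [
                {
                    "type": "terminal",
                    "service": "full_text",
                    "parameters": {"value": keyword},
                },
                {
                    "type": "terminal",
                    "service": "text",
                    "parameters": {
                        "attribute": "rcsb_entry_info.resolution_combined",
                        "operator": "less",
                        "value": max_resolution,
                    },
                },
            ],
        },
        "return_type": "entry",
        "request_options": {"paginate": {"start": 0, "rows": rows}},
    }


def run_query(payload: dict, timeout: float = 30.0) -> list[str]:
    """POST the payload and return entry IDs (response layout is an assumption)."""
    request = urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        if response.status == 204:  # no matches: empty body
            return []
        body = json.load(response)
    return [hit["identifier"] for hit in body.get("result_set", [])]


if __name__ == "__main__":
    # Only prints the payload; call run_query(...) yourself to hit the network.
    print(json.dumps(build_query("lysozyme", 2.0), indent=2))
```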
💡 Note: Agent-0.1 requires an API key (OpenAI/Anthropic) and is currently in beta.
Architecture
Interfaces: Web UI | REST API | CLI
↓
🤖 Agent Layer (LangGraph + LangChain)
↓
🔧 Application: Train | Eval | Predict | Tools
↓
🛠️ Core Tools: 9 categories (mutation, database, search, etc.)
↓
Resources: 40+ Models | 30+ Datasets | 11+ Databases
<details>
<summary><b>Integrated Resources</b></summary>

Models (40+): ESM, ProtBert, ProtT5, Venus/PETA/ProSST series

Databases (11+): AlphaFold • RCSB PDB • UniProt • NCBI • KEGG • STRING • BRENDA • ChEMBL • HPA • FDA • Foldseek

Datasets (30+): Function • Localization • Stability • Solubility • Mutation fitness
</details> <details> <summary><b>🔧 Tool Categories</b></summary>

| Tool | Description | Agent | CLI |
|:-----|:------------|:-----:|:---:|
| 🧬 Mutation | ESM-1v, ESM-2, ProSST, ProtSSN, MIF-ST | โ | โ |
| 🎯 Prediction | 30+ fine-tuned models | โ | โ |
| 💾 Database | 11 integrations | โ | โ |
| Search | PubMed, FDA, patents | โ | โ |
| Training | LoRA, QLoRA, DoRA, etc. | โ | โ |
| File | Format conversion | โ | โ |
| 🔬 Denovo | Protein design | โ | โ |
| 🧪 Discovery | Novel discovery | โ | โ |
| Visualize | 3D viewer | โ | โ |
</details>

🧬 Supported Models
<details> <summary><b>40+ Protein Language Models</b> (Click to expand)</summary>

Venus Series (Liang's Lab): ProSST-20/128/512/1024/2048/4096 (110M) • ProPrime-690M • VenusPLM-300M • PETA-base/bpe/unigram (80M)

ESM Series (Meta AI): ESM2: 8M, 35M, 150M, 650M, 3B, 15B • ESM-1v: 5 models (650M each)

ProtBert & ProtT5: ProtBert-Uniref100/BFD (420M) • IgBert (420M) • ProtT5-XL/XXL (3B-11B) • Ankh-base/large (450M-1.2B)
Selection Guide:
- GPU <8GB: ESM2-8M/35M, ProSST
- GPU 8-16GB: ESM2-150M/650M, ProtBert
- GPU 24GB+: ESM2-3B, ProtT5-XL
- Multi-GPU: ESM2-15B, ProtT5-XXL
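The GPU tiers above can be written as a small lookup, which is handy when scripting runs across machines. This is a sketch with a hypothetical function name. Note that the guide does not cover the 16-24 GB range, so the code assumes those cards fall back to the 8-16 GB tier.

```python
def suggest_models(gpu_memory_gb: float, multi_gpu: bool = False) -> list[str]:
    """Map available GPU memory to the model suggestions from the selection guide."""
    if multi_gpu:
        return ["ESM2-15B", "ProtT5-XXL"]
    if gpu_memory_gb < 8:
        return ["ESM2-8M", "ESM2-35M", "ProSST"]
    if gpu_memory_gb < 24:
        # The guide lists 8-16 GB; treating 16-24 GB the same is an assumption.
        return ["ESM2-150M", "ESM2-650M", "ProtBert"]
    return ["ESM2-3B", "ProtT5-XL"]


if __name__ == "__main__":
    for memory in (6, 12, 24):
        print(f"{memory} GB:", ", ".join(suggest_models(memory)))
    print("multi-GPU:", ", ".join(suggest_models(80, multi_gpu=True)))
```

Actual memory use also depends on sequence length, batch size, and the training method, so treat the output as a starting point.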
By Task:
- Classification: ESM2, ProtBert
- Structure: Ankh
- Generation: ProtT5
- Antibody: IgBert/IgT5
- Lightweight: ProSST, PETA
</details>

Supported Datasets
<details> <summary><b>30+ Supervised + Zero-Shot Datasets</b></summary>

Zero-Shot: VenusMutHub • ProteinGym (217 DMS)

Function: EC • GO_BP • GO_CC • GO_MF

Localization: DeepLocBinary • DeepLocMulti • DeepLoc2Multi

Stability: Thermostability • TAPE_Stability

Solubility: DeepSol • DeepSoluE • eSOL • ProtSolM • PETA_CHS/LGK/TEM_Sol

Mutation: FLIP_AAV (7 splits) • FLIP_GB1 (5 splits) • TAPE_Fluorescence

Others: DeepET_Topt • MetalIonBinding • SortingSignal • PaCRISPR
All datasets are available on Hugging Face.
</details>

📦 Installation
<details> <summary><b>macOS (M1/M2/M3)</b></summary>

git clone https://github.com/AI4Protein/VenusFactory2.git && cd VenusFactory2
conda create -n venus python=3.12 && conda activate venus
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
pip install torch_scatter torch-sparse torch-geometric -f https://data.pyg.org/whl/torch-2.8.0+cpu.html
pip install -r requirements_for_macOS.txt
</details>
<details>
<summary><b>🪟 Windows / 🐧 Linux (CUDA 12.8)</b></summary>
git clone https://github.com/AI4Protein/VenusFactory2.git && cd VenusFactory2
conda create -n venus python=3.12 && conda activate venus
pip install torch==2.8.0 torchvision --index-url https://download.pytorch.org/whl/cu128
pip install torch_geometric
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.8.0+cu128.html
pip install -r requirements.txt
</details>
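After installing, a quick check that the core packages import and that PyTorch can see a GPU can save time before launching the Web UI. The script below is a hedged sketch: the package list is an assumption based on the install commands in this section, and the helper names are illustrative.

```python
import importlib.util
import sys

# Import names assumed from the pip commands in this section.
REQUIRED = ("torch", "torch_geometric", "torch_scatter")


def missing_packages(names=REQUIRED) -> list[str]:
    """Return the packages from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def cuda_summary() -> str:
    """Describe the installed torch build and whether a CUDA device is visible."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if torch.cuda.is_available():
        return (
            f"torch {torch.__version__}, CUDA {torch.version.cuda}, "
            f"device: {torch.cuda.get_device_name(0)}"
        )
    return f"torch {torch.__version__}, no CUDA device visible (CPU only)"


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    print("Missing packages:", ", ".join(missing_packages()) or "none")
    print(cuda_summary())
```

On macOS the CPU-only message is expected, since the macOS instructions install CPU wheels.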
<details>
<summary><b>🪟 Windows / 🐧 Linux (CUDA 11.8)</b></summary>
git clone ht
