
VenusFactory2

๐Ÿญ AI agent platform with skills for protein engineering, the noob-friendly AI tutorial tool for life science professionals.


<div align="right"> <a href="README.md">English</a> | <a href="README_CN.md">简体中文</a> </div> <p align="center"> <img src="img/banner_2503.png" width="70%" alt="VenusFactory2 Banner"> </p> <div align="center">


🤖 Agent-Driven Protein Engineering Platform

One platform, three interfaces, infinite possibilities

</div>

🌟 Recent News

<details> <summary>📨 Join our WeChat Group / 📝 Share Your Feedback</summary> <p align="center"> <img src="img/wechat.png" width="60%" alt="WeChat Group"> </p> </details>

🎯 What is VenusFactory2?

VenusFactory2 is an agent-driven protein engineering platform that combines 40+ AI models with 11 biological databases. It is designed for everyone, from biologists to AI researchers.

<p align="center"> <img src="https://img.shields.io/badge/🤖_Agent_Driven-Core-FF6B6B?style=for-the-badge"> <img src="https://img.shields.io/badge/Models-40+-4ECDC4?style=for-the-badge"> <img src="https://img.shields.io/badge/Databases-11+-95E1D3?style=for-the-badge"> <img src="https://img.shields.io/badge/Tools-9_Categories-F38181?style=for-the-badge"> </p>

🚀 Why VenusFactory2?

| 🤖 Agent-First | 🎯 Three Interfaces | ⚡ Zero to Results |
|:------------------:|:----------------------:|:---------------------:|
| Natural language → Multi-step automation | Web UI / REST API / CLI | Upload → Predict in seconds |
| 40+ models + 11 databases | Same power, different styles | Or train custom models in minutes |

📖 Easy to Learn: Designed for life science professionals; no programming background is required. An intuitive Web UI, comprehensive bilingual documentation, rich examples, and video tutorials help you grow quickly from beginner to protein AI expert.

💡 Capabilities at a Glance

| Task | Solution | Time |
|:-----|:---------|:-----|
| 🧬 Mutation effects | ESM-2, ProSST, ProtSSN (zero-shot) | <1 min |
| 🎯 Protein function | 30+ fine-tuned models | <30 sec |
| 🔬 Custom training | 7 PEFT methods (LoRA, QLoRA, etc.) | 10-60 min |
| 💾 Data download | AlphaFold, UniProt, RCSB, KEGG, etc. | Real-time |
| 📚 Literature | AI-powered search & analysis | <2 min |
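To make the "zero-shot" mutation row concrete: a common way to score a substitution with a masked protein language model is the log-odds ratio between the mutant and wild-type residue at the mutated position. The sketch below shows only that arithmetic on toy logits. Whether VenusFactory2 uses exactly this scheme is an assumption, and `score_mutation`, `AMINO_ACIDS`, and the toy values are hypothetical names and data, not part of the project.

```python
import math

# Residue order used to index the toy logits (an assumption for this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_z for x in logits]

def score_mutation(position_logits, wild_type, mutant):
    """Log-odds score: log p(mutant) - log p(wild type) at one position.

    position_logits holds 20 floats, one per residue in AMINO_ACIDS order,
    as a masked language model might emit for the mutated position.
    """
    log_probs = log_softmax(position_logits)
    return log_probs[AMINO_ACIDS.index(mutant)] - log_probs[AMINO_ACIDS.index(wild_type)]

# Toy logits, not real model output: the model "prefers" V over A here.
toy_logits = [0.0] * 20
toy_logits[AMINO_ACIDS.index("V")] = 2.0
toy_logits[AMINO_ACIDS.index("A")] = 1.0

score = score_mutation(toy_logits, wild_type="A", mutant="V")
print(round(score, 3))  # positive means the mutant is more likely than wild type
```

A positive score means the model assigns the mutant residue a higher probability than the wild type. No training on labeled fitness data is involved, which is what makes the approach zero-shot.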


⚡ Quick Start

1. Install

git clone https://github.com/AI4Protein/VenusFactory2.git && cd VenusFactory2
conda create -n venus python=3.12 && conda activate venus
pip install -r requirements.txt  # Detailed guide below ↓

2. Launch

# Web UI (Recommended)
python src/webui.py --mode web  # → http://localhost:7860

# REST API
python src/api_server.py  # → http://localhost:5000/docs

# CLI
bash script/train/train_plm_lora.sh

3. Get Results

<details> <summary><b>🤖 Try Agent-0.1 | ⚡ Quick Tools | 🔬 Train Models</b> (Click to expand examples)</summary>

Agent-0.1 (Natural Language)

Q: "Predict stability for sequence MKTAYIAKQRQISFV..."
→ Agent auto-selects model → Runs prediction → Returns results + explanations

Quick Mutation Scoring

Upload: PDB/FASTA → Mutations: A23V, K45R → Get: Stability scores
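Mutation strings such as `A23V` read as wild-type residue, 1-based position, mutant residue. The sketch below parses that notation and applies it to a sequence, checking that the stated wild-type residue matches. It is an illustration of the notation only. `parse_mutations` and `apply_mutations` are hypothetical helpers, not VenusFactory2 functions, and the toy sequence is the visible prefix from the Agent-0.1 example.

```python
import re
from typing import List, NamedTuple

class Mutation(NamedTuple):
    wild_type: str
    position: int  # 1-based residue index
    mutant: str

_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"
_MUTATION_RE = re.compile(rf"^([{_RESIDUES}])(\d+)([{_RESIDUES}])$")

def parse_mutations(text: str) -> List[Mutation]:
    """Parse a comma-separated list such as 'A23V, K45R'."""
    mutations = []
    for token in text.split(","):
        token = token.strip()
        match = _MUTATION_RE.match(token)
        if match is None:
            raise ValueError(f"Not a valid mutation string: {token!r}")
        wild_type, position, mutant = match.groups()
        mutations.append(Mutation(wild_type, int(position), mutant))
    return mutations

def apply_mutations(sequence: str, mutations: List[Mutation]) -> str:
    """Return the mutated sequence, verifying each wild-type residue first."""
    residues = list(sequence)
    for m in mutations:
        if not 1 <= m.position <= len(residues):
            raise ValueError(f"Position {m.position} is outside the sequence")
        if residues[m.position - 1] != m.wild_type:
            raise ValueError(
                f"Expected {m.wild_type} at position {m.position}, "
                f"found {residues[m.position - 1]}"
            )
        residues[m.position - 1] = m.mutant
    return "".join(residues)

# Toy input: only the visible prefix of the example sequence.
mutated = apply_mutations("MKTAYIAKQRQISFV", parse_mutations("A4V, K8R"))
print(mutated)  # MKTVYIARQRQISFV
```

Checking the wild-type residue catches off-by-one numbering errors before any model is run.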

Train Your Model

Model: ESM2-650M → Dataset: DeepSol → Method: LoRA → 15 min → Trained model ✓
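Part of why LoRA training can finish quickly is that it freezes each adapted weight matrix and learns only a low-rank update, so few parameters are trained. The sketch below does that parameter arithmetic for a single square projection. The hidden size of 1280 matches the published ESM2-650M embedding dimension, while the rank of 8 and the choice of matrix are assumptions made for illustration, not VenusFactory2 defaults.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out

def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter.

    The update is B @ A, where A has shape (rank, d_in) and B has shape
    (d_out, rank); only A and B are trained, the original matrix is frozen.
    """
    return rank * (d_in + d_out)

hidden = 1280  # ESM2-650M embedding dimension
rank = 8       # assumed LoRA rank for this illustration

dense = full_params(hidden, hidden)
adapter = lora_trainable_params(hidden, hidden, rank)
ratio = adapter / dense

print(f"dense: {dense:,}  adapter: {adapter:,}  ratio: {ratio:.2%}")
# dense: 1,638,400  adapter: 20,480  ratio: 1.25%
```

Under these assumptions each adapted matrix trains about 1.25% of the parameters that full fine-tuning would update.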
</details> <p align="center"> <video width="70%" controls> <source src="./img/venusfactory.mp4" type="video/mp4"> </video> </p>

🤖 Agent-0.1: The Brain

Agent-0.1 orchestrates all tools via natural language. Powered by LangGraph + LangChain.

You: "Design thermostable mutations for PDB:1ABC"
         ↓
    🤖 Agent Planning
         ↓
  📥 Download → 🧬 Predict → 🎯 Score → 📊 Report
  RCSB PDB     ESM-2 scan    Stability   Ranked list
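The Download → Predict → Score → Report flow above can be pictured as a chain of steps that each read and extend a shared state. The sketch below uses plain Python rather than LangGraph, and every step is a stand-in: the function names, the hard-coded sequence, and the position-based scores are hypothetical placeholders, not real predictions or VenusFactory2 code.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Step = Callable[[State], State]

def download(state: State) -> State:
    # Stand-in for fetching a structure; returns a hard-coded toy sequence.
    return {**state, "sequence": "MKTAYIAKQR"}

def predict(state: State) -> State:
    # Stand-in for a model scan: propose A->V at every alanine position.
    candidates = [
        f"A{i}V" for i, residue in enumerate(state["sequence"], start=1) if residue == "A"
    ]
    return {**state, "candidates": candidates}

def score(state: State) -> State:
    # Dummy score equal to the residue position. Not a stability prediction.
    return {**state, "scores": {m: float(m[1:-1]) for m in state["candidates"]}}

def report(state: State) -> State:
    # Rank candidates from highest to lowest score.
    ranked = sorted(state["scores"].items(), key=lambda item: item[1], reverse=True)
    return {**state, "report": ranked}

def run_pipeline(steps: List[Step], state: State) -> State:
    """Run each step in order, passing the accumulated state along."""
    for step in steps:
        state = step(state)
    return state

result = run_pipeline([download, predict, score, report], {"pdb_id": "1ABC"})
print(result["report"])  # [('A7V', 7.0), ('A4V', 4.0)]
```

In an agent framework the planner would choose and order the steps at run time. Here the order is fixed to keep the data flow visible.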
<details> <summary><b>✨ Agent Capabilities</b></summary>

| Category | Features |
|:---------|:---------|
| 🔬 Analysis | Mutation prediction • Function/stability scoring • Structure analysis |
| 💾 Data | Multi-database search • Format conversion • Batch processing |
| 🧠 Planning | Multi-step automation • Tool orchestration • Error handling |
| 📚 Research | Literature mining • Family analysis • Report generation |

</details> <details> <summary><b>💬 Example Conversations</b></summary>

Mutation Design:

You: "Improve thermostability of MKTAYIAKQR..."
Agent: ✓ ESM-2 scanning... ✓ Stability scoring...
→ Top 3: A5V (+2.8 kcal/mol), K9R (+1.9), T2S (+1.5)

Database Search:

You: "Find lysozyme structures <2.0Å resolution"
Agent: ✓ Searching RCSB... → Found 47 structures
→ Downloaded to: temp_outputs/lysozyme_structures/
</details>
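For the database search conversation, a request like "lysozyme structures under 2.0 Å" can be expressed as a structured query. The sketch below builds, without sending, a JSON payload in the shape I understand the public RCSB Search API to accept: a full-text term combined with a resolution filter. Whether Agent-0.1 calls this API, and the exact attribute and operator names, are assumptions to verify against the RCSB documentation. `build_query` is a hypothetical helper.

```python
import json

# Endpoint of the public RCSB Search API (shown for context; no request is sent).
RCSB_SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"

def build_query(keyword: str, max_resolution: float) -> dict:
    """Combine a full-text keyword with an upper bound on resolution (in Å)."""
    return {
        "query": {
            "type": "group",
            "logical_operator": "and",
            "nodes": [
                {
                    "type": "terminal",
                    "service": "full_text",
                    "parameters": {"value": keyword},
                },
                {
                    "type": "terminal",
                    "service": "text",
                    "parameters": {
                        "attribute": "rcsb_entry_info.resolution_combined",
                        "operator": "less",
                        "value": max_resolution,
                    },
                },
            ],
        },
        "return_type": "entry",
    }

payload = build_query("lysozyme", 2.0)
print(json.dumps(payload, indent=2))
```

Posting this payload to `RCSB_SEARCH_URL` with an HTTP client should return matching entry IDs, which a download step could then fetch one by one.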

💡 Note: Agent-0.1 requires an API key (OpenAI/Anthropic) and is currently in Beta.


๐Ÿ—๏ธ Architecture

๐ŸŒ Interfaces: Web UI | REST API | CLI
        โ†“
   ๐Ÿค– Agent Layer (LangGraph + LangChain)
        โ†“
   ๐Ÿ”ง Application: Train | Eval | Predict | Tools
        โ†“
   ๐Ÿ› ๏ธ Core Tools: 9 categories (mutation, database, search, etc.)
        โ†“
   ๐Ÿ“Š Resources: 40+ Models | 30+ Datasets | 11+ Databases
<details> <summary><b>๐Ÿ“š Integrated Resources</b></summary>

Models (40+): ESM, ProtBert, ProtT5, Venus/PETA/ProSST series

Databases (11+): AlphaFold • RCSB PDB • UniProt • NCBI • KEGG • STRING • BRENDA • ChEMBL • HPA • FDA • Foldseek

Datasets (30+): Function • Localization • Stability • Solubility • Mutation fitness

</details> <details> <summary><b>🔧 Tool Categories</b></summary>

| Tool | Description | Agent | CLI |
|:-----|:------------|:-----:|:---:|
| 🧬 Mutation | ESM-1v, ESM-2, ProSST, ProtSSN, MIF-ST | ✅ | ✅ |
| 🎯 Prediction | 30+ fine-tuned models | ✅ | ✅ |
| 💾 Database | 11 integrations | ✅ | ✅ |
| 🔍 Search | PubMed, FDA, patents | ✅ | ✅ |
| 🏋️ Training | LoRA, QLoRA, DoRA, etc. | ✅ | ✅ |
| 📁 File | Format conversion | ✅ | ✅ |
| 🔬 Denovo | Protein design | ✅ | ✅ |
| 🧪 Discovery | Novel discovery | ✅ | ✅ |
| 📊 Visualize | 3D viewer | ✅ | ✅ |

</details>

🧬 Supported Models

<details> <summary><b>40+ Protein Language Models</b> (Click to expand)</summary>

Venus Series (Liang's Lab): ProSST-20/128/512/1024/2048/4096 (110M) • ProPrime-690M • VenusPLM-300M • PETA-base/bpe/unigram (80M)

ESM Series (Meta AI): ESM2: 8M, 35M, 150M, 650M, 3B, 15B • ESM-1v: 5 models (650M each)

ProtBert & ProtT5: ProtBert-Uniref100/BFD (420M) • IgBert (420M) • ProtT5-XL/XXL (3B-11B) • Ankh-base/large (450M-1.2B)

Selection Guide:

  • GPU <8GB: ESM2-8M/35M, ProSST
  • GPU 8-16GB: ESM2-150M/650M, ProtBert
  • GPU 24GB+: ESM2-3B, ProtT5-XL
  • Multi-GPU: ESM2-15B, ProtT5-XXL
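The selection guide above maps directly onto a small lookup. The sketch below encodes the four tiers as written. The guide does not cover cards between 16 GB and 24 GB, so the function assumes they fall back to the 8-16GB tier, and it assumes any multi-GPU setup qualifies for the largest models. `recommend_models` is a hypothetical helper, not part of VenusFactory2.

```python
from typing import List

def recommend_models(vram_gb: float, num_gpus: int = 1) -> List[str]:
    """Return the model tier from the selection guide for a given GPU setup."""
    if num_gpus > 1:
        # Assumption: any multi-GPU machine can shard the largest models.
        return ["ESM2-15B", "ProtT5-XXL"]
    if vram_gb >= 24:
        return ["ESM2-3B", "ProtT5-XL"]
    if vram_gb >= 8:
        # Assumption: 16-24 GB cards use this tier, since the guide leaves a gap.
        return ["ESM2-150M", "ESM2-650M", "ProtBert"]
    return ["ESM2-8M", "ESM2-35M", "ProSST"]

for vram in (6, 12, 24):
    print(f"{vram} GB -> {', '.join(recommend_models(vram))}")
```

Actual memory use also depends on sequence length, batch size, and the training method, so treat the tiers as starting points.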

By Task:

  • Classification: ESM2, ProtBert
  • Structure: Ankh
  • Generation: ProtT5
  • Antibody: IgBert/IgT5
  • Lightweight: ProSST, PETA
</details>

📚 Supported Datasets

<details> <summary><b>30+ Supervised + Zero-Shot Datasets</b></summary>

Zero-Shot: VenusMutHub • ProteinGym (217 DMS)

Function: EC • GO_BP • GO_CC • GO_MF

Localization: DeepLocBinary • DeepLocMulti • DeepLoc2Multi

Stability: Thermostability • TAPE_Stability

Solubility: DeepSol • DeepSoluE • eSOL • ProtSolM • PETA_CHS/LGK/TEM_Sol

Mutation: FLIP_AAV (7 splits) • FLIP_GB1 (5 splits) • TAPE_Fluorescence

Others: DeepET_Topt • MetalIonBinding • SortingSignal • PaCRISPR

All datasets available on HuggingFace

</details>

📦 Installation

<details> <summary><b>🍎 macOS (M1/M2/M3)</b></summary>
git clone https://github.com/AI4Protein/VenusFactory2.git && cd VenusFactory2
conda create -n venus python=3.12 && conda activate venus
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
pip install torch_scatter torch-sparse torch-geometric -f https://data.pyg.org/whl/torch-2.8.0+cpu.html
pip install -r requirements_for_macOS.txt
</details> <details> <summary><b>🪟 Windows / 🐧 Linux (CUDA 12.8)</b></summary>
git clone https://github.com/AI4Protein/VenusFactory2.git && cd VenusFactory2
conda create -n venus python=3.12 && conda activate venus
pip install torch==2.8.0 torchvision --index-url https://download.pytorch.org/whl/cu128
pip install torch_geometric
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.8.0+cu128.html
pip install -r requirements.txt
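
After the install commands finish, a quick check that the packages are importable can save a failed first launch. The sketch below uses only the standard library to test whether each module name resolves. The list mirrors the packages named in the commands above, and `check_packages` is a hypothetical helper rather than a script shipped with the repository.

```python
import importlib.util
from typing import Dict, List

# Import names of the packages installed by the commands above.
PACKAGES = [
    "torch",
    "torch_geometric",
    "pyg_lib",
    "torch_scatter",
    "torch_sparse",
    "torch_cluster",
    "torch_spline_conv",
]

def check_packages(names: List[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

for name, found in check_packages(PACKAGES).items():
    print(f"{name:20s} {'ok' if found else 'MISSING'}")
```

This confirms only that the modules can be located. It does not verify that the installed CUDA build matches your driver.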
</details> <details> <summary><b>🪟 Windows / 🐧 Linux (CUDA 11.8)</b></summary>
git clone ht