ChemEagle

This is the official code of the paper "A Multi-Agent System Enables Versatile Information Extraction from the Chemical Literature"

:sparkles: Highlights

<p align="justify"> In this work, we present ChemEAGLE, a multimodal large language model (MLLM)-based multi-agent system that integrates diverse chemical information extraction tools to extract multimodal chemical reactions. By integrating ten expert-designed tools and seven chemical information extraction agents, ChemEAGLE not only processes individual modalities but also utilizes MLLMs' reasoning capabilities to unify extracted data, ensuring more accurate and comprehensive reaction representations. By bridging multimodal gaps, our approach significantly improves automated chemical knowledge extraction, facilitating more robust AI-driven chemical research.

<div align="center"> An example workflow of ChemEAGLE. Each agent handles a specific sub-task, from reaction template parsing and molecular recognition to SMILES reconstruction and condition role interpretation, ensuring accurate, structured chemical data integration. </div>

🧩 Agents Overview

| Agent Name | Category | Main Function |
| --- | --- | --- |
| Planner | Planning | Analyzes input, plans extraction steps, assigns sub-tasks to agents |
| Plan Observer | Validation | Monitors extraction workflow, ensures logical plan |
| Action Observer | Validation | Oversees agent actions, validates consistency and correctness |
| Reaction Template Parsing Agent | Extraction | Parses reaction templates, integrates R-group substitutions |
| Molecular Recognition Agent | Extraction | Detects and interprets all molecules in graphics |
| Structure-based Table R-group Substitution Agent | Extraction | Substitutes R-groups and reconstructs reactant SMILES from product variant structure-based tables |
| Text-based Table R-group Substitution Agent | Extraction | Substitutes R-groups and reconstructs SMILES from text-based tables |
| Condition Interpretation Agent | Extraction | Extracts and categorizes reaction conditions (solvent, temp, etc.) |
| Text Extraction Agent | Extraction | Extracts and aligns reaction info from associated texts |
| Data Structure Agent | Output | Compiles structured output for downstream applications |

🛠️ Toolkits Used in ChemEAGLE

| Tool Name | Category | Description |
| --- | --- | --- |
| TesseractOCR | Computer Vision | Optical character recognition for text in graphics |
| TableParser | Computer Vision | Table structure detection and parsing |
| MolDetector | Computer Vision | Locates and segments molecules within graphics |
| MolNER | Text-based Information Extraction | Chemical named entity recognition from text |
| ChemRxnExtractor | Text-based Information Extraction | Extracts chemical reactions and roles from text |
| Image2Graph | Molecular Recognition | Converts molecular sub-images to graph representations |
| Graph2SMILES | Molecular Recognition | Converts molecular graphs to SMILES strings |
| SMILESReconstructor | Molecular Recognition | Reconstructs reactant SMILES from product variants |
| RxnImgParser | Reaction Image Parsing | Parsing reaction template images into bounding boxes and components |
| RxnConInterpreter | Reaction Image Parsing | Assigns condition roles to extracted condition text |

:rocket: Using the code for ChemEAGLE

Using the code

Clone the repository:

git clone https://github.com/CYF2000127/ChemEagle

Option A: Using Azure OpenAI (Cloud-based)

First, create and activate a conda environment with the following commands on Linux, Windows, or macOS (Linux is recommended):

conda create -n chemeagle python=3.10
conda activate chemeagle

Then install requirements:

pip install -r requirements.txt

Download the necessary models and place them in the main path of the repository.
Set up your Azure OpenAI API key in your environment. Two detailed tutorials (Chinese Version, English Version) explain how to obtain the Azure OpenAI API key and endpoint. Remember to use the API key and the endpoint shown in Azure AI Studio.

export API_KEY=your-azure-openai-api-key
export AZURE_ENDPOINT=your-azure-endpoint
export API_VERSION=your-api-version
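
Because a missing variable usually surfaces only as an authentication error deep inside a run, it can help to check the environment first. The following is a minimal sketch, not part of ChemEAGLE: the variable names come from the export commands above, while `missing_env_vars` is a hypothetical helper introduced here for illustration.

```python
import os
from typing import List, Mapping

# Variable names taken from the export commands above.
REQUIRED_VARS = ("API_KEY", "AZURE_ENDPOINT", "API_VERSION")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in env.

    Hypothetical helper for illustration; ChemEAGLE does not ship it.
    """
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All Azure OpenAI variables are set.")
```

Run it in the same shell session in which the variables were exported, since `export` only affects the current shell and its child processes.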

Run the following code to extract machine-readable chemical data from chemical graphics:

from main import ChemEagle
image_path = './examples/1.png'
results = ChemEagle(image_path)
print(results)

Alternatively, run the following code to extract machine-readable chemical data from chemical literature (PDF files) directly:

import os
from main import ChemEagle
from pdf_extraction import run_pdf
pdf_path   = 'your/pdf/path'     # passed to run_pdf as pdf_dir
output_dir = 'your/output/dir'   # extracted images are written here
# Extract images from the PDF input into output_dir
run_pdf(pdf_dir=pdf_path, image_dir=output_dir)
results = []
# Process the extracted PNG images in a stable (sorted) order
for fname in sorted(os.listdir(output_dir)):
    if not fname.lower().endswith('.png'):
        continue
    img_path = os.path.join(output_dir, fname)
    try:
        r = ChemEagle(img_path)
        r['image_name'] = fname
        results.append(r)
    except Exception as e:
        # Record the failure and continue with the remaining images
        results.append({'image_name': fname, 'error': str(e)})
print(results)
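
The loop above stores a failure as a dictionary with an `error` key next to successful outputs, so the two kinds of entry are mixed in `results`. A minimal sketch of how they could be separated afterwards is shown below. `split_results` is a hypothetical helper, the sample entries are made up for illustration (the `reactions` key in particular is an assumption, not ChemEAGLE's documented output schema), and the approach assumes a successful result never contains a top-level `error` key of its own.

```python
from typing import Any, Dict, List, Tuple

Result = Dict[str, Any]


def split_results(results: List[Result]) -> Tuple[List[Result], List[Result]]:
    """Split loop output into (succeeded, failed) by the presence of 'error'."""
    succeeded = [r for r in results if "error" not in r]
    failed = [r for r in results if "error" in r]
    return succeeded, failed


if __name__ == "__main__":
    # Made-up entries shaped like the loop output above.
    sample = [
        {"image_name": "page1_fig1.png", "reactions": []},
        {"image_name": "page2_fig1.png", "error": "timeout"},
    ]
    ok, bad = split_results(sample)
    print(f"{len(ok)} succeeded, {len(bad)} failed")
    for entry in bad:
        print(f"  {entry['image_name']}: {entry['error']}")
```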

Option B: Using ChemEagle_OS (Local Deployment with vLLM)

ChemEagle_OS is an open-source version that runs locally using vLLM, eliminating the need for cloud API keys.

Prerequisites

NVIDIA GPU with CUDA support (recommended)
Docker installed (for Windows vLLM deployment)
Download the Qwen3-VL series model weights from HuggingFace (we provide Qwen3-VL-32B-Instruct-AWQ).

Setup Python Environment

conda create -n chemeagle python=3.10
conda activate chemeagle
pip install -r requirements.txt

Download the necessary models and place them in the main path of the repository.

Deploy vLLM Server

For Linux:

pip install vllm
vllm serve /path/to/Qwen3-VL-32B-Instruct-AWQ \
    --port 8000 \
    --trust-remote-code \
    --enable-auto-tool-choice \
    --tool-call-parser hermes \
    --max-model-len 27200 \
    --limit-mm-per-prompt video=0

For Windows (PowerShell):

docker run -d --gpus all `
    -p 8000:8000 `
    -v /path/to/Qwen3-VL-32B-Instruct-AWQ:/models/Qwen3-VL-32B-Instruct-AWQ `
    --name vllm-server `
    vllm/vllm-openai:latest `
    --model /models/Qwen3-VL-32B-Instruct-AWQ `
    --port 8000 `
    --trust-remote-code `
    --enable-auto-tool-choice `
    --tool-call-parser hermes `
    --max-model-len 27200 `
    --limit-mm-per-prompt.video 0

Note:

Replace /path/to/Qwen3-VL-32B-Instruct-AWQ with your actual model path.
The vLLM server will be available at http://localhost:8000/v1 by default.
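
Loading a 32B model can take a while, so it may be worth confirming that the server responds before running an extraction. The sketch below is an illustration rather than part of ChemEAGLE: it queries the `/v1/models` route of vLLM's OpenAI-compatible API using only the standard library. `models_url` and `server_is_up` are hypothetical helper names, and the default base URL is the one given in the note above.

```python
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL from an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def server_is_up(base_url: str = "http://localhost:8000/v1",
                 timeout: float = 5.0) -> bool:
    """Return True if the server answers the model-listing request with HTTP 200."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    if server_is_up():
        print("vLLM server is reachable.")
    else:
        print("vLLM server is not reachable yet.")
```

If the check fails right after startup, the model may still be loading; retrying after a short wait is usually enough.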

Once the vLLM server is running, you can use the open-source version of ChemEAGLE as follows:

from main import ChemEagle_OS

# Using default local vLLM server (http://localhost:8000/v1)
image_path = './examples/1.png'
results = ChemEagle_OS(image_path)
print(results)

Alternatively, run the following code to extract machine-readable chemical data from chemical literature (PDF files) directly:

import os
from main import ChemEagle_OS
from pdf_extraction import run_pdf
pdf_path   = 'your/pdf/path'     # passed to run_pdf as pdf_dir
output_dir = 'your/output/dir'   # extracted images are written here
# Extract images from the PDF input into output_dir
run_pdf(pdf_dir=pdf_path, image_dir=output_dir)
results = []
# Process the extracted PNG images in a stable (sorted) order
for fname in sorted(os.listdir(output_dir)):
    if not fname.lower().endswith('.png'):
        continue
    img_path = os.path.join(output_dir, fname)
    try:
        r = ChemEagle_OS(img_path)
        r['image_name'] = fname
        results.append(r)
    except Exception as e:
        # Record the failure and continue with the remaining images
        results.append({'image_name': fname, 'error': str(e)})
print(results)
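
Printing `results` is enough for a quick look, but a long local run is easier to reuse if the output is written to disk. The following sketch assumes `results` is a list of dictionaries, as built by the loop above. `save_results` and the output file name are hypothetical, and `default=str` is a fallback that stringifies any value the `json` module cannot serialize natively.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Union


def save_results(results: List[Dict[str, Any]],
                 out_path: Union[str, Path]) -> Path:
    """Write the results list to a UTF-8 JSON file and return its path."""
    path = Path(out_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fh:
        # default=str stringifies values that json cannot encode directly.
        json.dump(results, fh, indent=2, ensure_ascii=False, default=str)
    return path


if __name__ == "__main__":
    # Made-up entry shaped like the loop output above.
    demo = [{"image_name": "page1_fig1.png", "error": "example failure"}]
    print(save_results(demo, "chemeagle_results.json"))
```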

Benchmarking

Benchmark datasets, predictions, and ground truth can be found in our Hugging Face repo.

🤗 Chemical information extraction using ChemEAGLE.Web

Go to our ChemEAGLE.Web app demo to use the tool online with either image or PDF input. Feel free to send us feedback, too! (Note: The demo runs on the HPC4.ust.hk server, which has a maximum uptime of 3 days and is restarted for maintenance every three days. If the site is temporarily unavailable, please wait a moment and try again.)

When the input is a multimodal chemical reaction graphic: ![visualiz
