
Chart2Code

We introduce Chart2Code, the first user-driven, hierarchical benchmark that systematically evaluates Large Multimodal Models on chart-to-code tasks of increasing difficulty.

Install / Use

/learn @CSU-JPG/Chart2Code

From Charts to Code: A Hierarchical Benchmark for Multimodal Models

Welcome to Chart2Code! If you find this repo useful, please give a star ⭐ for encouragement.

arXiv · Project Page · Code · Hugging Face Dataset

🌟 Overview

Chart2Code is a new benchmark designed to evaluate the chart-generation capabilities of LMMs under progressively challenging conditions.

[Figure 1: overview of the Chart2Code benchmark and its three levels.]

Chart2Code covers three progressively challenging levels: reproduction, editing, and long-table-to-chart generation.

  • Level 1 (Chart Reproduction): reproduce charts from a reference figure and user query.
  • Level 2 (Chart Editing): apply complex modifications such as changing chart types or adding elements.
  • Level 3 (Long-Table-to-Chart Generation): transform long, information-dense tables into faithful charts following user instructions.

More details about Chart2Code are available on the project page. 🌐
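To make the task shape concrete, here is a minimal sketch of the round trip all three levels share: the model emits matplotlib code, which is executed to render a candidate figure for scoring against the reference. The names query_model and compare_images are hypothetical placeholders; this is not the repo's actual harness, which handles sandboxing and scoring itself.

import matplotlib
matplotlib.use("Agg")  # render headlessly; no display required
import matplotlib.pyplot as plt

def render_chart(generated_code: str, out_path: str) -> None:
    """Execute model-generated matplotlib code and save the figure it draws.

    Caution: exec() runs arbitrary code; a real harness would sandbox this.
    """
    exec(generated_code, {"plt": plt})
    plt.savefig(out_path, dpi=150)
    plt.close("all")

# Hypothetical Level 1 (Chart Reproduction) usage:
# code = query_model(image="data/level1_direct/3d_1.png", prompt=user_query)
# render_chart(code, "outputs/3d_1_pred.png")
# score = compare_images("outputs/3d_1_pred.png", "data/level1_direct/3d_1.png")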

🚀 Quick Start

Here we provide a quick-start guide for evaluating LMMs on Chart2Code.

Setup Environment

git clone https://github.com/showlab/Chart2Code.git
cd Chart2Code
conda env create -f environment.yaml
# This environment can be used to evaluate all models.
# For inference, it supports all models except Qwen3 and Molmo.
# For Qwen3, install the environment from the official Qwen3-VL
# repository: https://github.com/QwenLM/Qwen3-VL.
# For Molmo, we recommend first installing TensorFlow with
# `python -m pip install "tensorflow==2.17.*"` before following
# the installation instructions in the official
# repository: https://github.com/allenai/molmo.
conda activate chart2code

Set your API keys and API base URLs in .env for the different LMMs. Claude, Gemini, and GPT are accessed through API proxy providers, while Seed is accessed through the ARK API.

OPENAI_API_KEY=${your_api_proxy_provider_key}
ARK_API_KEY=${your_ark_api_key}
OPENAI_API_URL=${your_api_proxy_provider_url}
ARK_BASE_URL=${your_ark_api_base_url}
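For illustration, these variables would typically be consumed as below: .env is loaded with python-dotenv and an OpenAI-compatible client is pointed at the proxy URL. This is a sketch of one plausible wiring, not necessarily how the repo's inference code constructs its clients.

import os
from dotenv import load_dotenv  # pip install python-dotenv
from openai import OpenAI       # pip install openai

load_dotenv()  # reads the keys and URLs from .env into the environment

# Assumption: Claude/Gemini/GPT sit behind an OpenAI-compatible proxy,
# so a single client with a custom base_url can reach all of them.
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ["OPENAI_API_URL"],
)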

Download Data

Download the Chart2Code data from Hugging Face and unzip it under the root directory.

wget https://huggingface.co/datasets/CSU-JPG/Chart2Code/resolve/main/data.zip
unzip data.zip
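If wget is unavailable, the same archive can be fetched and extracted programmatically; the snippet below is an equivalent alternative, not an extra required step.

from huggingface_hub import hf_hub_download  # pip install huggingface_hub
import zipfile

# Fetch data.zip from the dataset repo (the same file wget downloads above)
zip_path = hf_hub_download(
    repo_id="CSU-JPG/Chart2Code",
    filename="data.zip",
    repo_type="dataset",
)

with zipfile.ZipFile(zip_path) as zf:
    zf.extractall(".")  # unzip under the repo root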

The file structure should be like this:

├── data
│   ├── level1_direct
│   │   ├── 3d_1.png
│   │   ├── 3d_1.py
│   │   └── ...
│   ├── level1_figure
│   │   ├── fig1_density_2
│   │   └── ...
│   ├── level1_customize
│   │   ├── table_1_instruction_2.png
│   │   ├── table_1_instruction_2.py
│   │   ├── table_1_instruction_2_request.txt
│   │   ├── table_1_instruction_2_data.txt
│   │   └── ...
│   ├── level2
│   │   ├── bar_1_v1.png
│   │   ├── bar_1_v1.py
│   │   ├── bar_1_v1_data.txt
│   │   └── ...
│   ├── level3
│   │   ├── table_1.xlsx
│   │   ├── table1_1.png
│   │   ├── table1_1_generate.py
│   │   ├── table1_1.txt
│   │   ├── table1_1_generate.png
│   │   └── ...
│   ├── level1_direct.json
│   ├── level1_figure.json
│   ├── level1_customize.json
│   ├── level2.json
│   └── level3.json
├── Evaluation
└── ...
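As the tree shows, the artifacts of one sample share a filename stem, so a sample can be assembled from its sibling files. The helper below is illustrative only and not part of the repo.

from pathlib import Path

def load_level2_sample(root: Path, stem: str) -> dict:
    """Gather the files of one Level 2 sample, e.g. stem='bar_1_v1'."""
    d = root / "data" / "level2"
    return {
        "image": d / f"{stem}.png",                    # reference chart
        "code": (d / f"{stem}.py").read_text(),        # ground-truth script
        "data": (d / f"{stem}_data.txt").read_text(),  # underlying data table
    }

sample = load_level2_sample(Path("."), "bar_1_v1")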

Inference Setup

Inference for each benchmark level is handled by a dedicated shell script in the scripts/inference/ directory. You must specify a model for each run. You can do this in two ways:

  • Pass it as an argument (Recommended): Provide the MODEL_IDENTIFIER directly when executing the script.
  • Edit the script: Set the MODEL_IDENTIFIER variable inside the corresponding .sh file.

You can modify the LOAD_SOURCE parameter in the shell script to select how the model is loaded:

  • local: By default, the model will be loaded from the Inference/models directory.
  • hub: The model weights will be loaded directly from the Hugging Face Hub online.

You can also adjust other parameters like GPU_VISIBLE_DEVICES in the script to fit your hardware setup.
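As a rough illustration of the LOAD_SOURCE switch described above, the function below maps a MODEL_IDENTIFIER to a loadable location. The hub mapping shown is hypothetical; the real repo ids live in the scripts themselves.

from pathlib import Path

# Hypothetical identifier -> Hugging Face repo id mapping (one example entry)
HUB_REPOS = {
    "qwen2.5_direct_72B": "Qwen/Qwen2.5-VL-72B-Instruct",
}

def resolve_model(model_identifier: str, load_source: str = "local") -> str:
    """Resolve where model weights are loaded from (illustrative only)."""
    if load_source == "local":
        # default: weights expected under Inference/models/<identifier>
        return str(Path("Inference/models") / model_identifier)
    if load_source == "hub":
        return HUB_REPOS[model_identifier]
    raise ValueError(f"unknown LOAD_SOURCE: {load_source}")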

cd scripts/inference
# For level1_customize
bash inference_customize.sh qwen3_customize_30B
# For level1_direct
bash inference_direct.sh qwen2.5_direct_72B
# For level1_figure
bash inference_figure.sh InternVL_3.5_figure_38B
# For level2
bash inference_level2.sh deepseek_level2
# For level3
bash inference_level3.sh gpt_5_level3
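Because each script takes the MODEL_IDENTIFIER as its first argument, a full pass over all five levels can also be scripted. The pairs below reuse identifiers from the table that follows and assume you run from the repo root.

import subprocess

# (script, MODEL_IDENTIFIER) pairs, one per benchmark level
runs = [
    ("inference_customize.sh", "qwen2.5_customize_7B"),
    ("inference_direct.sh", "qwen2.5_direct_7B"),
    ("inference_figure.sh", "qwen2.5_figure_7B"),
    ("inference_level2.sh", "qwen2.5_level2_7B"),
    ("inference_level3.sh", "qwen2.5_level3_7B"),
]
for script, model in runs:
    subprocess.run(["bash", script, model], check=True, cwd="scripts/inference")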
Available Models

We now support the following models; each column gives the MODEL_IDENTIFIER to pass for that level.

| Model Name | level1_customize | level1_direct | level1_figure | level2 | level3 |
| --- | --- | --- | --- | --- | --- |
| InternVL-3.5-38B | InternVL_3.5_customize_38B | InternVL_3.5_direct_38B | InternVL_3.5_figure_38B | InternVL_3.5_level2_38B | InternVL_3.5_level3_38B |
| InternVL-3.5-8B | InternVL_3.5_customize_8B | InternVL_3.5_direct_8B | InternVL_3.5_figure_8B | InternVL_3.5_level2_8B | InternVL_3.5_level3_8B |
| InternVL-3-38B | InternVL_3_customize_38B | InternVL_3_direct_38B | InternVL_3_figure_38B | InternVL_3_level2_38B | InternVL_3_level3_38B |
| InternVL-3-8B | InternVL_3_customize_8B | InternVL_3_direct_8B | InternVL_3_figure_8B | InternVL_3_level2_8B | InternVL_3_level3_8B |
| InternVL-2.5-38B | InternVL_2.5_customize_38B | InternVL_2.5_direct_38B | InternVL_2.5_figure_38B | InternVL_2.5_level2_38B | InternVL_2.5_level3_38B |
| InternVL-2.5-8B | InternVL_2.5_customize_8B | InternVL_2.5_direct_8B | InternVL_2.5_figure_8B | InternVL_2.5_level2_8B | InternVL_2.5_level3_8B |
| Qwen3-VL-30B | qwen3_customize_30B | qwen3_direct_30B | qwen3_figure_30B | qwen3_level2_30B | qwen3_level3_30B |
| Qwen3-VL-30B-think | qwen3_customize_30B_think | qwen3_direct_30B_think | qwen3_figure_30B_think | qwen3_level2_30B_think | qwen3_level3_30B_think |
| Qwen2.5-VL-72B | qwen2.5_customize_72B | qwen2.5_direct_72B | qwen2.5_figure_72B | qwen2.5_level2_72B | qwen2.5_level3_72B |
| Qwen2.5-VL-7B | qwen2.5_customize_7B | qwen2.5_direct_7B | qwen2.5_figure_7B | qwen2.5_level2_7B | qwen2.5_level3_7B |
| Qwen2-VL-72B | qwen2_customize_72B | qwen2_direct_72B | qwen2_figure_72B | qwen2_level2_72B | qwen2_level3_72B |
| Qwen2-VL-7B | qwen2_customize_7B | qwen2_direct_7B | qwen2_figure_7B | qwen2_level2_7B | qwen2_level3_7B |
| MOLMO-7B-D | molmo_customize_7BD | molmo_direct_7BD | molmo_figure_7BD | molmo_level2_7BD | molmo_level3_7BD |
| MIMO-VL-7B-RL-think | mimo_RL_customize_think | mimo_RL_direct_think | mimo_RL_figure_think | mimo_RL_level2_think | mimo_RL_level3_think |
| MIMO-VL-7B-RL-nothink | mimo_RL_customize_nothink | mimo_RL_direct_nothink | mimo_RL_figure_nothink | mimo_RL_level2_nothink | mimo_RL_level3_nothink |
| MIMO-VL-7B-SFT-nothink | mimo_SFT_customize_nothink | mimo_SFT_direct_nothink | mimo_SFT_figure_nothink | mimo_SFT_level2_nothink | mimo_SFT_level3_nothink |
| MIMO-VL-7B-SFT-think | mimo_SFT_customize_think | mimo_SFT_direct_think | mimo_SFT_figure_think | mimo_SFT_level2_think | mimo_SFT_level3_think |
| LLaVA-OV-Qwen2-7B-OV | llava_ov_customize | llava_ov_direct | llava_ov_figure | liava_ov_level2 | llava_ov_level3 |
| LLaVA-OV-Qwen2-7B-SI | llava_si_customize | llava_si_direct | llava_si_figure | llava_si_level2 | llava_si_level3 |
| SEED-1.6-VL | seed_1.6_customize | seed_1.6_direct | seed_1.6_figure | seed_1.6_level2 | … |