Text2CAD
[NeurIPS'24 Spotlight] Text2CAD: Generating Sequential CAD Designs from Beginner-to-Expert Level Text Prompts
Mohammad Sadil Khan* · Sankalp Sinha* · Talha Uddin Sheikh · Didier Stricker · Sk Aziz Ali · Muhammad Zeshan Afzal
*equal contributions
<h2> NeurIPS 2024 (Spotlight 🤩) </h2>
<a href="https://arxiv.org/abs/2409.17106"> <img src="https://img.shields.io/badge/Arxiv-3498db?style=for-the-badge&logoWidth=40&logoColor=white&labelColor=2c3e50&borderRadius=10" alt="Arxiv" /> </a>
<a href="https://sadilkhan.github.io/text2cad-project/"> <img src="https://img.shields.io/badge/Project-2ecc71?style=for-the-badge&logoWidth=40&logoColor=white&labelColor=27ae60&borderRadius=10" alt="Project" /> </a>
<a href="https://huggingface.co/datasets/SadilKhan/Text2CAD"> <img src="https://img.shields.io/badge/Dataset-7D5BA6?style=for-the-badge&logoWidth=40&logoColor=white&labelColor=27ae60&borderRadius=10" alt="Dataset" /> </a>

⚙️ Installation
🌍 Environment
- 🐧 Linux
- 🐍 Python >=3.9
📦 Dependencies
$ conda env create --file environment.yml
✅ Todo List
- [x] Release Data Preparation Code
- [x] Release Training Code
- [x] Release Inference Code
📊 Data Preparation
Download the DeepCAD data from here.
Generate Vector Representation from DeepCAD JSON
You can also download the processed cad vec from here.
$ cd CadSeqProc
$ python3 json2vec.py --input_dir $DEEPCAD_JSON --split_json $TRAIN_TEST_VAL_JSON --output_dir $OUTPUT_DIR --max_workers $WORKERS --padding --deduplicate
Download the text annotations from here. Download the preprocessed training and validation data and place them in the Cad_VLM/dataprep folder.
🚀 Training
In Cad_VLM/config/trainer.yaml, provide the following paths.
- cache_dir: The directory for model weights downloaded from Hugging Face.
- cad_seq_dir: The root directory that contains the ground-truth CAD vectors.
- prompt_path: Path to the text annotations.
- split_filepath: JSON file containing the UIDs for the train, test, and validation splits.
- log_dir: Directory for saving logs, outputs, and checkpoints.
- checkpoint_path (optional): For resuming training from a saved checkpoint.
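Put together, the keys above might look like the following sketch of Cad_VLM/config/trainer.yaml. Only the key names come from this README; every path below is a placeholder to substitute with your own locations, and any other keys the real config file contains are untouched.

```yaml
cache_dir: /path/to/hf_cache            # Hugging Face model weights cache
cad_seq_dir: /path/to/cad_vec           # ground-truth CAD vectors
prompt_path: /path/to/text_annotations  # text annotations
split_filepath: /path/to/train_test_val.json  # train/test/val UIDs
log_dir: ./logs                         # logs, outputs, checkpoints
# checkpoint_path: ./logs/checkpoints/last.ckpt  # optional: resume training
```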
$ cd Cad_VLM
$ python3 train.py --config_path config/trainer.yaml
🤖 Inference
For Test Dataset
In Cad_VLM/config/inference.yaml, provide the following paths. Download the checkpoint for v1.0 here.
- cache_dir: The directory for model weights downloaded from Hugging Face.
- cad_seq_dir: The root directory that contains the ground-truth CAD vectors.
- prompt_path: Path to the text annotations.
- split_filepath: JSON file containing the UIDs for the train, test, and validation splits.
- log_dir: Directory for saving logs, outputs, and checkpoints.
- checkpoint_path: The path to the model weights.
$ cd Cad_VLM
$ python3 test.py --config_path config/inference.yaml
Run Evaluation
$ cd Evaluation
$ python3 eval_seq.py --input_path ./output.pkl --output_dir ./output
For Random Text Prompts
In Cad_VLM/config/inference_user_input.yaml, provide the following paths.
- cache_dir: The directory for model weights downloaded from Hugging Face.
- log_dir: Directory for saving logs, outputs, and checkpoints.
- checkpoint_path: The path to the model weights.
- prompt_file (optional): Ignore it for a single prompt; for multiple prompts, provide a txt file.
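A minimal sketch of Cad_VLM/config/inference_user_input.yaml with those keys filled in. The key names are from this README; the paths and the checkpoint filename are placeholders, not the actual release filenames.

```yaml
cache_dir: /path/to/hf_cache                  # Hugging Face model weights cache
log_dir: ./logs                               # logs, outputs, checkpoints
checkpoint_path: /path/to/text2cad_checkpoint.pth  # downloaded model weights
# prompt_file: ./prompts.txt                  # optional: only for multiple prompts
```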
For a single prompt
$ cd Cad_VLM
$ python3 test_user_input.py --config_path config/inference_user_input.yaml --prompt "A rectangular prism with a hole in the middle."
For multiple prompts
$ cd Cad_VLM
$ python3 test_user_input.py --config_path config/inference_user_input.yaml
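For the multiple-prompt case, a hypothetical prompt file might be created as below. This assumes one prompt per line (the README does not specify the file format); point prompt_file in inference_user_input.yaml at the resulting file.

```shell
# Write two example prompts, one per line, into prompts.txt.
cat > prompts.txt <<'EOF'
A rectangular prism with a hole in the middle.
A cylindrical rod with a circular base.
EOF
```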
💻 Run Demo
In Cad_VLM/config/inference_user_input.yaml, provide the following paths.
- cache_dir: The directory for model weights downloaded from Hugging Face.
- log_dir: Directory for saving logs, outputs, and checkpoints.
- checkpoint_path: The path to the model weights.
$ cd App
$ gradio app.py
👥 Contributors
Our project owes its success to the invaluable contributions of these remarkable individuals. We extend our heartfelt gratitude for their dedication and support.
<a href="https://scholar.google.com/citations?hl=en&authuser=1&user=QYcfOjEAAAAJ"> <img src="https://av.dfki.de/wp-content/uploads/avatars/162/1722545138-bpfull.png" width="50" height="50" style="border-radius: 50%;"> </a>
<a href="https://github.com/saali14"> <img src="https://github.com/saali14.png" width="50" height="50" style="border-radius: 50%;"> </a>
<a href="https://scholar.google.de/citations?user=yW7VfAgAAAAJ&hl=en"> <img src="https://scholar.google.de/citations/images/avatar_scholar_128.png" width="50" height="50" style="border-radius: 50%;"> </a>

✍🏻 Acknowledgement
We thank the authors of DeepCAD and SkexGen and acknowledge the use of their code.
📜 Citation
If you use this dataset in your work, please consider citing the following publication.
@inproceedings{text2cad,
author = {Khan, Mohammad Sadil and Sinha, Sankalp and Sheikh, Talha Uddin and Stricker, Didier and Ali, Sk Aziz and Afzal, Muhammad Zeshan},
booktitle = {Advances in Neural Information Processing Systems},
editor = {A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang},
pages = {7552--7579},
publisher = {Curran Associates, Inc.},
title = {Text2CAD: Generating Sequential CAD Designs from Beginner-to-Expert Level Text Prompts},
url = {https://proceedings.neurips.cc/paper_files/paper/2024/file/0e5b96f97c1813bb75f6c28532c2ecc7-Paper-Conference.pdf},
volume = {37},
year = {2024},
}