
EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence

📖 Documentation · GitHub · 📄 arXiv · 🎥 Video · 中文介绍 (Chinese introduction)

<!-- [![🌐 Project Page](https://img.shields.io/badge/🌐-Project_Page-blue)](https://horizonrobotics.github.io/robot_lab/embodied_gen/index.html) -->


EmbodiedGen is a generative engine that creates diverse, interactive 3D worlds composed of high-quality 3D assets (mesh & 3DGS) with plausible physics, leveraging generative AI to address the generalization challenges of embodied-intelligence research. It is composed of six key modules: Image-to-3D, Text-to-3D, Texture Generation, Articulated Object Generation, Scene Generation, and Layout Generation.

<img src="docs/assets/overall.jpg" alt="Overall Framework" width="700"/>

✨ Table of Contents

📖 Documentation Follow the documentation to get started!

💬 Feedback Wanted: How Do You Use EmbodiedGen & What’s Missing?

🚀 Quick Start

📖 Documentation

✅ Setup Environment

git clone https://github.com/HorizonRobotics/EmbodiedGen.git
cd EmbodiedGen
git checkout v0.1.7
git submodule update --init --recursive --progress
conda create -n embodiedgen python=3.10.13 -y # recommended to use a new env.
conda activate embodiedgen
bash install.sh basic # around 20 mins
# Optional: `bash install.sh extra` for scene3d-cli

✅ Starting from Docker

We provide a pre-built Docker image on Docker Hub with a pre-configured environment for your convenience. For more details, please refer to the Docker documentation.

Note: Model checkpoints are not included in the image; they are downloaded automatically on first run. You still need to set up the GPT Agent manually.

IMAGE=wangxinjie/embodiedgen:env_v0.1.x
CONTAINER=EmbodiedGen-docker-${USER}
docker pull ${IMAGE}
docker run -itd --shm-size="64g" --gpus all --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --privileged --net=host --name ${CONTAINER} ${IMAGE}
docker exec -it ${CONTAINER} bash

✅ Setup GPT Agent

Update the API key in file: embodied_gen/utils/gpt_config.yaml.

You can choose between two backends for the GPT agent:

  • gpt-4o (Recommended) – Use this if you have access to Azure OpenAI.
  • qwen2.5-vl – A free alternative via OpenRouter: apply for a free key here, then update api_key in embodied_gen/utils/gpt_config.yaml (50 free requests per day).
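The exact schema of gpt_config.yaml is defined in the repo itself; as an illustration only (the field names below are assumptions, not the actual keys), a backend-selection config of this kind typically looks like:

```yaml
# Illustrative sketch — consult embodied_gen/utils/gpt_config.yaml for the real field names.
backend: qwen2.5-vl                      # or gpt-4o (Azure OpenAI)
api_key: "YOUR_KEY_HERE"                 # OpenRouter or Azure OpenAI key
base_url: "https://openrouter.ai/api/v1" # endpoint for the chosen backend
```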

📸 Directly use EmbodiedGen All-Simulators-Ready Assets

🤗 Hugging Face Explore EmbodiedGen-generated assets that are ready for simulation across all major simulators (SAPIEN, Isaac Sim, MuJoCo, PyBullet, Genesis, Isaac Gym, etc.). Details in the any-simulators chapter.


<h2 id="image-to-3d">🖼️ Image-to-3D</h2>

🤗 Hugging Face Generate a physically plausible 3D asset with URDF from a single input image, offering high-quality support for digital twin systems. (The HF space is a simplified demonstration; for the full functionality, please refer to img3d-cli.)

<img src="docs/assets/image_to_3d.jpg" alt="Image to 3D" width="700">

☁️ Service

Run the image-to-3D generation service locally. Models are downloaded automatically on the first run; please be patient.

# Run in foreground
python apps/image_to_3d.py
# Or run in the background
CUDA_VISIBLE_DEVICES=0 nohup python apps/image_to_3d.py > /dev/null 2>&1 &

⚡ API

Generate physically plausible 3D assets from image input via the command-line API.

img3d-cli --image_path apps/assets/example_image/sample_00.jpg apps/assets/example_image/sample_01.jpg \
--n_retry 2 --output_root outputs/imageto3d

# See result(.urdf/mesh.obj/mesh.glb/gs.ply) in ${output_root}/sample_xx/result

Both SAM3D and TRELLIS are supported as the 3D generation model; set IMAGE3D_MODEL in embodied_gen/scripts/imageto3d.py to switch models.
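The generated URDF in each `result` directory can be inspected with Python's standard library alone. A minimal sketch (the URDF string below is an illustrative stand-in, not actual tool output) that lists every mesh file a URDF references:

```python
# Sketch: extract mesh filenames from a URDF using only the standard library.
import xml.etree.ElementTree as ET

# Illustrative stand-in for ${output_root}/sample_xx/result/*.urdf content.
SAMPLE_URDF = """<?xml version="1.0"?>
<robot name="sample_00">
  <link name="base_link">
    <visual>
      <geometry><mesh filename="mesh.obj" scale="1 1 1"/></geometry>
    </visual>
    <collision>
      <geometry><mesh filename="mesh.obj"/></geometry>
    </collision>
  </link>
</robot>"""

def mesh_files(urdf_text: str) -> list[str]:
    """Return every mesh filename referenced by a URDF document."""
    root = ET.fromstring(urdf_text)
    return [m.get("filename") for m in root.iter("mesh")]

print(mesh_files(SAMPLE_URDF))  # → ['mesh.obj', 'mesh.obj']
```

The same function works on a real generated file via `mesh_files(open(path).read())`.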


<h2 id="text-to-3d">📝 Text-to-3D</h2>

🤗 Hugging Face Create 3D assets from text descriptions, covering a wide range of geometries and styles. (The HF space is a simplified demonstration; for the full functionality, please refer to text3d-cli.)

<img src="docs/assets/text_to_3d.jpg" alt="Text to 3D" width="700">

☁️ Service

Deploy the text-to-3D generation service locally.

The text-to-image stage is based on the Kolors model and supports Chinese and English prompts. Models are downloaded automatically on the first run; please be patient.

python apps/text_to_3d.py

⚡ API

The text-to-image stage here is based on SD3.5 Medium and supports English prompts only. Usage requires accepting the model license (click accept); models are downloaded automatically.

For large-scale 3D asset generation, set --n_image_retry=4 --n_asset_retry=3 --n_pipe_retry=2: slower, but automatic checking and retries yield better results. For more diverse results, omit --seed_img.

text3d-cli --prompts "small bronze figurine of a lion" "A globe with wooden base" "wooden table with embroidery" \
    --n_image_retry 1 --n_asset_retry 1 --n_pipe_retry 1 --seed_img 0 \
    --output_root outputs/textto3d

Alternatively, run the pipeline with the Kolors text-to-image model:

bash embodied_gen/scripts/textto3d.sh \
    --prompts "A globe with wooden base and latitude and longitude lines" "橙色电动手钻,有磨损细节" \
    --output_root outputs/textto3d_k

PS: models with more permissive licenses can be found in embodied_gen/models/image_comm_model.py.
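For larger batches, prompts can be collected in Python and split across several text3d-cli invocations. A sketch (the chunk size and helper are arbitrary choices for illustration, not part of the tool):

```python
# Sketch: build text3d-cli argument lists from a prompt list, in chunks,
# so a long prompt list is split across several invocations.
def build_commands(prompts, output_root, chunk=8):
    """Yield one text3d-cli argv per chunk of prompts."""
    for i in range(0, len(prompts), chunk):
        yield ["text3d-cli", "--prompts", *prompts[i:i + chunk],
               "--n_image_retry", "4", "--n_asset_retry", "3",
               "--n_pipe_retry", "2", "--output_root", output_root]

cmds = list(build_commands(["a red mug", "a wooden chair", "a tin robot"],
                           "outputs/textto3d", chunk=2))
print(len(cmds))  # → 2
# Each argv can then be executed with subprocess.run(cmd, check=True).
```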


<h2 id="texture-generation">🎨 Texture Generation</h2>

🤗 Hugging Face Generate visually rich textures for 3D meshes.

<img src="docs/assets/texture_gen.jpg" alt="Texture Gen" width="700">

☁️ Service

Run the texture generation service locally. Models are downloaded automatically on the first run; see download_kolors_weights and geo_cond_mv.

python apps/texture_edit.py

⚡ API

Chinese and English prompts are supported.

texture-cli --mesh_path "apps/assets/example_texture/meshes/robot_text.obj" \
"apps/assets/example_texture/meshes/horse.obj" \
--prompt "举着牌子的写实风格机器人,大眼睛,牌子上写着“Hello”的文字" \
"A gray horse head with flying mane and brown eyes" \
--output_root "outputs/texture_gen" \
--seed 0

<h2 id="3d-scene-generation">🌍 3D Scene Generation</h2>
<img src="docs/assets/scene3d.gif" alt="scene3d" style="width: 600px;">

⚡ API

Run bash install.sh extra to install the additional requirements if you need to use scene3d-cli.

It takes ~30 minutes to generate a color mesh and 3DGS per scene.

CUDA_VISIBLE_DEVICES=0 scene3d-cli \
--prompts "Art studio with easel and canvas" \
--output_dir outputs/bg_scenes/ \
--seed 0 \
--gs3d.max_steps 4000 \
--disable_pano_check
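Each generated 3DGS is a standard .ply file, so its splat (vertex) count can be read from the header with a few lines of stdlib Python. A sketch, shown on an in-memory example header rather than real tool output:

```python
# Sketch: parse a PLY header and report the declared vertex (splat) count.
import io

def ply_vertex_count(stream) -> int:
    """Read a PLY header from a binary stream and return its vertex count."""
    assert stream.readline().strip() == b"ply", "not a PLY file"
    count = 0
    for raw in stream:
        line = raw.strip()
        if line.startswith(b"element vertex"):
            count = int(line.split()[-1])
        if line == b"end_header":
            break
    return count

# Illustrative header; a real file would be opened with open(path, "rb").
sample = io.BytesIO(
    b"ply\nformat binary_little_endian 1.0\n"
    b"element vertex 4000\nproperty float x\nend_header\n"
)
print(ply_vertex_count(sample))  # → 4000
```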

<h2 id="articulated-object-generation">⚙️ Articulated Object Generation</h2>

See our paper, published at NeurIPS 2025. [arXiv Paper] | [Gradio Demo] | [Code]

<img src="docs/assets/articulate.gif" alt="articulate" style="width: 500px;">
<h2 id="layout-generation">🏞️ Layout (Interactive 3D Worlds) Generation</h2>

💬 Generate a layout from a task description

<table> <tr> <td><img src="docs/assets/layout1.gif" alt="layout1" width="320"/></td> <td><img src="docs/assets/layout2.gif" alt="layout2" width="320"/></td> </tr> <tr> <td><img src="docs/assets/layout3.gif" alt="layout3" width="320"/></td> <td><img src="docs/assets/layout4.gif" alt="layout4" width="320"/></td> </tr> </table>

