# GLEAM

**[ICCV 2025] GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scenes**
<a href='https://arxiv.org/abs/2505.20294' style='padding-left: 0.5rem;'><img src='https://img.shields.io/badge/arXiv-2505.20294-A42C25?style=flat&logo=arXiv&logoColor=A42C25'></a> <a href='https://xiao-chen.tech/gleam/' style='padding-left: 0.5rem;'><img src='https://img.shields.io/badge/Project-Page-blue?style=flat&logo=Google%20chrome&logoColor=blue' alt='Project Page'></a> <a href='https://github.com/zjwzcx/GLEAM/tree/master/data_gleam' style='padding-left: 0.5rem;'><img src='https://img.shields.io/badge/Data-GLEAMBench-FE7A16?style=flat&logo=google-sheets&logoColor=A42C25'></a>
## 📋 Contents

- 🏠 About
- 📊 Dataset
- 🛠️ Installation
- 🕹️ Training & Evaluation
- 📝 TODO List
- 🤔 FAQ
- 🔗 Citation
## 🏠 About
<div style="text-align: center;"> <img src="assets/overview.png" alt="Dialogue_Teaser" width=100% > </div>

Generalizable active mapping in complex unknown environments remains a critical challenge for mobile robots. Existing methods, constrained by insufficient training data and conservative exploration strategies, exhibit limited generalizability across scenes with diverse layouts and complex connectivity. To enable scalable training and reliable evaluation, we introduce GLEAM-Bench, the first large-scale benchmark designed for generalizable active mapping, with 1,152 diverse 3D scenes from synthetic and real-scan datasets. Building upon this foundation, we propose GLEAM, a unified generalizable exploration policy for active mapping. Its superior generalizability comes mainly from our semantic representations, long-term navigable goals, and randomized strategies. It significantly outperforms state-of-the-art methods, achieving 66.50% coverage (+9.49%) with efficient trajectories and improved mapping accuracy on 128 unseen complex scenes.
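Coverage here measures the fraction of the ground-truth surface observed during exploration. As a rough illustration only (not the exact metric implemented in GLEAM), a coverage-style ratio over hypothetical point lists might look like this:

```python
# Illustrative coverage-style metric: the fraction of ground-truth surface
# points that lie within `radius` of at least one observed point.
# A sketch for intuition, not GLEAM's actual evaluation code.

def coverage_ratio(gt_points, observed_points, radius=0.1):
    """Return the fraction of gt_points covered by observed_points."""
    if not gt_points:
        return 0.0
    r2 = radius * radius
    covered = 0
    for gx, gy, gz in gt_points:
        for ox, oy, oz in observed_points:
            dx, dy, dz = gx - ox, gy - oy, gz - oz
            if dx * dx + dy * dy + dz * dz <= r2:
                covered += 1
                break
    return covered / len(gt_points)

gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
obs = [(0.05, 0.0, 0.0), (1.02, 0.0, 0.0)]
print(coverage_ratio(gt, obs))  # 2 of 3 ground-truth points are covered
```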
## 📊 Dataset
<p align="center"> <img src="assets/overview_gleambench.png" align="center" width="100%"> </p> <p align="center"> <img src="assets/statistic.png" align="center" width="100%"> </p>

GLEAM-Bench includes 1,152 diverse 3D scenes from synthetic and real-scan datasets for benchmarking generalizable active mapping policies. These curated scene meshes are characterized by near-watertight geometry, diverse floorplans (≥10 types), and complex interconnectivity. We unify these multi-source datasets through filtering, geometric repair, and task-oriented preprocessing. Please refer to the guide for more details and scripts.
We provide all the preprocessed data used in our work, including mesh files (in the `obj` folder), ground-truth surface points (in the `gt` folder), and asset indexing files (in the `urdf` folder). We recommend that users fill out the form to access the download link [HERE]. The directory structure should be as follows:
```
GLEAM
├── README.md
├── gleam
│   ├── train
│   ├── test
│   ├── ...
├── data_gleam
│   ├── README.md
│   ├── train_stage1_512
│   │   ├── gt
│   │   ├── obj
│   │   ├── urdf
│   ├── train_stage2_512
│   │   ├── gt
│   │   ├── obj
│   │   ├── urdf
│   ├── eval_128
│   │   ├── gt
│   │   ├── obj
│   │   ├── urdf
├── ...
```
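After downloading, the layout above can be sanity-checked with a short script. The sketch below assumes the folder names shown in the tree; adjust it if your local layout differs:

```python
from pathlib import Path

# Expected GLEAM-Bench splits and per-split subfolders (names taken
# from the directory tree in this README).
SPLITS = ["train_stage1_512", "train_stage2_512", "eval_128"]
SUBDIRS = ["gt", "obj", "urdf"]

def check_gleam_data(root):
    """Return a list of missing split/subfolder paths under root/data_gleam."""
    root = Path(root)
    missing = []
    for split in SPLITS:
        for sub in SUBDIRS:
            path = root / "data_gleam" / split / sub
            if not path.is_dir():
                missing.append(str(path))
    return missing
```

An empty return value means all expected folders are present.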
## 🛠️ Installation
We tested our code in the following environment:
- NVIDIA RTX 3090/4090 (24GB VRAM)
- NVIDIA Driver: 545.29.02
- Ubuntu 20.04
- CUDA 11.8
- Python 3.8.12
- PyTorch 2.0.0+cu118
- Clone this repository.

```bash
git clone https://github.com/zjwzcx/GLEAM
cd GLEAM
```
- Create an environment and install PyTorch.

```bash
conda create -n gleam python=3.8 -y
conda activate gleam
pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 torchaudio==2.0.1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
```
- Install NVIDIA Isaac Gym (download from https://developer.nvidia.com/isaac-gym/download), then:

```bash
cd isaacgym/python
pip install -e .
```
- Install GLEAM.

```bash
pip install -r requirements.txt
pip install -e .
```
## 🕹️ Training & Evaluation
Weights & Biases (wandb) is highly recommended for analyzing the training logs. To use wandb with our codebase, paste your wandb API key into `wandb_utils/wandb_api_key_file.txt`. If you don't want to use wandb, add `--stop_wandb` to the following commands.
We provide the standard checkpoints of GLEAM [HERE]. Please use the 40k-step checkpoint as the standard one. We also provide Stage 2 checkpoints trained without the 96 Gibson scenes, as this exclusion made the model more robust and stable overall.
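Checkpoint files follow a `rl_model_<steps>_steps.zip` naming pattern (as in the Stage 1 example path in the Training section). If you need to select the latest checkpoint programmatically, the step count can be parsed from the filename; this helper is an illustration, not part of the codebase:

```python
import re

def checkpoint_steps(filename):
    """Parse the training-step count from a name like rl_model_40000000_steps.zip.

    The naming pattern is assumed from the example paths in this README.
    Returns None if the name does not match.
    """
    match = re.search(r"rl_model_(\d+)_steps", filename)
    return int(match.group(1)) if match else None

ckpts = ["rl_model_10000000_steps.zip", "rl_model_40000000_steps.zip"]
latest = max(ckpts, key=checkpoint_steps)
print(latest)  # rl_model_40000000_steps.zip
```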
### Training
Please run the following commands to reproduce the standard two-stage training of GLEAM.

Stage 1 with 512 scenes:

```bash
python gleam/train/train_gleam_stage1.py --sim_device=cuda:0 --num_envs=32 --headless
```
Stage 2 with an additional 512 scenes, continually trained from the Stage 1 pretrained checkpoint (specified by `--ckpt_path`). Taking our released checkpoint as an example, `ckpt_path` should be `runs/train_gleam_stage1/models/rl_model_40000000_steps.zip`.

```bash
python gleam/train/train_gleam_stage2.py --sim_device=cuda:0 --num_envs=32 --headless --ckpt_path=${YOUR_CKPT_PATH}
```
### Customized Training Environments
To customize a novel training environment, create your environment and configuration files in `gleam/env` and then register the task in `gleam/__init__.py`.
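The registration step typically follows a name-to-class registry pattern. The sketch below illustrates the general idea with hypothetical names; the actual registration API in `gleam/__init__.py` may differ, so check it before adapting this:

```python
# Minimal task-registry pattern: map a task name to its environment class
# and default config. Hypothetical names only; this is not GLEAM's API.
TASK_REGISTRY = {}

def register_task(name, env_cls, env_cfg):
    """Associate a task name with an environment class and default config."""
    TASK_REGISTRY[name] = (env_cls, env_cfg)

def make_task(name, **overrides):
    """Instantiate a registered task, applying any config overrides."""
    env_cls, env_cfg = TASK_REGISTRY[name]
    cfg = {**env_cfg, **overrides}
    return env_cls(cfg)

class MyMappingEnv:  # your environment, defined under gleam/env
    def __init__(self, cfg):
        self.cfg = cfg

register_task("my_mapping_task", MyMappingEnv, {"num_envs": 32})
env = make_task("my_mapping_task", num_envs=8)
print(env.cfg["num_envs"])  # 8
```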
### Evaluation
Please run the following command to evaluate the generalization performance of GLEAM on the 128 unseen scenes in the GLEAM-Bench test set. Specify the checkpoint via `--ckpt_path`.

```bash
python gleam/test/test_gleam_gleambench.py --sim_device=cuda:0 --num_envs=32 --headless --stop_wandb --ckpt_path=${YOUR_CKPT_PATH}
```
### Main Results
<p align="center"> <img src="assets/main_result.png" align="center" width="100%"> </p>

## 📝 TODO List
- [x] Release GLEAM-Bench (dataset) and the arXiv paper in May 2025.
- [x] Release the training code in May 2025.
- [x] Release the evaluation code in June 2025.
- [x] Release the key scripts in June 2025.
- [x] Release the pretrained checkpoint in June 2025.
## 🤔 FAQ
**Q:** Is it normal that the program gets stuck for about 5-60 minutes during training and testing?

**A:** This is normal because the simulation environment needs to load 1024 complex 3D scenes (for training) or 128 complex 3D scenes (for evaluation) at once, which is very time-consuming. For initial use, it is recommended to modify the hardcoded parameters (the number of training scenes for Stage 1 and the number of evaluation scenes) to reduce the number of loaded scenes for a quick run-through.
**Q:** Is it normal that the 3D scenes in the visualization UI have no textures?

**A:** This is normal. Textures have been removed from the preprocessed data to speed up simulation and rendering, as RGB/texture information is not required for geometry-level exploration. It's a trade-off to accelerate training, allowing focus on 3D spatial exploration. If you need scenes with textures, we recommend downloading the raw version of GLEAM-Bench. Please refer to the guide for more details.
## 🔗 Citation
If you find our work helpful, please cite it:
```bibtex
@article{chen2025gleam,
  title={GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scenes},
  author={Chen, Xiao and Wang, Tai and Li, Quanyi and Huang, Tao and Pang, Jiangmiao and Xue, Tianfan},
  journal={arXiv preprint arXiv:2505.20294},
  year={2025}
}
```
If you use our codebase, dataset, or benchmark, please also kindly cite the original datasets involved in our work. BibTeX entries are provided below.
<details><summary>Dataset BibTeX</summary>

```bibtex
@article{ai2thor,
  author={Eric Kolve and Roozbeh Mottaghi and Winson Han and
          Eli VanderBilt and Luca Weihs and Alvaro Herrasti and
          Daniel Gordon and Yuke Zhu and Abhinav Gupta and
          Ali Farhadi},
  title={{AI2-THOR: An Interactive 3D Environment for Visual AI}},
  journal={arXiv},
  year={2017}
}
@inproceedings{chen2024gennbv,
  title={GenNBV: Generalizable Next-Best-View Policy for Act
```