# 🤖 StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation
## 🚀 Quick Start

### 🛠️ Installation

- Create and activate the conda environment:

  ```bash
  conda create -n stamo python=3.10 -y
  conda activate stamo
  ```

- Install the package:

  ```bash
  cd StaMo && pip install -e .
  ```
## 🎯 Usage

### 🎨 Diffusion AutoEncoder

#### 📊 Step 1: Data Format Conversion

- Download the robotic data in advance and extract it into image format.
- Convert to JSON format using our provided script:

  ```bash
  python scripts/create_jsons.py
  ```
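The JSON schema produced by `create_jsons.py` is not documented here; as a rough, hypothetical sketch of the kind of index such a conversion step typically builds (the directory layout, file names, and schema below are illustrative assumptions, not the repository's actual format):

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sketch: group extracted frame images by episode directory
# into a JSON-serializable index. Layout and key names are assumptions.
def build_frame_index(image_root: Path) -> dict:
    """Map each episode directory name to its sorted list of frame files."""
    index = {}
    for episode_dir in sorted(p for p in image_root.iterdir() if p.is_dir()):
        index[episode_dir.name] = sorted(f.name for f in episode_dir.glob("*.png"))
    return index

# Demo on a synthetic layout: two episodes with three empty frame files each.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for ep in ("episode_000", "episode_001"):
        (root / ep).mkdir()
        for i in range(3):
            (root / ep / f"{i:06d}.png").touch()
    index = build_frame_index(root)
    print(json.dumps(index, indent=2))
```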
#### 🏋️ Step 2: Model Training

- Configure your setup (optional):
  - Modify the configuration files according to your available VRAM.
  - Adjust training parameters as needed.
- Start training:

  ```bash
  bash scripts/train_libero.sh
  ```

- Monitor training progress:

  ```bash
  tensorboard --logdir .
  ```
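As a sketch of what the optional configuration step above might involve: VRAM usage is usually traded off via batch size, gradient accumulation, and mixed precision. The key names below are illustrative assumptions, not the repository's actual config schema.

```yaml
# Hypothetical training-config fragment; check the repository's own
# configuration files for the real keys and defaults.
train:
  batch_size: 32                 # lower this if you run out of VRAM
  gradient_accumulation_steps: 1 # raise to preserve the effective batch size
  mixed_precision: fp16          # roughly halves activation memory on most GPUs
  log_dir: .                     # matches `tensorboard --logdir .` above
```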
#### 📈 Step 3: Validation

Validate your trained model and results:

```bash
python validate_renderer.py
```
## 📚 Citation

If you use this work in your research, please cite our paper:

```bibtex
@article{liu2025stamo,
  title={StaMo: Unsupervised Learning of Generalizable Robotic Motions from Static Images},
  author={Liu, Mingyu and Shu, Jiuhe and Chen, Hui and Li, Zeju and Zhao, Canyu and Yang, Jiange and Gao, Shenyuan and Chen, Hao and Shen, Chunhua},
  journal={arXiv preprint arXiv:2510.05057},
  year={2025}
}

@article{zhao2024moviedreamer,
  title={MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence},
  author={Zhao, Canyu and Liu, Mingyu and Wang, Wen and Chen, Weihua and Wang, Fan and Chen, Hao and Zhang, Bo and Shen, Chunhua},
  journal={arXiv preprint arXiv:2407.16655},
  year={2024}
}
```
## 🎫 License

For academic use, this project is licensed under the 2-clause BSD License. For commercial use, please contact Chunhua Shen.
