RoboEXP

[CoRL 2024] RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation

Generate Convert Improve

Install / Use

/learn @Jianghanxiao/RoboEXP

About this skill

Quality Score

0/100

README

RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation

<a target="_blank" href="https://jianghanxiao.github.io/">Hanxiao Jiang</a>1,2, <a target="_blank" href="https://binghao-huang.github.io/">Binghao Huang</a>1, <a target="_blank" href="https://warshallrho.github.io/">Ruihai Wu</a>4, <a target="_blank" href="https://sg.linkedin.com/in/zhuoran-li-david">Zhuoran Li</a>5, <a target="_blank" href="https://www.gargshubham.com/">Shubham Garg</a>3, <a target="_blank" href="https://www.amazon.science/author/hooshang-nayyeri">Hooshang Nayyeri</a>3, <a target="_blank" href="https://shenlong.web.illinois.edu/">Shenlong Wang</a>2, <a target="_blank" href="https://yunzhuli.github.io/">Yunzhu Li</a>1

1Columbia University, 2University of Illinois Urbana-Champaign, 3Amazon, 4Peking University, 5National University of Singapore

Website | Paper | Colab | Video

Overview

This repository contains the official implementation of the RoboEXP system for the interactive exploration task. <video autoplay muted loop height="100%" width="100%" controls src="https://github.com/Jianghanxiao/proj-RoboExp/assets/32015651/14cf19c6-5da0-4b84-a3a5-e4383396e3b9"></video>

Try it in Colab!

In this notebook, we replay an example exploration process using the saved observations and cached decisions.

Setup

# Here we use cuda-11.7
export CUDA_HOME=/usr/local/cuda-11.7/
# create conda environment
conda create -n roboexp python=3.8
conda activate roboexp

# Install the packages
bash scripts/install_env.sh

# download pretrained models
bash scripts/download_pretrained_models.sh

Interactive Exploration

To run our RoboEXP system, you also need to create the file my_apikey and copy your OpenAI API key into the file.

To calibrate the wrist camera, we provide the hand-eye calibration code in calibrate_wrist.py, and you can download our calibration board.

# Download the calibration board
gdown 1KWYncDGjtGthePC3wzCu9zBW0nZI0RBM
# Download the example calibration result
gdown 1b2yp45eJVyXnOg11OImTER3zMRUcaZRN

Run our RoboEXP system

# Set `visualize=True` to enable visualizations
python interactive_explore.py

Code Structure Explanation

roboexp is the main library for our RoboEXP system, comprising the following key components:

roboexp/env: This component provides the basic control API for the robot, camera, and calibration setup. It facilitates obtaining observations from the camera and controlling the robot.
roboexp/perception: This component offers an API for the perception module, enabling image processing to derive 2D semantic information from images.
roboexp/memory: This component provides an API for the memory module, allowing instance merging in 3D and the construction of our Action-Conditioned Scene Graph (ACSG).
roboexp/decision: This component offers an API for the decision module, facilitating decision-making based on observations.
roboexp/act: This component provides an API for the action module, enabling action planning based on decisions and our memory.

Citation

If you find this repo useful for your research, please consider citing the paper

@article{jiang2024roboexp,
    title={RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation},
    author={Jiang, Hanxiao and Huang, Binghao and Wu, Ruihai and Li, Zhuoran and Garg, Shubham and Nayyeri, Hooshang and Wang, Shenlong and Li, Yunzhu},
    journal={CoRL},
    year={2024}
}

Related Skills

node-connect

348.2k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

108.9k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

348.2k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

348.2k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。