MaxMI

This is the official repository for MaxMI: A Maximal Mutual Information Criterion for Manipulation Concept Discovery.

Installation

  1. Clone the repository:

    git clone https://github.com/PeiZhou26/MaxMI.git
    cd MaxMI
    
  2. Create the Conda environment using the environment.yml file:

    conda env create -f environment.yml
    
  3. Activate the environment:

    conda activate maxmi
    

Tasks

The current code supports four tasks from the ManiSkill2 (v0.4.2) benchmark: PickCube-v0, StackCube-v0, PegInsertionSide-v0, and TurnFaucet-v0.

Data Preparation

The behavior cloning datasets can be accessed via this link. Each task includes approximately 1,000 successful demonstrations; however, we use a randomly sampled subset of 500 for our experiments.
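The 500-demonstration subsetting mentioned above can be sketched as follows. This is an illustrative snippet, not code from the repo; the function name, id layout, and seed are assumptions:

```python
import random

def sample_demos(demo_ids, k=500, seed=0):
    """Randomly sample k demonstration ids from the full set.

    Illustrative only: mirrors the paper's use of a 500-demo subset
    drawn from ~1,000 successful demonstrations per task.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible subset
    return sorted(rng.sample(demo_ids, k))

subset = sample_demos(list(range(1000)), k=500)
```

Fixing the seed keeps the subset reproducible across runs, which matters when comparing policies trained on the same demonstrations.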

After downloading the datasets, place them in the /data directory. To evaluate the intermediate task success rate, the ManiSkill2 environment requires patching (see /maniskill2_patches for details).

For further information, please refer to the CoTPC repository and official ManiSkill2 documentation.

Training & Evaluation

For key state discovery, which requires a differentiable mutual information estimator, we use the off-the-shelf InfoNet with its parameters kept frozen. Download the pretrained InfoNet checkpoint, place it in a local directory, and update the checkpoint path in /src/infer_infonet.py to point to it.
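To give a concrete sense of the quantity being maximized, here is a toy plug-in estimate of mutual information for paired discrete samples. This is purely a conceptual sketch: the actual pipeline uses the pretrained InfoNet estimator, which is differentiable and operates on continuous states.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for paired discrete samples.

    Illustrative only: sums p(x, y) * log2(p(x, y) / (p(x) * p(y)))
    over the observed joint outcomes.
    """
    n = len(xs)
    pxy = Counter(zip(xs, ys))  # joint counts
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        mi += p_joint * log2(p_joint * n * n / (px[x] * py[y]))
    return mi

# Perfectly correlated binary variables carry 1 bit of information.
print(mutual_information([0, 1, 0, 1], [0, 1, 0, 1]))  # → 1.0
```

For independent samples the same estimate is 0 bits, so states whose features score high under such a criterion are the ones most informative about the paired signal.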

The script /src/concept_train.py provides an example of key state discovery and saves the trained key state localization network. After training, use /src/concept_eval.py to label key states from the demonstrations and store the key state labels in a .pkl file.

    python /src/concept_train.py
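Writing the key state labels to a .pkl file follows the standard pickle round-trip shown below. The dict layout (trajectory id mapped to key state time indices) is an assumption for illustration, not the repo's actual schema:

```python
import os
import pickle
import tempfile

# Hypothetical label layout: trajectory id -> key state time indices.
# The actual schema produced by /src/concept_eval.py may differ.
key_state_labels = {
    "traj_000": [12, 47, 88],
    "traj_001": [9, 51, 90],
}

path = os.path.join(tempfile.gettempdir(), "key_states.pkl")
with open(path, "wb") as f:
    pickle.dump(key_state_labels, f)

# Downstream policy training would reload the labels like this.
with open(path, "rb") as f:
    loaded = pickle.load(f)
```

Pickle preserves the nested Python structure exactly, so the policy training stage can consume the labels without any parsing step.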

After obtaining the automatically labeled key states, we use them to train a manipulation policy for each task. Our policy builds on Chain-of-Thought Predictive Control (CoTPC), which jointly optimizes key state prediction and next-action prediction. Train the policy with /src/train.py and evaluate the trained policy with /src/eval.py; for detailed examples of training and testing, refer to /scripts/train.sh and /scripts/eval.sh.

    bash /scripts/train.sh

Acknowledgement

We thank the authors of CoTPC and InfoNet for releasing the code bases on which this work builds.
