DeepCubeAI

Learning Discrete World Models for Heuristic Search

This repository contains the code for the paper Learning Discrete World Models for Heuristic Search, accepted to the first Reinforcement Learning Conference (RLC 2024).

| Rubik's Cube solving animation | Sokoban puzzle solving animation | Ice Slider puzzle solving animation | Digit Jump puzzle solving animation |
| :---: | :---: | :---: | :---: |

About DeepCubeAI

DeepCubeAI is an algorithm that learns a discrete world model and uses deep reinforcement learning to learn a heuristic function that generalizes over start and goal states. The learned model and the learned heuristic function are then combined with a heuristic search algorithm, such as Q* search, to solve sequential decision-making problems. For more details, please refer to the paper.
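As a rough illustration of how a Q-function can drive the search step, the sketch below runs a Q*-style search on a toy problem: integer states on a line, two unit-cost actions, and a hand-written `q_values` function standing in for the learned DQN. The toy domain and the names `step`, `q_values`, and `q_star_search` are hypothetical, chosen only for this example, and are not part of the deepcubeai package. The point it illustrates is that a single Q-function call scores every action of a state, so a child state is generated only when its (state, action) entry is popped from the priority queue.

```python
import heapq
from itertools import count

ACTIONS = (+1, -1)  # toy action set (assumption): move right or left by one


def step(state, action):
    """Toy transition function; in DeepCubeAI this role is played by the learned world model."""
    return state + action


def q_values(state, goal):
    """Stand-in for one DQN forward pass: estimated cost-to-go for every action at once.

    Each value is the unit step cost plus the remaining distance after the action.
    """
    return [1 + abs(step(state, a) - goal) for a in ACTIONS]


def q_star_search(start, goal):
    """Return a list of actions from start to goal, or None if the frontier empties."""
    if start == goal:
        return []
    tie = count()  # unique tie-breaker so the heap never compares paths
    best_g = {start: 0}
    frontier = []
    for action, q in zip(ACTIONS, q_values(start, goal)):
        heapq.heappush(frontier, (q, next(tie), 0, start, action, []))
    while frontier:
        _, _, g, state, action, path = heapq.heappop(frontier)
        child = step(state, action)  # the child is generated only now
        g_child = g + 1
        new_path = path + [action]
        if child == goal:
            return new_path
        if best_g.get(child, float("inf")) <= g_child:
            continue  # already reached this state at least as cheaply
        best_g[child] = g_child
        # One Q-function call scores all actions of the child.
        for next_action, q in zip(ACTIONS, q_values(child, goal)):
            heapq.heappush(
                frontier, (g_child + q, next(tie), g_child, child, next_action, new_path)
            )
    return None


print(q_star_search(0, 3))  # [1, 1, 1]
```

In this toy setting the hand-written Q-function is exact, so the search walks straight to the goal; with a learned, approximate Q-function the same loop would explore more entries before finding a path.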

Key Contributions

Overview

DeepCubeAI consists of three key components:

  1. Discrete World Model

    • Learns a world model that represents states in a discrete latent space.
    • This approach tackles two challenges: model degradation and state re-identification.
    • Prediction errors less than 0.5 are corrected by rounding.
    • Re-identifies states by comparing two binary vectors.

    | DeepCubeAI discrete world model |
    | :---: |

  2. Generalizable Heuristic Function

    • Utilizes Deep Q-Network (DQN) and hindsight experience replay (HER) to learn a goal-conditioned heuristic function that generalizes over start and goal states.

  3. Optimized Search

    • Integrates the learned model and the learned heuristic function with heuristic search to solve problems. It uses Q* search, a variant of A* search optimized for DQNs, which enables faster and more memory-efficient planning.
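
The rounding and state re-identification ideas from the discrete world model can be sketched in a few lines. This is a minimal illustration, assuming latent states are binary vectors and that the model outputs a real-valued prediction per bit; the helper names `discretize` and `same_state` are hypothetical and are not taken from the deepcubeai code base.

```python
def discretize(predicted):
    """Round each predicted latent value to the nearest bit (0 or 1).

    Any per-bit prediction error smaller than 0.5 is removed by this step,
    so small errors do not accumulate across timesteps.
    """
    return tuple(1 if value >= 0.5 else 0 for value in predicted)


def same_state(latent_a, latent_b):
    """Re-identify a state by comparing two binary vectors for exact equality."""
    return tuple(latent_a) == tuple(latent_b)


true_next = (1, 0, 0, 1)            # ground-truth binary latent state
noisy = (0.93, 0.12, 0.41, 0.58)    # model output; every error is below 0.5
recovered = discretize(noisy)

print(recovered)                         # (1, 0, 0, 1)
print(same_state(recovered, true_next))  # True

# Because latents are hashable tuples of bits, previously seen states
# can also be looked up in a set during search.
visited = {true_next}
print(recovered in visited)              # True
```

An explicit 0.5 threshold is used here instead of Python's built-in `round`, which rounds halves to the nearest even integer.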

Main Results

  • Accurate reconstruction of ground truth images after thousands of timesteps.
  • Achieved 100% success on Rubik's Cube (canonical goal), Sokoban, IceSlider, and DigitJump.
  • 99.9% success on Rubik's Cube with reversed start/goal states.
  • Demonstrated significant improvement in solving complex planning problems and generalizing to unseen goals.

Quick start

DeepCubeAI provides a Python package and CLI. You can install it from PyPI or build it from source. The package supports Python 3.10-3.12.

Note: You can find detailed installation instructions, including using Conda for environment management, in the installation guide.

Install deepcubeai Package from PyPI with uv (Recommended if Running as a Package)

deepcubeai is available on PyPI and you can use the following commands to install it using uv.

  1. Install uv by following the instructions on the official uv website.

  2. Create and activate a virtual environment:

    # create a .venv in the current folder
    uv venv
    
    # macOS & Linux
    source .venv/bin/activate
    
    # Windows (PowerShell)
    .venv\Scripts\activate
    

    If you have multiple Python versions, ensure you use a supported one (3.10-3.12), e.g.:

    uv venv --python 3.12
    
  3. Install the package (using uv’s pip interface):

    uv pip install deepcubeai
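
If the install fails or picks an unexpected interpreter, it can help to confirm that the active Python falls within the supported 3.10-3.12 range. The snippet below is a small sketch using only the standard library; the function name `is_supported` and the two bound constants are hypothetical helpers written for this example.

```python
import sys

SUPPORTED_MIN = (3, 10)  # lowest supported Python version, per this README
SUPPORTED_MAX = (3, 12)  # highest supported Python version, per this README


def is_supported(version=None):
    """Return True if the (major, minor) version lies in the supported range.

    With no argument, the running interpreter's version is checked.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported() else "not supported"
    print(f"Python {current} is {status} by deepcubeai")
```

Run it inside the activated virtual environment so that it reports the interpreter uv selected.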
    

Install from Source with Pixi (Recommended if Working from Source)

Pixi is a package management tool that provides fast, reproducible environments with support for Conda and PyPI dependencies. The pixi.toml and pixi.lock files define reproducible environments with exact dependency versions.

  1. Install Pixi: Follow the official installation guide.

  2. Clone the repository:

    git clone https://github.com/misaghsoltani/DeepCubeAI.git
    cd DeepCubeAI
    
  3. Install the environment: Choose the environment you want to install (if none is specified, the environment named default is used):

    pixi install  # or: pixi install -e default
    
    # Or the dev environment with additional dev dependencies:
    pixi install -e dev
    

    You may also install all environments at once:

    pixi install --all
    
  4. Enter the environment: The first run may resolve and install dependencies if the environment is not already installed:

    pixi shell  # or: pixi shell -e default
    
    # or for the dev environment:
    pixi shell -e dev
    

Note: There is also an environment named all, which installs all dependencies from every environment into a single environment. This differs from installing all environments separately.

  • The command pixi install -e all installs the environment named all.
  • The command pixi install --all installs each environment separately (i.e., default, dev, build, glibc217, all, and cuda).

Running DeepCubeAI

To see the options available in the CLI, run:

# If you have already entered the environment with Pixi:
deepcubeai --help  # or -h

# or without entering the environment:
pixi run deepcubeai --help  # or -h

Or use it as a Python package:

import deepcubeai

print(deepcubeai.__version__)
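
To check whether the package is installed without importing it (for example in a setup script), the standard library's `importlib.metadata` can be queried instead. This is a small sketch: `installed_version` is a hypothetical helper, and it assumes the distribution name is `deepcubeai`, as used in the PyPI install command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="deepcubeai"):
    """Return the installed version string of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("deepcubeai is not installed in this environment")
else:
    print(f"deepcubeai {found} is installed")
```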

License

MIT License - see LICENSE.

Citation

If you use DeepCubeAI in your research, please cite:

@article{agostinelli2025learning,
    title={Learning Discrete World Models for Heuristic Search},
    author={Agostinelli, Forest and Soltani, Misagh},
    journal={Reinforcement Learning Journal},
    volume={4},
    pages={1781--1792},
    year={2025}
}

Contact

If you have any questions or issues, please contact Misagh Soltani (msoltani@email.sc.edu).
