Hnet
H-Net: Hierarchical Network with Dynamic Chunking
Install / Use
/learn @goombalab/HnetREADME
H-Net
<table width="100%"> <tr> <td><img src="assets/english.gif" alt="English" width="100%"></td> <td><img src="assets/code.gif" alt="Code" width="100%"></td> </tr> <tr> <td><img src="assets/chinese.gif" alt="Chinese" width="100%"></td> <td><img src="assets/korean.gif" alt="Korean" width="100%"></td> </tr> </table>Dynamic Chunking for End-to-End Hierarchical Sequence Modeling
Sukjun Hwang, Brandon Wang, Albert Gu
Paper: https://arxiv.org/abs/2507.07955
About

This repository contains code of the H-Net architecture. Most of the code lies in hnet/, which has the following structure:
configs/
hnet/
├── models/ # Directory for H-Net
| ├── config_hnet.py (defines the config for the H-Net)
| ├── hnet.py (h-net as a (B, L, D) -> (B, L, D) sequence model)
│ └── mixer_seq.py (wrapper to turn h-net into a language model)
└── modules/ # Directory of model components
├── dc.py (modeling code for the dynamic chunking mechanism)
└── isotropic.py (code for isotropic, i.e. non-hierarchical components)
generate.py # Script for inference/generation
Installation
Requirements:
- PyTorch >= 2.5.1
Clone the repository and install package.
git clone https://github.com/goombalab/hnet
cd hnet
pip install -e .
We strongly recommend building mamba_ssm package from the latest source as follows:
git clone https://github.com/state-spaces/mamba
cd mamba
pip install .
Pretrained Models
Pretrained models are uploaded to
Hugging Face: hnet_1stage_L, hnet_2stage_L,
hnet_1stage_XL, hnet_2stage_XL.
We trained our models on the 100B-Token subset of FineWeb-Edu. <em>Large</em> and <em>XL</em> are compute-matched to GPT-3 <em>Large</em> and <em>XL</em>, respectively.
We also provide model weights for Chinese and Code, each trained using the 46B-Token subset of FineWeb-Edu Chinese V2.1 and Pile Github: hnet_2stage_XL_chinese, hnet_2stage_XL_code.
You can find specifics of these models at configs, and more details from the paper.
Text Generation
We provide generate.py for text generation that you can use with the pretrained checkpoints.
Examples
python generate.py --model-path [MODEL_CKPT] --config-path [CONFIG]
python generate.py --model-path hnet_2stage_XL.pt --config-path configs/hnet_2stage_XL.json --max-tokens 1024 --temperature 1.0 --top-p 1.0
Citation
If you use this codebase, or otherwise find our work valuable, please cite H-Net:
@article{hnet,
title={Dynamic Chunking for End-to-End Hierarchical Sequence Modeling},
author={Hwang, Sukjun and Wang, Brandon and Gu, Albert},
journal={arXiv preprint arXiv:2507.07955},
year={2025}
}
