Less is More: Clustered Cross-Covariance Control for Offline RL (C4)

License: MIT | Conference: ICLR 2026 | Python | PyTorch


📢 News

  • [Jan 2026] Our paper has been accepted at ICLR 2026. 🚀

📝 Abstract

C4 (Clustered Cross-Covariance Control) addresses the fundamental challenge of distributional shift in offline reinforcement learning. By identifying and mitigating harmful TD cross-covariance through partitioned buffer sampling and gradient-based corrective penalties, C4 significantly stabilizes value estimation. Our method demonstrates state-of-the-art performance across D4RL locomotion and kitchen tasks, achieving up to 30% improvement in returns over baseline methods.
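To make "partitioned buffer sampling" more concrete, here is a minimal NumPy sketch of one way it could work: cluster the states in an offline buffer, then draw each minibatch from a single cluster. This is an illustration under stated assumptions, not the code in this repository. The function names `kmeans_partition` and `sample_from_partition`, the use of plain k-means on raw states, and the choice of one cluster per minibatch are all hypothetical. The gradient-based corrective penalty is not shown.

```python
import numpy as np


def kmeans_partition(states, num_clusters, num_iters=20, seed=0):
    """Assign each state to one of `num_clusters` partitions with plain k-means.

    Hypothetical helper: the clustering rule used by C4 itself may differ.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from distinct data points (fancy indexing copies).
    init = rng.choice(len(states), size=num_clusters, replace=False)
    centroids = states[init].astype(float)
    labels = np.zeros(len(states), dtype=int)
    for _ in range(num_iters):
        # Squared distance from every state to every centroid: shape (N, K).
        diffs = states[:, None, :] - centroids[None, :, :]
        labels = (diffs ** 2).sum(axis=-1).argmin(axis=1)
        for k in range(num_clusters):
            members = states[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[k] = members.mean(axis=0)
    return labels


def sample_from_partition(labels, batch_size, rng):
    """Draw a minibatch of buffer indices that all come from one partition."""
    cluster = rng.choice(np.unique(labels))  # pick among non-empty partitions
    pool = np.flatnonzero(labels == cluster)
    # Sample with replacement only when the partition is smaller than the batch.
    idx = rng.choice(pool, size=batch_size, replace=len(pool) < batch_size)
    return cluster, idx


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in for an offline buffer: 1000 states of dimension 4.
    buffer_states = rng.normal(size=(1000, 4))
    labels = kmeans_partition(buffer_states, num_clusters=8)
    cluster, idx = sample_from_partition(labels, batch_size=64, rng=rng)
    batch = buffer_states[idx]  # every row belongs to the same partition
    print(cluster, batch.shape)
```

In a real training loop the sampled indices would select full transitions (state, action, reward, next state) rather than states alone, and the partition labels would be computed once before training since the offline buffer is fixed.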


🛠️ Installation

1) Clone and Install

git clone https://github.com/NanMuZ/C4.git
cd C4
pip install -e .

🚀 Quick Start

python run/C4.py

🙏 Acknowledgments

This repository is built upon OfflineRL-Kit: https://github.com/yihaosun1124/OfflineRL-Kit

We sincerely thank the authors for their clean and efficient framework.

C4 is released under the MIT License, consistent with the base repository.


📬 Contact

For questions or collaborations, please email: nanqiao.ai@gmail.com


📖 Citation

If you find this work useful, please consider citing:

@inproceedings{qiao2026less,
  title     = {Less is More: Clustered Cross-Covariance Control for Offline RL},
  author    = {Qiao, Nan and Yue, Sheng and Wang, Shuning and Deng, Yongheng and Ren, Ju},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year      = {2026}
}
