C4
C4 (Clustered Cross-Covariance Control) addresses the fundamental challenge of distributional shift in offline reinforcement learning by identifying and mitigating harmful TD cross-covariance through partitioned buffer sampling and gradient-based corrective penalties.
Less is More: Clustered Cross-Covariance Control for Offline RL (C4)
📢 News
- [Jan 2026] Our paper has been accepted at ICLR 2026. 🚀
📝 Abstract
C4 (Clustered Cross-Covariance Control) addresses the fundamental challenge of distributional shift in offline reinforcement learning. By identifying and mitigating harmful TD cross-covariance through partitioned buffer sampling and gradient-based corrective penalties, C4 significantly stabilizes value estimation. Our method demonstrates state-of-the-art performance across D4RL locomotion and kitchen tasks, achieving up to 30% improvement in returns over baseline methods.
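To make "partitioned buffer sampling" more concrete, here is a minimal sketch of one plausible reading: split the offline buffer into clusters, then draw each mini-batch from a single cluster. It is not taken from the C4 code base. The names `kmeans_partition` and `PartitionedSampler`, the use of plain k-means, and clustering on raw states are all assumptions made for illustration, and the data is synthetic.

```python
import numpy as np


def kmeans_partition(features, n_clusters, n_iters=20, seed=0):
    """Hypothetical helper: label each row of `features` with a cluster id (Lloyd's k-means)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from distinct random rows (fancy indexing returns a copy).
    centroids = features[rng.choice(len(features), size=n_clusters, replace=False)]
    for _ in range(n_iters):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = features[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties out
                centroids[k] = members.mean(axis=0)
    return labels


class PartitionedSampler:
    """Hypothetical sampler: every mini-batch comes from exactly one partition."""

    def __init__(self, labels, seed=0):
        self.rng = np.random.default_rng(seed)
        # np.unique only returns labels that occur, so no partition is empty.
        self.partitions = [np.flatnonzero(labels == k) for k in np.unique(labels)]

    def sample(self, batch_size):
        # Choose a partition with probability proportional to its size.
        sizes = np.array([len(p) for p in self.partitions])
        k = self.rng.choice(len(self.partitions), p=sizes / sizes.sum())
        part = self.partitions[k]
        # Sample with replacement only when the partition is smaller than the batch.
        return self.rng.choice(part, size=batch_size, replace=len(part) < batch_size)


# Synthetic stand-in for the states stored in an offline buffer.
data_rng = np.random.default_rng(0)
states = data_rng.normal(size=(1000, 4))

labels = kmeans_partition(states, n_clusters=5)
sampler = PartitionedSampler(labels)
idx = sampler.sample(batch_size=64)  # buffer indices, all from one cluster
```

The returned `idx` can index any array stored alongside the states (actions, rewards, next states). How C4 actually forms its partitions, and how the gradient-based corrective penalty is computed, should be checked against the paper and the repository.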
🛠️ Installation
1) Clone and Install
git clone https://github.com/NanMuZ/C4.git
cd C4
pip install -e .
🚀 Quick Start
python run/C4.py
🙏 Acknowledgments
This repository is built upon OfflineRL-Kit: https://github.com/yihaosun1124/OfflineRL-Kit
We sincerely thank the authors for their clean and efficient framework.
C4 is released under the MIT License, consistent with the base repository.
📬 Contact
For questions or collaborations, please email: nanqiao.ai@gmail.com
📖 Citation
If you find this work useful, please consider citing:
@inproceedings{qiao2026less,
title = {Less is More: Clustered Cross-Covariance Control for Offline RL},
author = {Qiao, Nan and Yue, Sheng and Wang, Shuning and Deng, Yongheng and Ren, Ju},
booktitle = {International Conference on Learning Representations (ICLR)},
year = {2026}
}
