Less is More: Clustered Cross-Covariance Control for Offline RL (C4)

📢 News

[Jan 2026] Our paper has been accepted at ICLR 2026. 🚀

📝 Abstract

C4 (Clustered Cross-Covariance Control) addresses the fundamental challenge of distributional shift in offline reinforcement learning. By identifying and mitigating harmful TD cross-covariance through partitioned buffer sampling and gradient-based corrective penalties, C4 significantly stabilizes value estimation. Our method demonstrates state-of-the-art performance across D4RL locomotion and kitchen tasks, achieving up to 30% improvement in returns over baseline methods.
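The abstract mentions partitioned buffer sampling, but this README does not spell out the procedure. As a rough illustration only, the sketch below assumes that "partitioned" means grouping transitions by a simple k-means over states and drawing each mini-batch from a single group. The names `kmeans_labels` and `ClusteredBuffer`, the choice of k-means, and the size-proportional cluster selection are all hypothetical and are not taken from the C4 code base; the gradient-based penalty is not covered here.

```python
import numpy as np


def kmeans_labels(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed partitioning rule, not C4's own).

    Returns one integer cluster label per row of x.
    """
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points (fancy indexing copies).
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every point to every center: shape (n, k).
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


class ClusteredBuffer:
    """Hypothetical buffer that samples each mini-batch from one cluster."""

    def __init__(self, states, k=4, seed=0):
        self.labels = kmeans_labels(states, k, seed=seed)
        groups = [np.flatnonzero(self.labels == j) for j in range(k)]
        self.groups = [g for g in groups if len(g) > 0]
        self.rng = np.random.default_rng(seed)

    def sample_indices(self, batch_size):
        # Pick a cluster with probability proportional to its size,
        # then draw transition indices from that cluster only.
        sizes = np.array([len(g) for g in self.groups], dtype=float)
        pick = self.rng.choice(len(self.groups), p=sizes / sizes.sum())
        return self.rng.choice(self.groups[pick], size=batch_size, replace=True)


# Toy usage with synthetic 3-dimensional "states".
states = np.random.default_rng(1).normal(size=(1000, 3))
buf = ClusteredBuffer(states, k=4)
idx = buf.sample_indices(32)
```

In this sketch, `idx` would index into the arrays of an offline dataset (states, actions, rewards, next states), so every transition in a batch shares one cluster label. For the actual partitioning rule and penalty, refer to the paper and the code under `run/C4.py`.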

🛠️ Installation

1) Clone and Install

git clone https://github.com/NanMuZ/C4.git
cd C4
pip install -e .

🚀 Quick Start

python run/C4.py

🙏 Acknowledgments

This repository is built upon OfflineRL-Kit: https://github.com/yihaosun1124/OfflineRL-Kit

We sincerely thank the authors for their clean and efficient framework.

C4 is released under the MIT License, consistent with the base repository.

📬 Contact

For questions or collaborations, please email: nanqiao.ai@gmail.com

📖 Citation

If you find this work useful, please consider citing:

@inproceedings{qiao2026less,
  title     = {Less is More: Clustered Cross-Covariance Control for Offline RL},
  author    = {Qiao, Nan and Yue, Sheng and Wang, Shuning and Deng, Yongheng and Ren, Ju},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year      = {2026}
}
