13 skills found
OpenRLHF / OpenRLHF: An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)
TideDra / Lmm R1: Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
TsinghuaC3I / MARTI: A Framework for LLM-based Multi-Agent Reinforced Training and Inference
OpenRLHF / OpenRLHF M: An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.
yyht / Openrlhf Async Pipline: No description available
DeepGym / Deepgym: RL training environments with verifiable rewards for coding agents. Works with TRL, Unsloth, verl, OpenRLHF.
rosieyzh / Openrlhf Pretrain: Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining"
sjelassi / Ebft Openrlhf: Code for "Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models".
Freder-chen / OpenRLHF Agent: No description available
victorShawFan / OpenRLHF Add Simpo: OpenRLHF with the SimPO method added; a personal modification. Original repository: https://github.com/OpenLLMAI/OpenRLHF
LLM4AIOps / OpenRLHF ThinkFL: No description available
Magnicord / Llm Env Templates: A list of uv environment templates for LLM development.
OpenRLHF / OpenRLHF Docs: No description available