<div id="top"></div>
<!--
*** Thanks for checking out the Best-README-Template. If you have a suggestion
*** that would make this better, please fork the repo and create a pull request
*** or simply open an issue with the tag "enhancement".
*** Don't forget to give the project a star!
*** Thanks again! Now go create something AMAZING! :D
-->
<!-- PROJECT SHIELDS -->
<!--
*** I'm using markdown "reference style" links for readability.
*** Reference links are enclosed in brackets [ ] instead of parentheses ( ).
*** See the bottom of this document for the declaration of the reference variables
*** for contributors-url, forks-url, etc. This is an optional, concise syntax you may use.
*** https://www.markdownguide.org/basic-syntax/#reference-style-links
-->
<!--
***[![MIT License][license-shield]][license-url]
-->
<!-- PROJECT LOGO -->
<br />
<div align="center">
<a href="https://github.com/openreasoner/openr/">
<img src="figure/openr_logo.png" alt="Logo" width="200">
</a>
<h1 align="center" style="font-size: 30px;"><strong><em>OpenR</em></strong>: An Open Source Framework for Advanced Reasoning with Large Language Models</h1>
<p align="center">
<a href="https://arxiv.org/abs/2410.09671">Paper</a>
·
<a href="https://github.com/openreasoner/openr/blob/main/reports/Tutorial-LLM-Reasoning-Wang.pdf">Tutorial</a>
·
<a href="https://github.com/openreasoner/openr">Code</a>
·
<a href="https://openreasoner.github.io/">Docs</a>
·
<a href="https://huggingface.co/datasets/openreasoner/MATH-APS">Data</a>
·
<a href="https://huggingface.co/openreasoner/Math-psa">Model</a>
·
<a href="https://github.com/openreasoner/openr/issues">Issue</a>
·
<a href="https://www.modelscope.cn/studios/modelscope/OpenR_Inference">Demo</a>
</p>
<p align="center">
[ <a href="https://github.com/openreasoner/openr/blob/main/README.md">English</a> ][ <a href="https://github.com/openreasoner/openr/blob/main/README_zh.md">中文</a> ]
</p>
</div>
[![Contributors][contributors-shield]][contributors-url]
[![Issues][issues-shield]][issues-url]
[![Forks][forks-shield]][forks-url]
[![Stars][stars-shield]][stars-url]

<!-- TABLE OF CONTENTS -->
<details>
<summary><span style="font-size: 1.5em;"><strong>Table of Contents</strong> 📖 </span></summary>
<ol>
<li><a href="#news-and-updates">News and Updates</a></li>
<li><a href="#features">Features</a></li>
<li><a href="#todo">TODO</a></li>
<li><a href="#benchmark">Benchmark</a></li>
<li><a href="#plots">Plots</a></li>
<li><a href="#provided-datasets-and-models">Datasets and Models</a></li>
<li>
<a href="#getting-started">Getting Started</a>
<ul>
<li><a href="#installation">Installation</a></li>
<li><a href="#quickstart">Quick Start</a></li>
</ul>
</li>
<li><a href="#usage">Usage</a></li>
<li><a href="#join-us">Join Us</a></li>
<li><a href="#contact">Contact</a></li>
<li><a href="#response-examples">Response Examples</a></li>
<li><a href="#community">Community</a></li>
<li><a href="#reference">Reference</a></li>
</ol>
</details>
<!-- News and Updates -->
## News and Updates

- [29/11/2024] We have added a demo page on ModelScope. Many thanks to @wangxingjun778!
- [24/10/2024] OpenR now supports MCTS reasoning (#24)! 🌲
- [15/10/2024] Our report is on arXiv!
- [12/10/2024] OpenR has been released! 🚀
## Features
<p align="center">
<img src="./figure/logo_text.png" alt="Description" style="width: 300px; margin-left: 50px; float: right;">
</p>
| Feature | Contents |
|---------|----------|
| ✅ Process-supervision Data Generation | - OmegaPRM: Improve Mathematical Reasoning in Language Models by Automated Process Supervision |
| ✅ Online Policy Training | - RL Training: APPO, GRPO, TPPO |
| ✅ Generative and Discriminative PRM Training | - PRM Training: Supervised Training for PRMs<br>- Generative RM Training: Direct GenRM |
| ✅ Multiple Search Strategies | - Greedy Search<br>- Best-of-N<br>- Beam Search<br>- MCTS<br>- rStar: Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers<br>- Critic-MCTS: Under Review |
| ✅ Test-time Computation and Scaling Law | TBA, see Benchmark |
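As a toy illustration of the search strategies above, the simplest one, Best-of-N, samples several candidate solutions and keeps the one a reward model scores highest. The sketch below uses placeholder `generate` and `reward` functions, not OpenR's actual API; in OpenR the scoring role is played by a trained PRM.

```python
# Minimal Best-of-N sketch. `generate` and `reward` are hypothetical
# stand-ins for an LLM sampling call and a process/outcome reward model.
import random


def generate(prompt: str) -> str:
    """Stand-in for sampling one candidate solution from an LLM."""
    return f"{prompt} -> answer {random.randint(0, 9)}"


def reward(solution: str) -> float:
    """Stand-in for a reward-model score (toy: the final digit)."""
    return float(solution.split()[-1])


def best_of_n(prompt: str, n: int = 8) -> str:
    """Sample n candidates and return the highest-scoring one."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=reward)


print(best_of_n("What is 2+3?"))
```

Beam search and MCTS refine the same idea by scoring partial reasoning steps rather than only complete solutions, which is where process reward models become essential.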
## TODO

| Feature | TODO (<span style="color:red;">High Priority</span>, we value your contribution!) |
|---------|----------|
| 👨‍💻 Data | - Re-implement Journey Learning |
| 👨‍💻 RL Training | - Distributed Training<br/>- Reinforcement Fine-Tuning (RFT) #80 |
| 👨‍💻 PRM | - Larger-scale training<br>- GenRM-CoT implementation<br/>- Soft-label training #57 |
| 👨‍💻 Reasoning | - Optimize code structure #53<br>- More reasoning tasks (AIME, etc.) #53<br>- Multi-modal reasoning #82<br>- Reasoning in code generation #68<br/>- Dots #75<br/>- Consistency check<br/>- Benchmarking |
## Benchmark

See Benchmark!
## Plots
<p align