RLSolver
Solvers for NP-hard and NP-complete problems with an emphasis on high-performance GPU computing.
Install / Use
/learn @AI4Finance-Foundation/RLSolverREADME
RLSolver: High-performance GPU-based Solvers for Nonconvex and NP-Complete Problems
New repo (We update RLSolver only in the new repo.)
RLSolver Contest 2025: docs website
We aim to showcase that reinforcement learning (RL) or machine learning (ML) with GPUs delivers the best benchmark performance for large-scale nonconvex and NP-complete problems. RL with the help of GPU computing can obtain high-quality solutions within short time.
Key Technologies
- RL/ML tricks such as learn to optimize and curriculum learning.
- OR tricks such as local search and tabu search.
- Massively parallel sampling of Markov chain Monte Carlo (MCMC) simulations on GPU using thousands of CUDA cores and tensor cores.
- Podracer scheduling on a GPU cloud such as DGX-2 SuperPod.
Key References
-
Mazyavkina, Nina, et al. "Reinforcement learning for combinatorial optimization: A survey." Computers & Operations Research 134 (2021): 105400.
-
Bengio, Yoshua, Andrea Lodi, and Antoine Prouvost. "Machine learning for combinatorial optimization: a methodological tour d’horizon." European Journal of Operational Research 290.2 (2021): 405-421.
-
Peng, Yun, Byron Choi, and Jianliang Xu. "Graph learning for combinatorial optimization: a survey of state-of-the-art." Data Science and Engineering 6, no. 2 (2021): 119-141.
-
Nair, Vinod, et al. "Solving mixed integer programs using neural networks." arXiv preprint arXiv:2012.13349 (2020).
-
Makoviychuk, Viktor, et al. "Isaac Gym: High performance GPU based physics simulation for robot learning." Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2). 2021.
Workflow
<a target="\_blank"> <div align="center"> <img src=fig/work_flow.png width="60%"/> </div> </a>Datasets
-
Maxcut:
-
Gset is stored in the "data" folder of this repo. The number of nodes is from 800 to 10000.
-
Syn is the synthetic data obtained by calling the function generate_write in util.py. The number of nodes is from 10 to 50000. The (partial) synthetic data is stored in the "data" folder of this repo.
-
-
TSP: TSPLIB
Methods
- RL-based annealing using massively parallel enironments
2023 NeurIPS Classical Simulation of Quantum Circuits: Parallel Environments and Benchmark
2023 NeurIPS workshop K-Spin Ising Model for Combinatorial Optimizations over Graphs: A Reinforcement Learning Approach
2021 NeurIPS workshop ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning
- Learning to branch
code 2023 AAAI Reinforcement Learning for Branch-and-Bound Optimisation using Retrospective Trajectories
code 2021 AAAI Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies
- Learning to cut
code 2020 ICML Reinforcement learning for integer programming: Learning to cut
- RL-based heuristic
code (greedy) 2017 NeurIPS Learning Combinatorial Optimization Algorithms over Graphs
code (local search) 2023, A Monte Carlo Policy Gradient Method with Local Search for Binary Optimization
code (LKH for TSP) 2021 AAAI Combining reinforcement learning with Lin-Kernighan-Helsgaun algorithm for the traveling salesman problem
- Variational annealing
code (VCA_RNN) 2023 Machine_Learning Supplementing recurrent neural networks with annealing to solve combinatorial optimization problems
code (VNA) 2021 Nature Machine_Intelligence Variational neural annealing
- Discrete sampling
(iSCO) 2023 ICML Revisiting sampling for combinatorial optimization
- Classical methods
- Random walk
- Greedy
- Simulated annealing
- Local search
- Beam search
- Tabu search
- Branch-and-bound
- Cutting plane
Solvers to Compare with
Gurobi is the state-of-the-art solver. The license is required, and professors/students at universities can obtain the academic license for free.
SCIP is a well-known open-source solver, and its simplex is commonly used in "learning to branch/cut". SCIP is open-source and free.
Other Solvers
COPT: a mathematical optimization solver for large-scale problems.
CPLEX: a high-performance mathematical programming solver for linear programming, mixed integer programming, and quadratic programming.
Xpress: an extraordinarily powerful, field-installable Solver Engine.
BiqMac: a solver only for binary quadratic or maxcut. Users should upload txt file, but the response time is not guaranteed. If users use it, we recommend to download the sources and run it by local computers.
Store Results
Partial results are stored in the folder "result" of this repo.
Performance
Quantum circuits MIMO Compressive sensing
File Structure
RLSolver
└──helloworld
└──maxcut
└──data
└──result
└──util.py
└──mcmc.py
└──l2a.py (ours)
└──baseline
└──greedy.py
└──gurobi.py
└──random_walk.py
└──simulated_annealing.py
└──variational_classical_annealing_RNN
└──variational_neural_annealing
└──benchmark
└──maxcut.md
└──graph_partitioning.md
└──tsp.md
└──tnco.md
└──rlsolver (main folder)
└──util.py
└──data
└──graph
└──quantum_circuits
└──milp_coefs
└──binary_coefs
└──problems
└──maxcut
└──baseline
└──mcmc.py
└──l2a.py(ours)
└──tnco
└──baseline
└──mcmc.py
└──l2a.py(ours)
└──mimo
└──baseline
└──mcmc.py
└──l2a.py(ours)
Finished
- [x] MIMO
- [x] Maxcut
- [x] TNCO
- [x] quantum circuits
TODO
- [ ] TSP
- [ ] VRP (Vehicle routing problem)
- [ ] Graph partitioning
- [ ] Minimum vertex cover
- [ ] MILP
- [ ] Portfolio allocation
Related Websites
Related Skills
YC-Killer
2.7kA library of enterprise-grade AI agents designed to democratize artificial intelligence and provide free, open-source alternatives to overvalued Y Combinator startups. If you are excited about democratizing AI access & AI agents, please star ⭐️ this repository and use the link in the readme to join our open source AI research team.
best-practices-researcher
The most comprehensive Claude Code skills registry | Web Search: https://skills-registry-web.vercel.app
groundhog
399Groundhog's primary purpose is to teach people how Cursor and all these other coding agents work under the hood. If you understand how these coding assistants work from first principles, then you can drive these tools harder (or perhaps make your own!).
last30days-skill
10.3kAI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary
