Ytopt
ytopt: machine-learning-based autotuning and hyperparameter optimization framework using Bayesian Optimization
What is ytopt?
ytopt is a machine learning-based autotuning and hyperparameter optimization software package written in Python. It uses Bayesian Optimization to find the best input parameter/hyperparameter configurations for a given kernel, miniapp, or application, together with the best system configurations for a given HPC system.
ytopt accepts the following as input:
- A code-evaluation wrapper with tunable parameters as a code mold for performance measurement
- Tunable application parameters (hyperparameters) and tunable system parameters
- The corresponding parameter search space for the tunable parameters
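As a rough illustration of the first input, the sketch below fills a code mold with one configuration and times the result. The `$block_size` placeholder syntax (Python's `string.Template`), the `fill_mold` and `evaluate` helper names, and the toy kernel are assumptions made for this example; they are not ytopt's own mold format or API.

```python
import time
from string import Template

# Hypothetical code mold: a source template with a tunable-parameter placeholder.
MOLD = Template(
    "def kernel(n):\n"
    "    total = 0\n"
    "    for start in range(0, n, $block_size):\n"
    "        total += sum(range(start, min(start + $block_size, n)))\n"
    "    return total\n"
)

def fill_mold(config):
    """Substitute one configuration into the mold and return the source text."""
    return MOLD.substitute(config)

def evaluate(config, n=10_000):
    """Build the kernel for this configuration and return (result, seconds)."""
    namespace = {}
    exec(fill_mold(config), namespace)  # stands in for compiling the generated code
    start = time.perf_counter()
    result = namespace["kernel"](n)
    return result, time.perf_counter() - start

result, seconds = evaluate({"block_size": 64})
```

The measured `seconds` value plays the role of the performance metric that the search tries to minimize.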
By sampling and evaluating a small number of input configurations, ytopt gradually builds a surrogate model of the input-output space. This process continues until the user-specified time or the maximum number of evaluations is reached.
ytopt handles both unconstrained and constrained optimization problems, searches and evaluates asynchronously, and can look ahead over iterations to adapt more effectively to new evaluations and steer the search towards promising configurations, leading to more efficient and faster convergence on the best solutions.
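The sample, evaluate, and refit loop described above can be sketched in a few lines. This is a toy under stated assumptions: it is a simplified model-based search, not Bayesian Optimization proper, and the nearest-neighbour "surrogate", the `objective` function, and the `search` helper are hypothetical stand-ins rather than ytopt's learners or API.

```python
import random
import time

def objective(x):
    """Hypothetical expensive evaluation; lower is better, minimum at x = 3."""
    return (x - 3.0) ** 2

def predict(history, x):
    """Toy surrogate: predict the objective from the nearest evaluated point."""
    nearest = min(history, key=lambda rec: abs(rec[0] - x))
    return nearest[1]

def search(max_evals=30, max_seconds=5.0, n_initial=5, seed=0):
    rng = random.Random(seed)
    deadline = time.monotonic() + max_seconds
    history = []  # (x, y) pairs: the data the surrogate is built from
    # Stop at the evaluation budget or the time budget, whichever comes first.
    while len(history) < max_evals and time.monotonic() < deadline:
        if len(history) < n_initial:
            x = rng.uniform(-10.0, 10.0)  # initial random sampling
        else:
            # Propose candidates and keep the one the surrogate rates best.
            candidates = [rng.uniform(-10.0, 10.0) for _ in range(50)]
            x = min(candidates, key=lambda c: predict(history, c))
        history.append((x, objective(x)))
    return min(history, key=lambda rec: rec[1]), history

best, history = search()
```

The toy always picks the candidate with the best predicted value; a real Bayesian optimizer would also weigh the model's uncertainty when choosing the next configuration.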
Internally, ytopt uses a manager-worker computational paradigm, where one node fits the surrogate model and generates new input configurations, and other nodes perform the computationally expensive evaluations and return the results to the manager node. This is implemented in two ways: using ray for ytopt/benchmark in sequential processing and using libEnsemble for ytopt-libe in parallel processing. ray limits the trial directory/file name length to 107 bytes (the path length limit for AF_UNIX sockets), whereas libEnsemble copes with this issue.
We therefore encourage the use of ytopt-libe.
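The manager-worker split can be pictured with Python's standard `concurrent.futures` module. This is only a sketch of the pattern: ytopt itself relies on ray or libEnsemble, the names `evaluate`, `propose`, and `manager` are hypothetical, and the random proposal stands in for the surrogate model.

```python
import random
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def evaluate(config):
    """Worker side: the (normally expensive) evaluation of one configuration."""
    return config, (config["x"] - 3.0) ** 2

def propose(rng):
    """Manager side: generate a new configuration (random here, for brevity)."""
    return {"x": rng.uniform(-10.0, 10.0)}

def manager(max_evals=12, n_workers=3, seed=0):
    rng = random.Random(seed)
    results = []
    submitted = 0
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        pending = set()
        while len(results) < max_evals:
            # Keep every worker busy without exceeding the evaluation budget.
            while len(pending) < n_workers and submitted < max_evals:
                pending.add(pool.submit(evaluate, propose(rng)))
                submitted += 1
            # Collect whichever evaluations finish first (asynchronous search).
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            results.extend(future.result() for future in done)
    return results

results = manager()
```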
Additional documentation is available on Read the Docs. Access ytopt-libe for the latest examples with new features and development.
Installation instructions
ytopt requires the following components: dh-scikit-optimize, autotune, and ConfigSpace. When ytopt is installed, ConfigSpace and libEnsemble are installed automatically.
- We recommend creating an isolated Python environment on your local machine using conda with Python version 3.10 or later, for example:
conda create --name ytune python=3.13
conda activate ytune
- Create a directory for ytune:
mkdir ytune
cd ytune
- Install dh-scikit-optimize:
git clone https://github.com/ytopt-team/scikit-optimize.git
cd scikit-optimize
pip install -e .
cd ..
- Install autotune:
git clone -b version1 https://github.com/ytopt-team/autotune.git
cd autotune
pip install -e .
cd ..
- Install ytopt:
git clone -b main https://github.com/ytopt-team/ytopt.git
cd ytopt
pip install -e .
After installing scikit-optimize, autotune, and ytopt successfully, the autotuning framework ytopt is ready to use. Browse the ytopt/benchmark directory for an extensive collection of older examples, or access ytopt-libe for the latest examples with new features.
Directory structure
docs/
Sphinx documentation files
test/
scripts for running benchmark problems in the problems directory
ytopt/
scripts that contain the search implementations
ytopt/hpo/
Hyperparameter optimization with 7 and 17 hyperparameters using ray
ytopt/benchmark/
a set of problems the user can use to compare our different search algorithms or as examples to build their own problems
ytopt/Benchmarks/
a set of problems for autotuning PolyBench 4.2 and ECP proxy apps
ytopt-libe/
scripts and a set of examples for using ytopt-libe with new features
ytopt-libe/hpo/
Hyperparameter optimization with 7 and 17 hyperparameters using libensemble
Basic Usage
ytopt is typically run from the command line, for example:
python -m ytopt.search.ambs --evaluator ray --problem problem.Problem --max-evals=10 --learner RF
Where:
- The search variant is one of ambs (Asynchronous Model-Based Search) or async_search (run as an MPI process).
- The evaluator is the method of concurrent evaluations, and can be ray or subprocess.
- The problem is typically an autotune.TuningProblem instance. Specify the module path and instance name.
- --max-evals is the maximum number of evaluations.
Depending on the search variant chosen, other command-line options may be provided. For example, the ytopt.search.ambs search
method above was further customized by specifying the RF learning strategy.
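The `--problem problem.Problem` argument names a module (`problem`) and an attribute inside it (`Problem`). A minimal sketch of how such a dotted path can be resolved, assuming a hypothetical helper `load_attr`; ytopt's internal loader may work differently:

```python
import importlib

def load_attr(dotted_path):
    """Split 'package.module.attr' into module and attribute, then import it."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attribute', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)

# Demonstrated on a standard-library name; the command above uses 'problem.Problem'.
sqrt = load_attr("math.sqrt")
```

For a lookup like this to succeed, the module (here `problem.py`) has to be importable, for example by sitting in the current working directory or on `PYTHONPATH`.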
See the autotune docs for basic information on getting started with creating a TuningProblem instance.
See the ConfigSpace docs for guidance on defining input/output parameter spaces for problems.
Otherwise, access the subdirectory ytopt-libe for the latest examples with new features.
ytopt-libe is typically run from the command line, for example:
python run_ytopt.py --comms local --nworkers 3 --max-evals=10 --learner RF
Where:
- run_ytopt.py defines the parameter space, then runs libEnsemble to call the ytopt ask/tell interface in a generator function, and the ytopt findRunTime interface in a simulator function.
- --nworkers is the number of workers (master+workers) to be created to run the evaluations in parallel.
- --comms is the communication type.
ytopt-libe supports both the old and the new formats in ConfigSpace to define the search space, as follows:
The old format (ConfigSpace 0.7.1 or lower):
import ConfigSpace as CS
import ConfigSpace.hyperparameters as CSH
cs = CS.ConfigurationSpace(seed=1234)
p0 = CSH.UniformFloatHyperparameter(name='p0', lower=0.00001, upper=0.1, default_value=0.001)
p1 = CSH.UniformIntegerHyperparameter(name='p1', lower=1, upper=50, default_value=10)
p2 = CSH.CategoricalHyperparameter(name='p2', choices=['rmsprop', 'adam', 'sgd'], default_value='rmsprop')
cs.add_hyperparameters([p0, p1, p2])
The new format (ConfigSpace 1.0 or higher):
from ConfigSpace import ConfigurationSpace, Categorical, Float, Integer
cs = ConfigurationSpace(seed=1234)
p0 = Float('p0', bounds=(0.00001, 0.1), default=0.001)
p1 = Integer('p1', bounds=(1, 50), default=10)
p2 = Categorical('p2', ['rmsprop', 'adam', 'sgd'], default='rmsprop')
cs.add([p0, p1, p2])
Although the old ConfigSpace format supports the quantization factor q,
the new format no longer supports it.
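One possible workaround, offered here as an assumption rather than a ytopt or ConfigSpace recommendation, is to sample an unquantized value and snap it to the grid before evaluation. The `quantize` helper below is hypothetical and uses only the standard library:

```python
def quantize(value, q, lower, upper):
    """Snap value to the nearest point of the grid lower, lower + q, lower + 2q, ..."""
    steps = round((value - lower) / q)
    snapped = lower + steps * q
    # Clamp to the grid: no lower than `lower`, no higher than the last
    # grid point that does not exceed `upper`.
    max_steps = int((upper - lower) // q)
    return min(max(snapped, lower), lower + max_steps * q)

# A sampled value of 37 on a grid of multiples of 16 between 16 and 256.
batch_size = quantize(37, q=16, lower=16, upper=256)  # 32
```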
Tutorials
- Autotuning the block matrix multiplication
- Autotuning the OpenMP version of XSBench
- Autotuning the OpenMP version of XSBench with constraints
- Autotuning the hybrid MPI/OpenMP version of XSBench
- Autotuning the hybrid MPI/OpenMP version of XSBench with constraints
- Autotuning the OpenMP version of convolution-2d with constraints
- (Optional) Autotuning the OpenMP version of XSBench online
Who is responsible?
The core ytopt team is at Argonne National Laboratory:
- Xingfu Wu xingfu.wu@anl.gov
- Prasanna Balaprakash pbalapra@anl.gov
- Brice Videau bvideau@anl.gov
- Paul Hovland hovland@anl.gov
- Romain Egele regele@anl.gov
- Jaehoon Koo jkoo@anl.gov
Publications
- M. A. Hossain, X. Wu, V. Taylor, A. Jannesari, Generalizing Scaling Laws for Dense and Sparse Large Language Models, IPDPS2026 Workshop on HPC for AI Foundation Models & LLMs for Science (HPAI4S'26), May 26, 2026, New Orleans, USA. DOI: 10.48550/arXiv.2508.06617.
- X. Wu, P. Balaprakash, M. Kruse, J. Koo, B. Videau, P. Hovland, V. Taylor, B. Geltz, S. Jana, and M. Hall, "ytopt: Autotuning Scientific Applications for Energy Efficiency at Large S
