Bayesmark
=========

Benchmark framework to easily compare Bayesian optimization methods on real machine learning tasks
Installation
------------
This project provides a benchmark framework to easily compare Bayesian optimization methods on real machine learning tasks.
This project is experimental and the APIs are not considered stable.
This Bayesian optimization (BO) benchmark framework requires a few easy steps for setup. It can either be run on a local machine (in serial), or it can prepare a commands file for running the experiments in parallel on a cluster (dry run mode).
Only Python>=3.6 is officially supported, but older versions of Python likely work as well.
The core package itself can be installed with:
.. code-block:: bash

   pip install bayesmark
However, to also install all of the "built-in" optimizers for evaluation, run:
.. code-block:: bash

   pip install bayesmark[optimizers]
It is also possible to use the same pinned dependencies we used in testing by `installing from the repo <#install-in-editable-mode>`_.
Building an environment to run the included notebooks can be done with:
.. code-block:: bash

   pip install bayesmark[notebooks]
Or, ``bayesmark[optimizers,notebooks]`` can be used to install both sets of extras.
A quick example of running the benchmark is `here <#example>`_. The instructions can be used to generate results like the following:
.. image:: https://user-images.githubusercontent.com/28273671/66338456-02516b80-e8f6-11e9-8156-2e84e04cf6fe.png
   :width: 95 %
Non-pip dependencies
~~~~~~~~~~~~~~~~~~~~
To be able to install opentuner, some system-level (non-pip) dependencies must be installed first. This can be done with:
.. code-block:: bash

   sudo apt-get install libsqlite3-0
   sudo apt-get install libsqlite3-dev
On Ubuntu, this results in:
.. code-block:: console

   dpkg -l | grep libsqlite
   ii  libsqlite3-0:amd64     3.11.0-1ubuntu1   amd64   SQLite 3 shared library
   ii  libsqlite3-dev:amd64   3.11.0-1ubuntu1   amd64   SQLite 3 development files
The environment should now be fully set up to run the BO benchmark.
Running
-------
Now we can run each step of the experiments. First, we run all combinations, and then we run some quick commands to analyze the output.
Launch the experiments
~~~~~~~~~~~~~~~~~~~~~~
The experiments are run using the experiment launcher, which has the following interface:
.. code-block::

   usage: bayesmark-launch [-h] [-dir DB_ROOT] [-odir OPTIMIZER_ROOT] [-v] [-u UUID] [-dr DATA_ROOT] [-b DB]
                           [-o OPTIMIZER [OPTIMIZER ...]] [-d DATA [DATA ...]]
                           [-c [{DT,MLP-adam,MLP-sgd,RF,SVM,ada,kNN,lasso,linear} ...]]
                           [-m [{acc,mae,mse,nll} ...]] [-n N_CALLS] [-p N_SUGGEST] [-r N_REPEAT]
                           [-nj N_JOBS] [-ofile JOBS_FILE]
The arguments are:
.. code-block::

   -h, --help            show this help message and exit
   -dir DB_ROOT, --db-root DB_ROOT
                         root directory for all benchmark experiments output
   -odir OPTIMIZER_ROOT, --opt-root OPTIMIZER_ROOT
                         directory with optimization wrappers
   -v, --verbose         print the study logs to console
   -u UUID, --uuid UUID  length 32 hex UUID for this experiment
   -dr DATA_ROOT, --data-root DATA_ROOT
                         root directory for all custom csv files
   -b DB, --db DB        database ID of this benchmark experiment
   -o OPTIMIZER [OPTIMIZER ...], --opt OPTIMIZER [OPTIMIZER ...]
                         optimizers to use
   -d DATA [DATA ...], --data DATA [DATA ...]
                         data sets to use
   -c, --classifier [{DT,MLP-adam,MLP-sgd,RF,SVM,ada,kNN,lasso,linear} ...]
                         classifiers to use
   -m, --metric [{acc,mae,mse,nll} ...]
                         scoring metric to use
   -n N_CALLS, --calls N_CALLS
                         number of function evaluations
   -p N_SUGGEST, --suggestions N_SUGGEST
                         number of suggestions to provide in parallel
   -r N_REPEAT, --repeat N_REPEAT
                         number of repetitions of each study
   -nj N_JOBS, --num-jobs N_JOBS
                         number of jobs to put in the dry run file; the default
                         0 value disables dry run (real run)
   -ofile JOBS_FILE, --jobs-file JOBS_FILE
                         a jobs file with all commands to be run
The output files will be placed in ``[DB_ROOT]/[DBID]``. If DBID is not specified, a randomly named subdirectory is created to avoid overwriting previous experiments. The path to DBID is shown at the beginning of stdout when running ``bayesmark-launch``. In general, let the launcher create and set up DBID unless you are appending to a previous experiment, in which case specify the existing DBID.
The launcher's sequence of commands can be accessed programmatically via :func:`.experiment_launcher.gen_commands`. The individual experiments can be launched programmatically via :func:`.experiment.run_sklearn_study`.
Selecting the experiments
^^^^^^^^^^^^^^^^^^^^^^^^^
A list of optimizers, classifiers, data sets, and metrics can be given using the ``-o``/``-c``/``-d``/``-m`` arguments, respectively. If an argument is not specified, the program launches all possible options.
Selecting the optimizer
^^^^^^^^^^^^^^^^^^^^^^^
A few different open source optimizers have been included as examples and are considered the "built-in" optimizers. The original repos are shown in `Links <#links>`_.
The optimizer argument ``-o`` allows a list containing the "built-in" optimizers:
.. code-block::

   "HyperOpt", "Nevergrad-OnePlusOne", "OpenTuner-BanditA", "OpenTuner-GA", "OpenTuner-GA-DE", "PySOT", "RandomSearch", "Scikit-GBRT-Hedge", "Scikit-GP-Hedge", "Scikit-GP-LCB"
or, one can specify a user-defined optimizer. The class containing an optimizer conforming to the API must be found in the folder specified by ``--opt-root``. Additionally, a configuration defining each optimizer must be provided in ``[OPT_ROOT]/config.json``. The ``--opt-root`` argument and ``config.json`` may be omitted if only built-in optimizers are used.
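For illustration only, a minimal ``config.json`` entry might be written like this. The schema shown here (an optimizer name mapping to a wrapper file and keyword arguments) is an assumption based on the packaged examples, and the names ``MyOpt-v1`` and ``my_opt_wrapper.py`` are hypothetical; check the example configuration shipped with the repo for the exact format.

.. code-block:: python

   # Hypothetical sketch: write a minimal [OPT_ROOT]/config.json for a
   # user-defined optimizer. The "name -> [wrapper file, kwargs]" layout
   # is an assumption; consult the example config in the repo.
   import json

   config = {
       "MyOpt-v1": ["my_opt_wrapper.py", {"n_init": 5}],
   }
   with open("config.json", "w") as f:
       json.dump(config, f, indent=2)

The configured optimizer could then be selected by name, e.g., ``-o MyOpt-v1``.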
Additional details for providing a new optimizer are found in `adding a new optimizer <#adding-a-new-optimizer>`_.
Selecting the data set
^^^^^^^^^^^^^^^^^^^^^^
By default, this benchmark uses the `sklearn example data sets <https://scikit-learn.org/stable/datasets/index.html#toy-datasets>`_ as the "built-in" data sets for use in ML model tuning problems.
The data argument ``-d`` allows a list containing the "built-in" data sets:
.. code-block::

   "breast", "digits", "iris", "wine", "boston", "diabetes"
or, it can refer to a custom csv file, given as the name of a file in the folder specified by ``--data-root``. The naming convention is that regression data sets start with ``reg-`` and classification data sets start with ``clf-``. For example, the classification data set in ``[DATA_ROOT]/clf-foo.csv`` is specified with ``-d clf-foo``.
The csv file can be anything readable by pandas, but we assume the final column is the target and all other columns are features. The target column should be integer for classification data and float for regression. The features should be float (or str for categorical variable columns). See ``bayesmark.data.load_data`` for more information.
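As a sketch, a tiny custom classification data set following these conventions could be written like this. The file name ``clf-foo.csv`` and the column names are hypothetical; any pandas-readable csv with an integer target in the last column should work:

.. code-block:: python

   # Write a toy classification data set to [DATA_ROOT]/clf-foo.csv.
   # Features may be float or str (categorical); the final column is the
   # integer class label. File and column names here are made up.
   import csv

   rows = [
       [5.1, 3.5, "red", 0],
       [4.9, 3.0, "blue", 1],
       [6.2, 2.9, "red", 0],
   ]
   with open("clf-foo.csv", "w", newline="") as f:
       writer = csv.writer(f)
       writer.writerow(["x0", "x1", "color", "target"])
       writer.writerows(rows)

This file could then be selected with ``-d clf-foo`` (with ``--data-root`` pointing at the containing folder).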
Dry run for cluster jobs
^^^^^^^^^^^^^^^^^^^^^^^^
It is also possible to do a "dry run" of the launcher by specifying a value for ``--num-jobs`` greater than zero. For example, if ``--num-jobs 50`` is provided, a text file listing 50 commands to run is produced, with one command (job) per line. This is useful when preparing a list of commands to run later on a cluster.
A dry run will generate a command file (e.g., jobs.txt) like the following (with a meta-data header). Each line corresponds to a command that can be used as a job on a different worker:
.. code-block::

   running: {'--uuid': None, '-db-root': '/foo', '--opt-root': '/example_opt_root', '--data-root': None, '--db': 'bo_example_folder', '--opt': ['RandomSearch', 'PySOT'], '--data': None, '--classifier': ['SVM', 'DT'], '--metric': None, '--calls': 15, '--suggestions': 1, '--repeat': 3, '--num-jobs': 50, '--jobs-file': '/jobs.txt', '--verbose': False, 'dry_run': True, 'rev': '9a14ef2', 'opt_rev': None}
   cmd: python bayesmark-launch -n 15 -r 3 -dir foo -o RandomSearch PySOT -c SVM DT -nj 50 -b bo_example_folder

   job_e2b63a9_00 bayesmark-exp -c SVM -d diabetes -o PySOT -u 079a155f03095d2ba414a5d2cedde08c -m mse -n 15 -p 1 -dir foo -b bo_example_folder && bayesmark-exp -c SVM -d boston -o RandomSearch -u 400e4c0be8295ad59db22d9b5f31d153 -m mse -n 15 -p 1 -dir foo -b bo_example_folder && bayesmark-exp -c SVM -d digits -o RandomSearch -u fe73a2aa960a5e3f8d78bfc4bcf51428 -m acc -n 15 -p 1 -dir foo -b bo_example_folder
   job_e2b63a9_01 bayesmark-exp -c DT -d diabetes -o PySOT -u db1d9297948554e096006c172a0486fb -m mse -n 15 -p 1 -dir foo -b bo_example_folder && bayesmark-exp -c SVM -d boston -o RandomSearch -u 7148f690ed6a543890639cc59db8320b -m mse -n 15 -p 1 -dir foo -b bo_example_folder && bayesmark-exp -c SVM -d breast -o PySOT -u 72c104ba1b6d5bb8a546b0064a7c52b1 -m nll -n 15 -p 1 -dir foo -b bo_example_folder
   job_e2b63a9_02 bayesmark-exp -c SVM -d iris -o PySOT -u cc63b2c1e4315a9aac0f5f7b496bfb0f -m nll -n 15 -p 1 -dir foo -b bo_example_folder && bayesmark-exp -c DT -d breast -o RandomSearch -u aec62e1c8b5552e6b12836f0c59c1681 -m nll -n 15 -p 1 -dir foo -b bo_example_folder && bayesmark-exp -c DT -d digits -o RandomSearch -u 4d0a175d56105b6bb3055c3b62937b2d -m acc -n 15 -p 1 -dir foo -b bo_example_folder
   ...
This package does not have built-in support for deploying these jobs on a cluster or cloud environment (e.g., AWS).
The UUID argument
^^^^^^^^^^^^^^^^^
The UUID is a 32-char hex string used as a master random seed, from which the random seeds for the individual experiments are drawn. If UUID is not specified, a version 4 UUID is generated. The UUID used is displayed at the beginning of stdout when the experiments are launched.
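To illustrate, a valid value for ``--uuid`` is just the hex form of a version 4 UUID. The seed-derivation step below is only a sketch of the master-seed idea, not Bayesmark's actual internal logic:

.. code-block:: python

   # Generate a length-32 hex UUID suitable for --uuid, and sketch how a
   # master seed can deterministically derive per-experiment seeds.
   # (The derivation shown is illustrative, not Bayesmark's implementation.)
   import random
   import uuid

   master = uuid.uuid4().hex
   assert len(master) == 32  # length 32 hex string, as the CLI expects

   # Seed a generator with the master value to draw reproducible child seeds
   rng = random.Random(int(master, 16))
   child_seeds = [rng.getrandbits(32) for _ in range(3)]

Re-running with the same master value reproduces the same child seeds, which is what makes an experiment repeatable from its UUID.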
