Benchmarks for Optuna

Interested in measuring Optuna's performance? You are very perceptive. Under this directory, you will find scripts that we have prepared to measure Optuna's performance.

In this document, we explain how we measure the performance of Optuna using the scripts in this directory.

Performance Benchmarks with kurobako

We measure the performance of black-box optimization algorithms in Optuna with kurobako using benchmarks/run_kurobako.py. You can manually run this script on GitHub Actions if you have write access to the repository, or you can execute benchmarks/run_kurobako.py locally. We explain both methods here.

How to Run on the GitHub Actions

You need write access to the repository. Please run the following steps in your own fork. Note that you should pull the latest master branch of Optuna, since the workflow YAML file must be placed in the default branch of the repository.

  1. Open the GitHub page of your forked Optuna repository.

  2. Click the Actions tab below the repository name.

  3. In the left sidebar, click the Performance Benchmarks with kurobako workflow.

  4. Above the list of workflow runs, select Run workflow.

  5. Use the Branch dropdown to select the workflow's branch (the default is master), and type the input parameters: Sampler List, Sampler Arguments List, Pruner List, and Pruner Arguments List (example values are shown after this list).

  6. Click Run workflow.

  7. After the workflow finishes, you can download the report and the plot from Artifacts. The report includes the version information of the environments, the solvers (pairs of a sampler and a pruner in Optuna) and problems, the best objective value, AUC, elapsed time, and so on. The plot shows the optimization history for each problem. The title is the name of the problem, and the legend identifies the specified pair of sampler and pruner. The history is averaged over the specified n_runs studies and shown with error bars. The horizontal axis represents the budget (#budgets * #epochs = \sum_{\text{for each trial}} #consumed epochs in the trial). The vertical axis represents the objective value.
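
For step 5, the following example input values (the same samplers and pruner as in the local run shown later in this document) benchmark RandomSampler and TPESampler without pruning:

Sampler List:            RandomSampler TPESampler
Sampler Arguments List:  {} {}
Pruner List:             NopPruner
Pruner Arguments List:   {}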

Note that the default run time of a GitHub Actions workflow job is limited to 6 hours. Depending on the sampler and number of studies you specify, it may exceed the 6-hour limit and fail. See the official document for more details.

How to Run Locally

You can run benchmarks/run_kurobako.py directly. This section explains how to run it locally.

First, you need to install kurobako and its Python helper. To install kurobako, see https://github.com/optuna/kurobako#installation for details. In addition, run pip install kurobako to install the Python helper. You also need to install gnuplot for visualization with kurobako; it is available from package managers such as apt (for Ubuntu) or brew (for macOS).
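
For example, the Python helper and gnuplot can typically be installed as follows (the kurobako binary itself is installed separately, as described in the link above):

# Install the kurobako Python helper from PyPI
% pip install kurobako
# Install gnuplot with your platform's package manager
% sudo apt-get install gnuplot   # Ubuntu
% brew install gnuplot           # macOS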

Second, you need to download the datasets for kurobako. Run the following in the dataset directory.

# Download hyperparameter optimization (HPO) dataset
% wget http://ml4aad.org/wp-content/uploads/2019/01/fcnet_tabular_benchmarks.tar.gz
% tar xf fcnet_tabular_benchmarks.tar.gz
# Download neural architecture search (NAS) dataset
# The `kurobako` command should be available.
% curl -L $(kurobako dataset nasbench url) -o nasbench_full.tfrecord
% kurobako dataset nasbench convert nasbench_full.tfrecord nasbench_full.bin

Finally, you can run benchmarks/run_kurobako.py.

# Leave --path-to-kurobako empty ("") if the `kurobako` command is available.
% python benchmarks/run_kurobako.py \
--path-to-kurobako "" \
--name "performance-benchmarks" \
--n-runs 10 \
--n-jobs 10 \
--sampler-list "RandomSampler TPESampler" \
--sampler-kwargs-list "{} {}" \
--pruner-list "NopPruner" \
--pruner-kwargs-list "{}" \
--seed 0 \
--data-dir "." \
--out-dir "out"

Please see benchmarks/run_kurobako.py to check the arguments and their default values.
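
The sampler and pruner lists are paired positionally with the corresponding kwargs lists. For example, to additionally benchmark Optuna's MedianPruner with a warm-up period (the n_warmup_steps value below is only illustrative), you could run:

# Leave --path-to-kurobako empty ("") if the `kurobako` command is available.
% python benchmarks/run_kurobako.py \
--path-to-kurobako "" \
--name "performance-benchmarks" \
--n-runs 10 \
--n-jobs 10 \
--sampler-list "RandomSampler TPESampler" \
--sampler-kwargs-list "{} {}" \
--pruner-list "NopPruner MedianPruner" \
--pruner-kwargs-list "{} {\"n_warmup_steps\":5}" \
--seed 0 \
--data-dir "." \
--out-dir "out"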

Multi-objective support

We also have benchmarks for multi-objective optimization in kurobako. Note that we do not have pruner support for multi-objective optimization yet.

The multi-objective benchmarks can also be run from GitHub Actions or locally. To run them from GitHub Actions, please click Performance Benchmarks with mo-kurobako in step 3.

To run them locally, please run benchmarks/run_mo_kurobako.py.

# Leave --path-to-kurobako empty ("") if the `kurobako` command is available.
% python benchmarks/run_mo_kurobako.py \
--path-to-kurobako "" \
--name "performance-benchmarks" \
--n-runs 10 \
--n-jobs 10 \
--sampler-list "RandomSampler TPESampler NSGAIISampler" \
--sampler-kwargs-list "{} {\"multivariate\":true,\"constant_liar\":true} {\"population_size\":20}" \
--seed 0 \
--data-dir "." \
--out-dir "out"

Performance benchmarks with bayesmark

This workflow allows you to benchmark optimization algorithms available in Optuna with bayesmark. This is done by repeatedly performing hyperparameter search on a set of scikit-learn models fitted to a list of toy datasets and aggregating the results. These results are then compared to the baseline provided by a random sampler. This benchmark can be run with GitHub Actions or locally.

How to run on the GitHub Actions

  1. Follow points 1 and 2 from Performance Benchmarks with kurobako.

  2. In the left sidebar, click the Performance benchmarks with bayesmark workflow.

  3. Above the list of workflow runs, select Run workflow.

  4. Here you can select the branch to run the benchmark from, as well as the benchmark parameters. Click Run workflow to start the benchmark run.

  5. After the workflow finishes, you can download the report and plots from Artifacts.

benchmark-report contains a Markdown file with solver leaderboards for each problem. Basic information on the benchmark setup is also available.

benchmark-plots is a set of optimization history plots for each solved problem. Similarly to kurobako, each plot shows the objective value as a function of finished trials. For each problem, the average and the median taken over n_runs are shown. If the Include warm-up steps in plots checkbox was not selected in the workflow config, the first 10 trials will be excluded from the visualizations.

See this doc for more information on bayesmark scoring.

How to run locally

CI runs benchmarks on all model/dataset combinations in parallel; however, running the benchmark on a single problem locally is also possible. To do this, first install the required Python packages.

pip install bayesmark matplotlib numpy scipy pandas Jinja2

A benchmark run can be started with

% python benchmarks/run_bayesmark.py \
--dataset iris \
--model kNN \
--budget 80 \
--repeat 10 \
--sampler-list "TPESampler CmaEsSampler" \
--sampler-kwargs-list "{\"multivariate\":true,\"constant_liar\":true} {}" \
--pruner-list "NopPruner" \
--pruner-kwargs-list "{}"

Allowed models are [kNN, SVM, DT, RF, MLP-sgd, ada, linear] and allowed datasets are [breast, digits, iris, wine, diabetes]. For more details on the default parameters, please refer to benchmarks/run_bayesmark.py. After the benchmark has completed, a Markdown report can be generated by running

% python benchmarks/bayesmark/report_bayesmark.py

You'll find the benchmark artifacts in the plots and report directories.
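
Any other model/dataset pair from the allowed lists can be benchmarked in the same way; for example, to run the SVM model on the wine dataset with otherwise identical settings:

% python benchmarks/run_bayesmark.py \
--dataset wine \
--model SVM \
--budget 80 \
--repeat 10 \
--sampler-list "TPESampler CmaEsSampler" \
--sampler-kwargs-list "{\"multivariate\":true,\"constant_liar\":true} {}" \
--pruner-list "NopPruner" \
--pruner-kwargs-list "{}"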

Performance Benchmarks with NASLib

This workflow allows you to benchmark optimization algorithms available in Optuna with NASLib. NASLib provides an abstraction over a number of NAS benchmarks; currently, only NAS-Bench-201 is supported. This benchmark can be run on GitHub Actions or locally.

How to run on the GitHub Actions

Please follow the same steps as in Performance Benchmarks with kurobako, except that you need to select Performance benchmarks with NASLib in step 3.

How to Run Locally

In order to run the NASLib benchmarks, you need the following dependencies: NASLib, kurobako, and kurobako-py (plus gnuplot for visualization, as in the kurobako setup above).

Please see each project's page for detailed instructions. In short, NASLib can be installed by cloning the NASLib repository from GitHub, downloading all the data files under NASLib/naslib/data/, and running

$ pip3 install -e .
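
Put together, the installation roughly looks like the following sketch (the clone URL assumes the upstream automl/NASLib repository, and the data files must be obtained as described in the NASLib documentation):

$ git clone https://github.com/automl/NASLib.git
$ cd NASLib
# Place the downloaded benchmark data files under naslib/data/ before installing.
$ pip3 install -e .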

You also need to set up the kurobako command in the same way as described above. After this, kurobako-py can be installed with

$ pip3 install kurobako

Finally, you can run benchmarks/run_naslib.py.

$ python3 benchmarks/run_naslib.py \
--path-to-kurobako "" \
--name "performance-benchmarks" \
--n-runs 10 \
--n-jobs 10 \
--sampler-list "RandomSampler TPESampler" \
--sampler-kwargs-list "{} {}" \
--pruner-list "NopPruner" \
--pruner-kwargs-list "{}" \
--seed 0 \
--out-dir "out"

Please see benchmarks/run_naslib.py to check the arguments and their default values.
