SkillAgentSearch skills...

Robustbench

RobustBench: a standardized adversarial robustness benchmark [NeurIPS 2021 Benchmarks and Datasets Track]

Install / Use

/learn @RobustBench/Robustbench
About this skill

Quality Score

0/100

Supported Platforms

Zed

README

RobustBench: a standardized adversarial robustness benchmark

Francesco Croce* (University of Tübingen), Maksym Andriushchenko* (EPFL), Vikash Sehwag* (Princeton University), Edoardo Debenedetti* (EPFL), Nicolas Flammarion (EPFL), Mung Chiang (Purdue University), Prateek Mittal (Princeton University), Matthias Hein (University of Tübingen)

Leaderboard: https://robustbench.github.io/

Paper: https://arxiv.org/abs/2010.09670

❗Note❗: if you experience problems with the automatic downloading of the models from Google Drive, install the latest version of RobustBench via pip install git+https://github.com/RobustBench/robustbench.git.

<p align="center"><img src="images/leaderboard_screenshot_linf.png" width="700"> <p align="center"><img src="images/leaderboard_screenshot_l2.png" width="700"> <p align="center"><img src="images/leaderboard_screenshot_corruptions.png" width="700">

News

  • May 2022: We have extended the common corruptions leaderboard on ImageNet with 3D Common Corruptions (ImageNet-3DCC). ImageNet-3DCC evaluation is interesting since (1) it includes more realistic corruptions and (2) it can be used to assess generalization of the existing models which may have overfitted to ImageNet-C. For a quickstart, click here. Note that the entries in leaderboard are still sorted according to ImageNet-C performance.

  • May 2022: We fixed the preprocessing issue for ImageNet corruption evaluations: previously we used resize to 256x256 and central crop to 224x224 which wasn't necessary since the ImageNet-C images are already 224x224 (see this issue). Note that this changed the ranking between the top-1 and top-2 entries.

Main idea

The goal of RobustBench is to systematically track the real progress in adversarial robustness. There are already more than 3'000 papers on this topic, but it is still often unclear which approaches really work and which only lead to overestimated robustness. We start from benchmarking the Linf, L2, and common corruption robustness since these are the most studied settings in the literature.

Evaluation of the robustness to Lp perturbations in general is not straightforward and requires adaptive attacks (Tramer et al., (2020)). Thus, in order to establish a reliable standardized benchmark, we need to impose some restrictions on the defenses we consider. In particular, we accept only defenses that are (1) have in general non-zero gradients wrt the inputs, (2) have a fully deterministic forward pass (i.e. no randomness) that (3) does not have an optimization loop. Often, defenses that violate these 3 principles only make gradient-based attacks harder but do not substantially improve robustness (Carlini et al., (2019)) except those that can present concrete provable guarantees (e.g. Cohen et al., (2019)).

To prevent potential overadaptation of new defenses to AutoAttack, we also welcome external evaluations based on adaptive attacks, especially where AutoAttack flags a potential overestimation of robustness. For each model, we are interested in the best known robust accuracy and see AutoAttack and adaptive attacks as complementary to each other.

RobustBench consists of two parts:

  • a website https://robustbench.github.io/ with the leaderboard based on many recent papers (plots below 👇)
  • a collection of the most robust models, Model Zoo, which are easy to use for any downstream application (see the tutorial below after FAQ 👇)
<!-- <p align="center"><img src="images/aa_robustness_vs_venues.png" height="275"> <img src="images/aa_robustness_vs_years.png" height="275"></p> --> <!-- <p align="center"><img src="images/aa_robustness_vs_reported.png" height="260"> <img src="images/aa_robustness_vs_standard.png" height="260"></p> --> <p align="center"><img src="images/plots_analysis_jsons.png" width="800"></p>

FAQ

Q: How does the RobustBench leaderboard differ from the AutoAttack leaderboard? 🤔
A: The AutoAttack leaderboard was the starting point of RobustBench. Now only the RobustBench leaderboard is actively maintained.

Q: How does the RobustBench leaderboard differ from robust-ml.org? 🤔
A: robust-ml.org focuses on adaptive evaluations, but we provide a standardized benchmark. Adaptive evaluations have been very useful (e.g., see Tramer et al., 2020) but they are also very time-consuming and not standardized by definition. Instead, we argue that one can estimate robustness accurately mostly without adaptive attacks but for this one has to introduce some restrictions on the considered models. However, we do welcome adaptive evaluations and we are always interested in showing the best known robust accuracy.

Q: How is it related to libraries like foolbox / cleverhans / advertorch? 🤔
A: These libraries provide implementations of different attacks. Besides the standardized benchmark, RobustBench additionally provides a repository of the most robust models. So you can start using the robust models in one line of code (see the tutorial below 👇).

Q: Why is Lp-robustness still interesting? 🤔
A: There are numerous interesting applications of Lp-robustness that span transfer learning (Salman et al. (2020), Utrera et al. (2020)), interpretability (Tsipras et al. (2018), Kaur et al. (2019), Engstrom et al. (2019)), security (Tramèr et al. (2018), Saadatpanah et al. (2019)), generalization (Xie et al. (2019), Zhu et al. (2019), Bochkovskiy et al. (2020)), robustness to unseen perturbations (Xie et al. (2019), Kang et al. (2019)), stabilization of GAN training (Zhong et al. (2020)).

Q: What about verified adversarial robustness? 🤔
A: We mostly focus on defenses which improve empirical robustness, given the lack of clarity regarding which approaches really improve robustness and which only make some particular attacks unsuccessful. However, we do not restrict submissions of verifiably robust models (e.g., we have Zhang et al. (2019) in our CIFAR-10 Linf leaderboard). For methods targeting verified robustness, we encourage the readers to check out Salman et al. (2019) and Li et al. (2020).

Q: What if I have a better attack than the one used in this benchmark? 🤔
A: We will be happy to add a better attack or any adaptive evaluation that would complement our default standardized attacks.

Model Zoo: quick tour

The goal of our Model Zoo is to simplify the usage of robust models as much as possible. Check out our Colab notebook here 👉 RobustBench: quick start for a quick introduction. It is also summarized below 👇.

First, install the latest version of RobustBench (recommended):

pip install git+https://github.com/RobustBench/robustbench.git

or the latest stable version of RobustBench (it is possible that automatic downloading of the models may not work):

pip install git+https://github.com/RobustBench/robustbench.git@v1.0

Now let's try to load CIFAR-10 and some quite robust CIFAR-10 models from Carmon2019Unlabeled that achieves 59.53% robust accuracy evaluated with AA under eps=8/255:

from robustbench.data import load_cifar10

x_test, y_test = load_cifar10(n_examples=50)

from robustbench.utils import load_model

model = load_model(model_name='Carmon2019Unlabeled', dataset='cifar10', threat_model='Linf')

Let's try to evaluate the robustness of this model. We can use any favourite library for this. For example, FoolBox implements many different attacks. We can start from a simple PGD attack:

!pip install -q foolbox
import foolbox as fb
fmodel = fb.PyTorchModel(model, bounds=(0, 1))

_, advs, success = fb.attacks.LinfPGD()(fmodel, x_test.to('cuda:0'), y_test.to('cuda:0'), epsilons=[8/255])
print('Robust accuracy: {:.1%}'.format(1 - success.float().mean()))
>>> Robust accuracy: 58.0%

Wonderful! Can we do better with a more accurate attack?

Let's try to evaluate its robustness with a cheap version AutoAttack from ICML 2020 with 2/4 attacks (only APGD-CE and APGD-DLR):

# autoattack is installed as a dependency of robustbench so there is not need to install it separately
from autoattack import AutoAttack
adversary = AutoAttack(model, norm='Linf', eps=8/255, version='custom', attacks_to_run=['apgd-ce', 'apgd-dlr'])
adversary.apgd.n_restarts = 1
x_adv = adversary.run_standard_evaluation(x_test, y_
View on GitHub
GitHub Stars772
CategoryEducation
Updated8h ago
Forks102

Languages

Python

Security Score

85/100

Audited on Mar 27, 2026

No findings