Václav Voráček presents Some (Optimization) Problems in Certifiable Adversarial Robustness

On 2024-06-11 11:00 at G205, Karlovo náměstí 13, Praha 2

(Deep) learning systems are notoriously vulnerable to small adversarial
perturbations. There are ways to construct provably robust classifiers, and we
will review some of them, in particular randomized smoothing and nearest
prototype classifiers. The talk will be based on the following three papers and
the focus will be on the underlying optimization problems.

https://proceedings.mlr.press/v162/voracek22a
Nearest prototype classifiers (NPCs) assign to each input point the label of
the nearest prototype with respect to a chosen distance metric. A direct
advantage of NPCs is that the decisions are interpretable. We provide a
complete characterization of the complexity when using ℓp distances for
decision and ℓq threat models for certification, for p, q ∈ {1, 2, ∞}.
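
As a rough illustration of what certification means for an NPC, the sketch
below handles only the simplest case (ℓ2 distance for decisions, ℓ2 threat
model) using the textbook bisecting-hyperplane argument. It is not the
algorithm from the paper; the function names and the toy prototypes are
assumptions made for this example. The value returned is a lower bound on the
minimal adversarial perturbation: any input that changes the prediction must
lie at least as close to some other-class prototype as to the currently
nearest prototype.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def npc_predict(x, prototypes, labels):
    """Return (label, index) of the prototype nearest to x in l2 distance."""
    idx = min(range(len(prototypes)), key=lambda i: l2(x, prototypes[i]))
    return labels[idx], idx

def l2_certified_radius(x, prototypes, labels):
    """Lower bound on the l2 norm of any perturbation that changes the label.

    For the nearest prototype w_i and each other-class prototype w_j, the
    distance from x to the half-space where w_j is at least as close as w_i
    is (|x - w_j|^2 - |x - w_i|^2) / (2 |w_i - w_j|). The minimum over j is
    a valid (not necessarily tight) certificate.
    """
    label, i = npc_predict(x, prototypes, labels)
    d_i = l2(x, prototypes[i])
    radius = math.inf
    for j, w_j in enumerate(prototypes):
        if labels[j] == label:
            continue  # same-class prototypes cannot change the decision
        gap = l2(prototypes[i], w_j)
        if gap == 0.0:
            return label, 0.0  # coinciding prototypes with different labels
        d_j = l2(x, w_j)
        radius = min(radius, (d_j ** 2 - d_i ** 2) / (2.0 * gap))
    return label, radius

# Toy usage with hypothetical prototypes: the decision boundary is the
# line x = 2, so the point (1, 0) is at distance 1 from it.
label, radius = l2_certified_radius((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)], ["a", "b"])
```

With several prototypes per class the bound can be loose, since moving away
from the nearest prototype may bring another same-class prototype into play.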

https://proceedings.mlr.press/v202/voracek23a
Randomized smoothing is a popular method to certify robustness of image
classifiers to adversarial input perturbations. It is the only certification
technique which scales directly to datasets of higher dimension such as
ImageNet. We derive new certification formulae which lead to significant
improvements in the certified ℓ1-robustness.
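
For readers new to the technique, here is a minimal sketch of the standard
Gaussian randomized smoothing certificate for the ℓ2 threat model, where the
certified radius is σ·Φ⁻¹(p) for a lower confidence bound p on the probability
of the top class. It does not implement the new ℓ1 formulae from the paper.
The function name, the sample sizes, and the use of a Hoeffding bound (chosen
here to stay within the standard library) are assumptions of this example.

```python
import math
import random
from statistics import NormalDist

def certify_l2(base_classifier, x, sigma, n0=100, n=1000, alpha=0.001, seed=0):
    """Return (label, radius), or (None, 0.0) when abstaining.

    base_classifier maps a list of floats to a hashable label.
    With probability at least 1 - alpha, the smoothed classifier predicts
    `label` everywhere within l2 distance `radius` of x.
    """
    rng = random.Random(seed)

    def sample_counts(num):
        counts = {}
        for _ in range(num):
            noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
            c = base_classifier(noisy)
            counts[c] = counts.get(c, 0) + 1
        return counts

    # Stage 1: guess the top class from a small, separate batch of samples.
    guess_counts = sample_counts(n0)
    top = max(guess_counts, key=guess_counts.get)

    # Stage 2: estimate the probability of that class on fresh samples.
    k = sample_counts(n).get(top, 0)

    # One-sided Hoeffding lower confidence bound on P(classifier(x + noise) = top).
    p_lower = k / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    if p_lower <= 0.5:
        return None, 0.0  # not enough evidence for a certificate

    return top, sigma * NormalDist().inv_cdf(p_lower)

# Toy usage: a hypothetical 1-D classifier that thresholds at zero.
label, radius = certify_l2(lambda z: int(z[0] > 0.0), [1.0], sigma=0.25)
```

A tighter confidence interval such as Clopper-Pearson would give larger radii
for the same number of samples; Hoeffding is used only to keep the sketch
dependency-free.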

https://openreview.net/forum?id=HaHCoGcpV9
Randomized smoothing is sound when using infinite precision. However, we show
that randomized smoothing is no longer sound for limited floating-point
precision and show how this can be abused to give false certificates. We
discuss the implicit assumptions of randomized smoothing and show that they
do not apply to generic image classification models whose smoothed versions
are commonly
certified. In order to overcome this problem, we propose a sound approach to
randomized smoothing when using floating-point precision.
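
To give a feel for why finite precision matters, the snippet below shows that
floating-point addition can silently discard the added noise, so a computed
"x + noise" need not behave like an exact shift of x. This is only a generic
illustration of rounding in IEEE 754 double precision, not the construction
used in the paper; the helper name `absorbed` is hypothetical.

```python
import math

def absorbed(x, noise):
    """True if adding `noise` to `x` leaves x unchanged after rounding."""
    return x + noise == x

# Spacing between adjacent doubles near 1e16 is 2.0, so adding 1.0 is lost.
spacing = math.ulp(1e16)
lost = absorbed(1e16, 1.0)

# Near 1.0 the same noise is of course not absorbed.
kept = not absorbed(1.0, 1.0)
```

Here `spacing` is 2.0 and both `lost` and `kept` are True (`math.ulp` requires
Python 3.9 or later).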