Universal adversarial perturbations Seyed-Mohsen Moosavi...

Universal adversarial perturbationsSeyed-Mohsen Moosavi-Dezfooli*, Alhussein Fawzi+, Omar Fawzi†, Pascal Frossard*

*EPFL, +UCLA,†ENS-LyonLTS

4

Round number0 1 2 3

Fooling percentage

93

76 80 79

cashmachine

pillow

computerkeyboard

diningtable

envelopemedicinechest

microwave

mosquitonet

pencilbox

platerack

quilt

refrigerator

television

tray

wardrobe

windowshade

CaffeNet GoogLeNet

VGG-16 VGG-19

VGG-F

ResNet-152

Random

Adv. pert. (DF)

Adv. pert. (FGS)

Sum of adv.

ImageNet bias

Universal

Fooling percentage of the perturbations with the same norm

8545

4034

132

Robustness of deep networksFor image classifiers, achieving robustness to perturbations is a crucial re-quirement

The algorithmFinding universal perturbation using the set X

Misclassification rateHigh misclassification rate on unseen test data

What is special in these perturbations?Comparison of universal perturbation to other types of perturbations

Doubly universal perturbationsThe ability to generalize well across different neural networks

Universal adversarial perturbationA single perturbation causing misclassification with high probability

Feedbacking does not help!Augmenting the training data with perturbed images

Explaining universal perturbationsThe decision boundary in the vicinity of natural images is highly correlated!

Perturbed labelsExistence of dominant labels that suck most other labels.

New theoretical analysis:Check our new work: “Analysis of universal adversarial perturbations”

Random noise achieving 90% fooling rate

E.g. adversarial perturbations causing misclassification

High volume classification regions!

... because universal pertur-bations are diverse!

Whale Turtle

93%

78%

94%

78%

81%

85%

BallonJoystick Flag pole

Face powder

Labrador

LabradorChihuahua

Chihuahua1 50’000

Plot of singular values

Random vectors

Normals of the decision boundary

“Universal adversarial perturbations”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, Honolulu, US.

github.com/lts4/universal

Lampshade

PerturbationClassifier Lampshade?k̂

Algorithm Computation of universal perturbations.

1: input: Data points X , classifier k̂, desired �p norm ofthe perturbation ξ, desired accuracy on perturbed sam-ples δ.

2: output: Universal perturbation vector v.3: Initialize v ← 0.4: while Err (Xv) ≤ 1− δ do5: for each datapointxi ∈ X do6: if k̂(xi + v) = k̂(xi) then7: Compute the minimal perturbation that

sendsxi + v to the decision boundary:

∆vi ← argminr

‖r‖2 s.t. k̂(xi + v + r) �= k̂(xi).

8: Update the perturbation:

v ← Pp,ξ(v +∆vi).

9: end if10: end for11: end while

x1,2,3

R1

R2

R3

x1,2,3

R1

R2

R3

∆v 1

x1,2,3

R1

R2

R3∆v 1

x1,2,3

R1

R2

v∆v 2

R3

Cross-model fooling percentage for VGG-16

VGG-19

73

VGG-F

63

ResNet-152

63

GoogLeNet

56

CaffeNet

56

Round 0 Round 1

Round 2 Round 3

Universal adversarial perturbations Seyed-Mohsen Moosavi...

Documents

Transcript of Universal adversarial perturbations Seyed-Mohsen Moosavi...