Universal adversarial perturbations Seyed-Mohsen Moosavi...

1
Universal adversarial perturbations Seyed-Mohsen Moosavi-Dezfooli*, Alhussein Fawzi + , Omar Fawzi , Pascal Frossard* *EPFL, + UCLA, ENS-Lyon LTS 4 Round number 0 1 2 3 Fooling percentage 93 76 80 79 cash machine pillow computer keyboard dining table envelope medicine chest microwave mosquito net pencil box plate rack quilt refrigerator television tray wardrobe window shade CaffeNet GoogLeNet VGG-16 VGG-19 VGG-F ResNet-152 Random Adv. pert. (DF) Adv. pert. (FGS) Sum of adv. ImageNet bias Universal Fooling percentage of the perturbations with the same norm 85 45 40 34 13 2 Robustness of deep networks For image classifiers, achieving robustness to perturbations is a crucial re- quirement The algorithm Finding universal perturbation using the set X Misclassification rate High misclassification rate on unseen test data What is special in these perturbations? Comparison of universal perturbation to other types of perturbations Doubly universal perturbations The ability to generalize well across different neural networks Universal adversarial perturbation A single perturbation causing misclassification with high probability Feedbacking does not help! Augmenting the training data with perturbed images Explaining universal perturbations The decision boundary in the vicinity of natural images is highly correlated! Perturbed labels Existence of dominant labels that suck most other labels. New theoretical analysis: Check our new work: “Analysis of universal adversarial perturbations” Random noise achieving 90% fooling rate E.g. adversarial perturbations causing misclassification High volume classification regions! ... because universal pertur- bations are diverse! Whale Turtle 93% 78% 94% 78% 81% 85% Ballon Joystick Flag pole Face powder Labrador Labrador Chihuahua Chihuahua 1 50’000 Plot of singular values Random vectors Normals of the decision boundary “Universal adversarial perturbations”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, Honolulu, US. github.com/lts4/universal Lampshade Perturbation Classifier Lampshade? ˆ k Algorithm Computation of universal perturbations. 1: input: Data points X , classifier ˆ k , desired p norm of the perturbation ξ , desired accuracy on perturbed sam- ples δ . 2: output: Universal perturbation vector v . 3: Initialize v 0. 4: while Err (X v ) 1 - δ do 5: for each datapoint x i X do 6: if ˆ k (x i + v )= ˆ k (x i ) then 7: Compute the minimal perturbation that sends x i + v to the decision boundary: Δv i arg min r r 2 s.t. ˆ k (x i + v + r ) = ˆ k (x i ). 8: Update the perturbation: v ←P p,ξ (v v i ). 9: end if 10: end for 11: end while x 1,2,3 R 1 R 2 R 3 x 1,2,3 R 1 R 2 R 3 Δ v 1 x 1,2,3 R 1 R 2 R 3 Δ v 1 x 1,2,3 R 1 R 2 v Δ v 2 R 3 Cross-model fooling percentage for VGG-16 VGG-19 73 VGG-F 63 ResNet-152 63 GoogLeNet 56 CaffeNet 56 Round 0 Round 1 Round 2 Round 3

Transcript of Universal adversarial perturbations Seyed-Mohsen Moosavi...

Page 1: Universal adversarial perturbations Seyed-Mohsen Moosavi ...openaccess.thecvf.com/content_cvpr_2017/poster/649_POSTER.pdf · Seyed-Mohsen Moosavi-Dezfooli*, Alhussein Fawzi+, Omar

Universal adversarial perturbationsSeyed-Mohsen Moosavi-Dezfooli*, Alhussein Fawzi+, Omar Fawzi†, Pascal Frossard*

*EPFL, +UCLA,†ENS-LyonLTS

4

Round number0 1 2 3

Fooling percentage

93

76 80 79

cashmachine

pillow

computerkeyboard

diningtable

envelopemedicinechest

microwave

mosquitonet

pencilbox

platerack

quilt

refrigerator

television

tray

wardrobe

windowshade

CaffeNet GoogLeNet

VGG-16 VGG-19

VGG-F

ResNet-152

Random

Adv. pert. (DF)

Adv. pert. (FGS)

Sum of adv.

ImageNet bias

Universal

Fooling percentage of the perturbations with the same norm

8545

4034

132

Robustness of deep networksFor image classifiers, achieving robustness to perturbations is a crucial re-quirement

The algorithmFinding universal perturbation using the set X

Misclassification rateHigh misclassification rate on unseen test data

What is special in these perturbations?Comparison of universal perturbation to other types of perturbations

Doubly universal perturbationsThe ability to generalize well across different neural networks

Universal adversarial perturbationA single perturbation causing misclassification with high probability

Feedbacking does not help!Augmenting the training data with perturbed images

Explaining universal perturbationsThe decision boundary in the vicinity of natural images is highly correlated!

Perturbed labelsExistence of dominant labels that suck most other labels.

New theoretical analysis:Check our new work: “Analysis of universal adversarial perturbations”

Random noise achieving 90% fooling rate

E.g. adversarial perturbations causing misclassification

High volume classification regions!

... because universal pertur-bations are diverse!

Whale Turtle

93%

78%

94%

78%

81%

85%

BallonJoystick Flag pole

Face powder

Labrador

LabradorChihuahua

Chihuahua1 50’000

Plot of singular values

Random vectors

Normals of the decision boundary

“Universal adversarial perturbations”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, Honolulu, US.

github.com/lts4/universal

Lampshade

PerturbationClassifier Lampshade?k̂

Algorithm Computation of universal perturbations.

1: input: Data points X , classifier k̂, desired �p norm ofthe perturbation ξ, desired accuracy on perturbed sam-ples δ.

2: output: Universal perturbation vector v.3: Initialize v ← 0.4: while Err (Xv) ≤ 1− δ do5: for each datapointxi ∈ X do6: if k̂(xi + v) = k̂(xi) then7: Compute the minimal perturbation that

sendsxi + v to the decision boundary:

∆vi ← argminr

‖r‖2 s.t. k̂(xi + v + r) �= k̂(xi).

8: Update the perturbation:

v ← Pp,ξ(v +∆vi).

9: end if10: end for11: end while

x1,2,3

R1

R2

R3

x1,2,3

R1

R2

R3

∆v 1

x1,2,3

R1

R2

R3∆v 1

x1,2,3

R1

R2

v∆v 2

R3

Cross-model fooling percentage for VGG-16

VGG-19

73

VGG-F

63

ResNet-152

63

GoogLeNet

56

CaffeNet

56

Round 0 Round 1

Round 2 Round 3