
SECURITY AND PRIVACY OF MACHINE LEARNING

Ian Goodfellow
Staff Research Scientist, Google Brain
@goodfellow_ian


Machine Learning and Security

[Diagram: a small neural network mapping inputs x1, x2 through hidden units h1, h2 (weights W, w) to an output y]

Machine Learning for Security

Malware detection, intrusion detection, …

Security against Machine Learning


Password guessing, fake reviews, …


Security of Machine Learning

[Diagram: the same neural network, now itself the system under attack]


An overview of a field


This presentation summarizes the work of many people, not just my own and my collaborators'.

Download the slides for links to extensive references.

The presentation focuses on the concepts, not the history or the inventors


Machine Learning Pipeline

[Diagram: training data X → learning algorithm → learned parameters θ; at test time, a test input x is mapped through θ to a test output y]
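To make the stages concrete, a minimal sketch (the library and toy model are my choices, not the talk's):

```python
# Each pipeline stage from the slide, in code: training data (X, y) ->
# learning algorithm -> learned parameters theta -> test input -> output.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                 # training data
y = (X[:, 0] + X[:, 1] > 0).astype(int)       # training labels

model = LogisticRegression().fit(X, y)        # learning algorithm
theta = (model.coef_, model.intercept_)       # learned parameters

x_test = np.array([[0.5, -0.2]])              # test input
print(model.predict(x_test))                  # test output
```

Each attack surface in the rest of the talk targets one of these stages.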


Privacy of Training Data

[Diagram: an attacker who can observe the learned parameters θ recovers information about the training data X]


Defining (ε, δ)-Differential Privacy


(Abadi 2017)
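The definition itself (the standard one from the differential-privacy literature, which is presumably what this slide's figure showed): a randomized mechanism M is (ε, δ)-differentially private if, for every pair of datasets d, d′ differing in a single record and every set S of possible outputs,

Pr[M(d) ∈ S] ≤ e^ε · Pr[M(d′) ∈ S] + δ

Smaller ε means no single record can change the output distribution by much; δ is the small probability with which the ε guarantee is allowed to fail.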


Private Aggregation of Teacher Ensembles


(Papernot et al 2016)
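A minimal sketch of the aggregation step at the heart of PATE: teachers trained on disjoint partitions of the sensitive data each vote for a label, and the student only ever sees the noisy plurality vote. The noise scale below is an illustrative choice, not a value from the talk:

```python
# PATE's noisy-argmax aggregation (Papernot et al 2016): the student is
# trained only on these noisy votes, which is what yields the differential
# privacy guarantee. gamma is an illustrative privacy parameter.
import numpy as np

def noisy_max(teacher_preds, num_classes, gamma=0.1):
    votes = np.bincount(teacher_preds, minlength=num_classes)
    noisy_votes = votes + np.random.laplace(scale=1.0 / gamma, size=num_classes)
    return int(np.argmax(noisy_votes))

teacher_preds = np.random.randint(0, 10, size=250)  # 250 teachers, 10 classes
student_label = noisy_max(teacher_preds, num_classes=10)
```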


Training Set Poisoning

[Diagram: the attacker tampers with the training data X, corrupting the learned parameters θ and thereby the output y for a test input x]
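The simplest possible instance of this threat model is label flipping; a sketch (the ImageNet attack on the next slide is far subtler, perturbing pixels rather than labels):

```python
# Label-flipping poisoning: an attacker who controls a fraction of the
# training labels degrades the learned model. Data and model are toy choices.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(int)

poisoned = y.copy()
idx = rng.choice(len(y), size=40, replace=False)  # attacker controls 20%
poisoned[idx] = 1 - poisoned[idx]                 # flip those labels

clean = LogisticRegression().fit(X, y)
dirty = LogisticRegression().fit(X, poisoned)
print(clean.score(X, y), dirty.score(X, y))       # accuracy drops
```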


ImageNet Poisoning


(Koh and Liang 2017)


Adversarial Examples

[Diagram: the training data X and parameters θ are untouched; the attacker perturbs the test input x to control the test output y]


Model Theft

[Diagram: the attacker sends test inputs x, observes the outputs y, and uses the pairs to reconstruct a copy of the learned parameters θ]
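A minimal sketch of theft through a prediction API; the attacker never sees the private training data, only query answers (data and models are toy choices):

```python
# Model theft by query access: label self-chosen inputs with the victim's
# API, then fit a substitute to the (query, answer) pairs.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

X_priv = rng.normal(size=(500, 2))                   # victim's private data
y_priv = (X_priv @ np.array([2.0, -1.0]) > 0).astype(int)
victim = LogisticRegression().fit(X_priv, y_priv)

queries = rng.normal(size=(1000, 2))                 # attacker-chosen inputs
answers = victim.predict(queries)                    # victim's API responses
stolen = LogisticRegression().fit(queries, answers)  # the substitute

print((stolen.predict(queries) == answers).mean())   # agreement with victim
```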


Model Theft++

[Diagram: after reconstructing a copy of θ from input–output pairs, the attacker reuses it to mount the other attacks, e.g. recovering training data X or crafting adversarial inputs x]


Deep Dive on Adversarial Examples

Since 2013, deep neural networks have matched human performance at...

...solving CAPTCHAs and reading addresses (Goodfellow et al, 2013)...

...recognizing objects (Szegedy et al, 2014) and faces (Taigman et al, 2013)...

...and other tasks...


Adversarial Examples



Turning objects into airplanes



Attacking a linear model

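The figure on this slide (not preserved in the transcript) illustrates that for a linear score w·x + b, the worst-case perturbation under a max-norm budget ε is ε·sign(w): the score moves by ε·‖w‖₁, which grows with dimension even though each coordinate changes imperceptibly. A sketch, with toy data and an illustrative ε:

```python
# Attacking a linear model with the fast gradient sign method's linear
# special case: perturb each coordinate by eps in the direction that
# moves the score against the current prediction.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X @ rng.normal(size=10) > 0).astype(int)
model = LogisticRegression().fit(X, y)

w = model.coef_[0]
x = X[0]
eps = 0.25                                      # illustrative budget
sign = -1 if model.predict([x])[0] == 1 else 1  # push against current class
x_adv = x + sign * eps * np.sign(w)
print(model.predict([x])[0], "->", model.predict([x_adv])[0])
```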


Wrong almost everywhere



Cross-model, cross-dataset transfer



Transfer across learning algorithms


(Papernot 2016)


Transfer attack

1. The target model has unknown weights, an unknown machine learning algorithm and training set, and may be non-differentiable.

2. Train your own substitute model mimicking the target: a known, differentiable function.

3. Craft adversarial examples against the substitute.

4. Deploy the adversarial examples against the target; the transferability property results in them succeeding (sketched below).
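An end-to-end sketch of the recipe, with an SVM standing in for the non-differentiable black box and a logistic regression as the substitute (all choices illustrative):

```python
# Transfer attack: craft the perturbation on a white-box substitute,
# then deploy it against the black-box target.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 20))
y = (X @ rng.normal(size=20) > 0).astype(int)

target = SVC().fit(X, y)                     # black box, non-differentiable
substitute = LogisticRegression().fit(X, target.predict(X))  # mimics target

x = X[0]
eps = 0.5                                    # illustrative budget
w = substitute.coef_[0]
sign = -1 if substitute.predict([x])[0] == 1 else 1
x_adv = x + sign * eps * np.sign(w)          # crafted on the substitute only

print(target.predict([x])[0], "->", target.predict([x_adv])[0])
```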


Enhancing Transfer with Ensembles


(Liu et al, 2016)


Transfer to the Human Brain


(Elsayed et al, 2018)


Transfer to the Physical World


(Kurakin et al, 2016)


Adversarial Training

[Plot: test misclassification rate (log scale, 10⁻² to 10⁰) versus training time in epochs (0 to 300), for four curves: Train=Clean/Test=Clean, Train=Clean/Test=Adv, Train=Adv/Test=Clean, Train=Adv/Test=Adv]


Adversarial Training vs Certified Defenses


Adversarial training: train on adversarial examples. This minimizes a lower bound on the true worst-case error, and achieves a high degree of (empirically tested) robustness on small to medium datasets (a training-loop sketch follows after this comparison).

Certified defenses: minimize an upper bound on the true worst-case error. Robustness is guaranteed, but the amount of robustness is small, and verifying models that weren't trained to be easy to verify is hard.
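A minimal sketch of the adversarial training loop, using single-step FGSM as the inner attack for brevity (multi-step attacks are common in practice) and hand-rolled logistic regression so the gradients are explicit:

```python
# Adversarial training: at each step, perturb the batch adversarially
# (inner maximization), then take a gradient step on the perturbed batch
# (outer minimization). All hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = (X @ rng.normal(size=20) > 0).astype(float)

w = np.zeros(20)
eps, lr = 0.1, 0.1
for step in range(500):
    p = 1 / (1 + np.exp(-X @ w))
    grad_x = np.outer(p - y, w)            # d(loss)/dx for logistic loss
    X_adv = X + eps * np.sign(grad_x)      # FGSM perturbation of the batch
    p_adv = 1 / (1 + np.exp(-X_adv @ w))
    grad_w = X_adv.T @ (p_adv - y) / len(y)
    w -= lr * grad_w                       # descend on adversarial examples
```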


Limitations of defenses


Even certified defenses so far assume an unrealistic threat model.

The typical model: the attacker can change the input within some norm ball.

Real attacks will be stranger and hard to characterize ahead of time (Brown et al., 2017).


Clever Hans


(“Clever Hans, Clever Algorithms,” Bob Sturm)


Get involved!


https://github.com/tensorflow/cleverhans
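For orientation, a hedged sketch of invoking the library's fast gradient method; the import path matches the CleverHans 4.x (TF2) layout as I recall it and may differ across versions, and the model and batch are stand-ins:

```python
# Calling CleverHans' FGSM attack (import path from the 4.x TF2 layout;
# check your installed version). `model` is any Keras model returning logits.
import numpy as np
import tensorflow as tf
from cleverhans.tf2.attacks.fast_gradient_method import fast_gradient_method

model = tf.keras.Sequential([tf.keras.layers.Dense(10)])  # stand-in model
x = tf.random.normal((8, 32))                             # stand-in batch

x_adv = fast_gradient_method(model, x, eps=0.3, norm=np.inf)
```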


Apply What You Have Learned


Publishing an ML model or a prediction API? If the training data is sensitive, train with differential privacy (a DP-SGD sketch follows below).
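A minimal sketch of the DP-SGD recipe behind that advice (per-example gradient clipping plus Gaussian noise, in the style of Abadi et al's work); clip_norm and noise_mult are illustrative, and a real deployment would use a maintained library such as TensorFlow Privacy plus a privacy accountant to track (ε, δ):

```python
# One DP-SGD step for logistic regression: clip each example's gradient,
# sum, add Gaussian noise, then descend. Hyperparameters are illustrative.
import numpy as np

def dp_sgd_step(w, X_batch, y_batch, lr=0.1, clip_norm=1.0, noise_mult=1.1):
    grads = []
    for x, y in zip(X_batch, y_batch):
        p = 1 / (1 + np.exp(-x @ w))                 # per-example prediction
        g = (p - y) * x                              # per-example gradient
        g *= min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))  # clip
        grads.append(g)
    noise = np.random.normal(scale=noise_mult * clip_norm, size=w.shape)
    g_private = (np.sum(grads, axis=0) + noise) / len(X_batch)
    return w - lr * g_private

rng = np.random.default_rng(0)
w = np.zeros(10)
X_batch = rng.normal(size=(32, 10))
y_batch = (X_batch[:, 0] > 0).astype(float)
w = dp_sgd_step(w, X_batch, y_batch)
```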

Consider how an attacker could cause damage by fooling your model. Current defenses are not practical, so rely on situations where there is no incentive to cause harm, or where the amount of potential harm is limited.