Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the...

25
Alexandre Boumezoued, Milliman R&D OICA, 29 April 2020 Joint work with Caroline Hillairet (ENSAE) and Yannick Bessy-Roland (AXA GIE) Predicting Cyber-Attacks using Hawkes Processes

Transcript of Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the...

Page 1: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Alexandre Boumezoued, Milliman R&D

OICA, 29 April 2020

Joint work with Caroline Hillairet (ENSAE) and Yannick Bessy-Roland (AXA GIE)

Predicting Cyber-Attacks using

Hawkes Processes

Page 2: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

2

Agenda

1 Introduction

2 Data breaches dataset

3 Hawkes model

4 Fitting and prediction

5 References

Page 3: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Attack that occurs on the same day a weakness is discovered in software. At that point, it's exploited before a fix becomes available from its creator

The attacker secretly relays and possibly alters the communications between two parties who believe they are directly communicating with each other

Introduction

3

Example of types of cyber attacks

The attacker sends a document

appearing reliable (mainly e-mail) in

order to collect sensitive information

Click to edit Master text styles

Man in the middle

Zero day

Denial of service

MalwarePhishing

• Definition: The risk of an attack on digital data as well as the consequences on the information system

Software that is specifically designed to disrupt, damage, or gain unauthorized access to a computer system

Attack in which the perpetrator seeks to make a machine or network resource unavailable to its intended users by disrupting services of a host connected to the Internet

Page 4: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

4

IntroductionCyber insurance covers

Damage

Crisis

management

Third party

Cyber insurance

Crisis management:

Costs of investigation

Costs of assistance

Other costs of crisis management

Damage:

Data cleaning

Data restauration

Payment of ransom

Operating loss

Third party liability:

Virus transmission

Personal liability insurance

Denial of service

Page 5: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

5

Agenda

1 Introduction

2 Data breaches dataset

3 Hawkes model

4 Fitting and prediction

5 References

Page 6: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Name of the covered entity

Type of the covered entity:Healthcare provider, businesses…

6

Data breaches datasetThe Privacy Rights Clearinghouse database

PRC

Database

Description

A public database that contains 8871 data breaches in the US over the period 2005-2019

https://www.privacyrights.org/data-breaches

Covariates

Those used in the results presented are highlighted in blue

Total records breached

Date Made Public: day/month/year

Type of Breach : « Hacking / IT

incident » , « Theft »

Localization of

the breached

entity: state/city

Page 7: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

7

Data breaches datasetDataset description: types of breaches

Page 8: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

8

Data breaches datasetDataset description: types of targets

Page 9: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

9

A majority of Theft/Loss and Hacking/Malware

21% of Unintented disclosure

A majority in Healthcare/Medical

Businesses are well represented too

Data breaches datasetDescriptive statistics over 2010-2018

Page 10: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Data breaches datasetCyber attacks frequencies by type and organization

10

Apparent clusteringby type of attacks

Deterministic trendsor stochastic regimes?

Apparent clusteringby type of organization attacked

No clear trends

Page 11: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Data breaches datasetAutocorrelation of the number of incidents

11

R-squared : 0.726

Confidence interval (95%)[0.687, 0.766]

R-squared : 0.718

Confidence interval (95%)[0.702, 0.735]

Regression of the number of event during the following month 𝒕 + 𝟏 as a function of the number of event during the current month 𝑡 → should be independent for a Poisson process model to be valid

Autocorrelation dramatically increases when focusing on attacks and/or organizations of the same type

R-squared : 0.154

Confidence interval (95%)[0.030, 0.278]

R-squared : 0.780

Confidence interval (95%)[0.750, 0.810]

Page 12: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

12

Agenda

1 Introduction

2 Data breaches dataset

3 Hawkes model

4 Fitting and prediction

5 References

Page 13: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Hawkes modelChoice of the Hawkes model

Taking into account autocorrelation

Cox model : Poisson model with stochastic intensity → difficulty to specify the stochastic intensity dynamics

Hawkes model : Self-exciting model with stochastic intensity, fully specified by the point process itself

Choice of the Hawkes model:

Self-excitation: every event increases the probability for a new event to occur within a given group (same

organization or attack type)

Clustering: the self-exciting property allows to model cluster effect (groups of attacks – same origin!)

Inter-excitation: in the case of multi-dimensional Hawkes process, every attack in one group increases the

occurrence probability of new events in the other groups

Related references:

Peng et al. (2017), Baldwin et al. (2017)

13

Page 14: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

A Hawkes process with exponential kernel is a counting process 𝑁𝑡 = σ𝑛≥1 1𝑇𝑛≤𝑡 with intensity:

𝜆 𝑡 = 𝜇 𝑡 +

𝑇𝑛<𝑡

𝛼 exp(−𝛽(𝑡 −𝑇𝑛))

𝜇:ℝ+ → ℝ+is a deterministic baseline intensity

The sum represents the impact of past events; it captures the self-excitation property

Hawkes modelUnivariate Hawkes process

14

Each jump represents an attack

Clustering phenomena

Intensity decreases exponentially between jumps

𝜆 𝑡 𝑁𝑡

Page 15: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Multivariate Hawkes process - presentation

Multivariate Hawkes process allows to model interactions between types of entities/attacks/states:

𝑁𝑡1

𝑡≥0, … , 𝑁𝑡

𝐾

𝑡≥0, 𝐾 counting processes with jump times 𝑇𝑛

(1)

𝑛≥1, … , 𝑇𝑛

(𝐾)

𝑛≥1

The intensity process with exponential kernel of the counting process (𝑖) is defined as:

15

Matrix of excitation:

𝛼 =𝛼1,1 𝛼1,2𝛼2,1 𝛼2,2

=0.0 0.990.0 0.90

Group 2 is purely self-excited

Group 1 is fully influenced by Group 2

Hawkes model

𝜆𝑖 𝑡 = 𝜇𝑖(𝑡) +

𝑗=1

𝐾

𝑇𝑛𝑗<𝑡

𝛼𝑖,𝑗 exp −𝛽𝑖,𝑗(𝑡 − 𝑇𝑛𝑗)

Group 1 self-excitation

Impact of Group 2 on Group 1

Impact of Group 𝑗 on Group 𝑖

Page 16: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Hawkes modelMultivariate Hawkes process - kernels

16

« Classical » exponential kernel:𝜙𝑖,𝑗 𝑠 = 𝛼𝑖,𝑗 exp −𝛽𝑖,𝑗𝑠

Instantaneous excitation Complexity: we assume all 𝛽𝑖,𝑗 (excitation

memory of 𝑖 from 𝑗) depend on the groups and are to be calibrated

The intensity process is not Markov (for dimension ≥ 2)

Kernel with delay: 𝜙𝑖,𝑗 𝑠 = 𝛼𝑖,𝑗𝑠 exp −𝛽𝑖𝑠 The intensity process is not Markov (even

in dimension 1) Complexity: we assume all 𝛽𝑖 (excitation

memory of 𝑖 from any other group) depend on the groups and are to be calibrated

Kernels Intensity process

Development of closed-form formulas for the expected number of claims for such multivariate Hawkes processes

Page 17: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

17

Agenda

1 Introduction

2 Data breaches dataset

3 Hawkes model

4 Fitting and prediction

5 References

Page 18: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

18

Fitting and predictionData Grouping

Crossing variables: attack type, sector, state

Retaining groups with more than 200 attacks and remaining in OTHER

Total: six segments

18

Page 19: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

19

Fitting and predictionModel specification

The three Hawkes kernels considered:

Baseline intensity: 𝝁𝒊 𝒕 = 𝝁𝟎,𝒊 + 𝜸𝒊𝒕 to account for trends in the dataset

Possibility to add a Lasso penalty (not discussed here)

𝐿 𝜇, 𝜙 𝑝𝑒𝑛𝑎𝑙𝑖𝑧𝑒𝑑 = 𝐿(𝜇, 𝜙) − 𝝂σ𝟏≤𝒊,𝒋≤𝒅 |𝜶𝒊,𝒋|

Reduce complexity

Improve prediction capacity

19

Kernel 1 (exp) Kernel 2 (exp) Kernel 3 (delay)

𝜙𝑖,𝑗 𝑠 𝛼𝑖,𝑗exp(−𝜷𝒊𝑠) 𝛼𝑖,𝑗exp(−𝜷𝒊,𝒋𝑠) 𝛼𝑖,𝑗𝐬exp(−𝜷𝒊𝑠)

Nb parameters 54 84 54

Page 20: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Likelihood: kernel 3 with delay best fits the data

Adequacy tests (Kolmogorov-Smirnov): adequacy is satisfactory, except for group (4)

20

Fitting and prediction Calibration results

Kernel 1

(exp)

Kernel 2

(exp)

Kernel 3

(delay)

𝜙𝑖,𝑗 𝑠 𝛼𝑖,𝑗exp(−𝜷𝒊𝑠) 𝛼𝑖,𝑗exp(−𝜷𝒊,𝒋𝑠) 𝛼𝑖,𝑗𝐬exp(−𝜷𝒊𝑠)

Nb parameters 54 84 54

-Likelihood (2011-2015) 6513 6172 6153

-Likelihood (2011-2016) 7639 7516 7485

Page 21: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Parameter estimates (kernel 3):

21

Fitting and prediction Calibration analysis

Matrix of ratios between maximal

excitation and baseline intensity:

Same orders of magnitude

Partly captures the historical trend

Captures the baseline non-excited intensity

Captures the major self/externalinteractions

Strong self-excitation in theseMED groups

Strong reciprocalself-excitation of groups (2) and (4)

«Causal» excitation of (2) by (5)

Page 22: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Fitting and prediction Out-of-sample prediction results for 2017 (kernel 3)

22

Simulation based on the thinning algorithm for point processes

Predictions with mean and (0.5%, 99.5%) percentiles

Joint prediction of all groups capturing the causal and asymmetric interactions

Parameter uncertainty can be added

Page 23: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

23

Agenda

1 Introduction

2 Data breaches dataset

3 Hawkes model

4 Fitting and prediction

5 References

Page 24: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

References

Baldwin, A., Gheyas, I., Ioannidis, C., Pym, D., & Williams, J. (2017). Contagion in cyber security attacks. Journal of the

Operational Research Society, 68(7), 780-791.

Böhme, R., & Kataria, G. (2006, June). Models and Measures for Correlation in Cyber-Insurance. In WEIS.

Boumezoued, A. (2016). Population viewpoint on Hawkes processes. Advances in Applied Probability, 48(2), 463-480.

Daley, D. J., & Vere-Jones, D. (2007). An introduction to the theory of point processes: volume II: general theory and

structure. Springer Science & Business Media.

Edwards, B., Hofmeyr, S., & Forrest, S. (2016). Hype and heavy tails: A closer look at data breaches. Journal of

Cybersecurity, 2(1), 3-14.

Hawkes, Alan G. (1971). Spectra of some self-exciting and mutually exciting point processes. Biometrika, 58(1), 83-90.

Hawkes, Alan G, David Oakes. (1974). A cluster process representation of a self-exciting process. J. of Applied Probability

493–503.

Oakes, David. (1975). The Markovian self-exciting process. Journal of Applied Probability 69–77.

Peng, C., Xu, M., Xu, S., & Hu, T. (2017). Modeling and predicting extreme cyber attack rates via marked point processes.

Journal of Applied Statistics, 44(14), 2534-2563.

Xu, M., & Hua, L. (2019). Cybersecurity Insurance: Modeling and Pricing. North American Actuarial Journal, 1-30.

24

Page 25: Predicting Cyber-Attacks using Hawkes Processes · Data breaches dataset Autocorrelation of the number of incidents 11 R-squared : 0.726 Confidence interval (95%) [0.687,0.766] R-squared

Disclaimer

25

This presentation presents information of a general nature. It is not intended to guide or determine any specific individual situation and Milliman recommends that users of this presentation will seek explanation and/or amplification of any part of the presentation that they consider not to be clear. Neither the presenter nor the presenter's employer shall have any responsibility or liability to any person or entity with respect to damages alleged to have been caused directly or indirectly by the content of this presentation. All persons who choose to rely in any way on the contents of this presentation do so entirely at their own risk.The contents of this presentation are confidential and must not be modified, copied, quoted, distributed or shown to any other parties without Milliman's prior written consent. Copyright © Milliman 2020. All rights reserved