Anomaly Detection in Streaming Graphs

Joint work with Christos Faloutsos, Sudipto Guha and Nina Mishra (initially presented at SIGKDD 2018)

CyLab Partners Conference, September 25, 2019



CYLAB PARTNERS CONFERENCE 2019

ANOMALY DETECTION IN STREAMING GRAPHS

ESWARAN ET AL., KDD 2018

INTRODUCTION

Graphs are being created everywhere

[Figure: an interaction between "You" and "Alice", timestamped 25 Sep 2019, 2.20pm]


Many other settings…

‣ IM/e-mail networks
‣ Computer networks
‣ Transportation networks
‣ Edit networks


As a sequence of graph snapshots

[Figure: graph snapshots along a time axis. MORNINGS: Monday AM, Tuesday AM, Wednesday AM. NIGHTS: Monday PM, Tuesday PM.]
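One way to read the snapshot view is to bucket a timestamped edge stream into fixed-length windows and sum the edge weights inside each window. The following is a minimal sketch under assumptions that are not in the slides: the 12-hour window, the `to_snapshots` helper and the example edges are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def to_snapshots(edge_stream, start, window):
    """Group (timestamp, src, dst, weight) edges into consecutive windows.

    Returns {window_index: {(src, dst): total_weight}}.
    """
    snapshots = defaultdict(lambda: defaultdict(float))
    for ts, src, dst, weight in edge_stream:
        index = (ts - start) // window  # timedelta // timedelta gives an int
        snapshots[index][(src, dst)] += weight
    return snapshots


# Hypothetical stream with half-day windows starting Monday 00:00.
start = datetime(2019, 9, 23)
stream = [
    (datetime(2019, 9, 23, 9, 0), "you", "alice", 1.0),    # Monday AM
    (datetime(2019, 9, 23, 10, 30), "you", "alice", 1.0),  # Monday AM
    (datetime(2019, 9, 23, 21, 0), "alice", "bob", 1.0),   # Monday PM
    (datetime(2019, 9, 24, 8, 15), "you", "bob", 2.0),     # Tuesday AM
]
snapshots = to_snapshots(stream, start, timedelta(hours=12))
```

Here window 0 is Monday AM, window 1 is Monday PM and window 2 is Tuesday AM; repeated interactions between the same pair inside one window add up to a single weighted edge.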


But sometimes unusual events happen

[Figure: three graph snapshots labeled Normal, Tax scam and Network failure]


Unusual events in other settings

‣ Computer networks (e.g., port scans, denial-of-service)
‣ Transportation networks (events/weather)

[Figure label: stadium]


How do we detect such anomalies in streaming graphs?

How do we even characterize these anomalies?


OUTLINE: INSIGHT, PROBLEM, ALGORITHM, GUARANTEES, EXPERIMENTS

INSIGHT


Anomalies tend to involve…

sudden (dis)appearance of a large dense directed subgraph


‣ Directed: edges go from sources to destinations


‣ Large and dense: many vertices (sources and destinations), many many edges


‣ Sudden: an abrupt change from the initial to the final state, as opposed to steady evolution


‣ (Dis)appearance: the subgraph may either appear or disappear


PROBLEM

[Figure: a stream of graphs over time, each labeled "Ok!" or "anomaly!"]

STREAMING MODEL

• (Un)directed weighted edges
• Time-evolving vertex set

CONSTRAINTS

• Real-time and fast detection
• Bounded working memory


ALGORITHM


Overview of SpotLight

[Figure: two-stage pipeline. Graph Sketching maps the graphs G1, G2, G3, G4 to sketch vectors v(G1), v(G2), v(G3), v(G4); Anomaly Detection on the sketches then flags v(G3) as an anomaly.]

Many off-the-shelf methods for anomaly detection:

‣ Robust Random Cut Forests [Guha, Mishra, Roy & Schrijvers; ICML 2016]

‣ Light-weight Online Detector of Anomalies [Pevny; ML 2016]

‣ Randomized Space Forests [Wu, Zhang, Fan, Edwards & Yu; ICDM 2014]
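To make the second stage concrete, here is a deliberately simple stand-in detector. It is not any of the three methods listed above: it scores each sketch by its Euclidean distance to the nearest recent sketch. The function name `anomaly_scores`, the `history` parameter and the example vectors are assumptions made for illustration.

```python
import math


def anomaly_scores(sketches, history=100):
    """Score each sketch by its distance to the nearest of the last `history` sketches.

    A stand-in for an off-the-shelf detector; the first sketch scores 0.0
    because it has nothing to be compared with.
    """
    scores = []
    for i, v in enumerate(sketches):
        past = sketches[max(0, i - history):i]
        scores.append(min((math.dist(v, u) for u in past), default=0.0))
    return scores


# Hypothetical 3-dimensional sketches of four consecutive graphs.
sketches = [[0, 2, 1], [0, 2, 2], [9, 40, 30], [0, 3, 1]]
scores = anomaly_scores(sketches)
flagged = scores.index(max(scores))  # index of the most anomalous graph
```

With these example vectors the third sketch is far from everything seen before it, so it receives the highest score. Bounding `history` keeps the working memory fixed, in the spirit of the streaming constraints.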


SpotLight randomized graph sketching

[Figure: a graph and its sketch; values shown: 0, 100, 20]

THREE PARAMETERS:

‣ Probability of sampling source ‘p’
‣ Probability of sampling destination ‘q’
‣ Number of sketching dimensions ‘K’
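The three parameters can be read as follows: each of the K sketch dimensions owns a random set of sources (each kept with probability p) and a random set of destinations (each kept with probability q), and records the total weight of the edges running from its sources to its destinations. The code below is a minimal sketch of that reading, not the authors' implementation; `sample_queries`, `sketch` and the example graph are hypothetical names and data.

```python
import random


def sample_queries(sources, destinations, p, q, K, seed=0):
    """Draw K random (source set, destination set) pairs.

    Each source is kept with probability p and each destination with
    probability q, independently for every sketch dimension.
    """
    rng = random.Random(seed)
    queries = []
    for _ in range(K):
        src_set = {s for s in sources if rng.random() < p}
        dst_set = {d for d in destinations if rng.random() < q}
        queries.append((src_set, dst_set))
    return queries


def sketch(edges, queries):
    """K-dimensional sketch: total edge weight captured by each query pair."""
    v = [0.0] * len(queries)
    for src, dst, weight in edges:
        for k, (src_set, dst_set) in enumerate(queries):
            if src in src_set and dst in dst_set:
                v[k] += weight
    return v


# Hypothetical weighted directed graph.
edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "d", 2.0)]
queries = sample_queries(["a", "b"], ["b", "c", "d"], p=0.5, q=0.5, K=3)
v = sketch(edges, queries)
```

The sketch has K entries regardless of how many vertices or edges the graph has, which is what makes the later anomaly-detection stage cheap.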


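SpotLight at work on a stream

Hashes: three source hashes hS: src → {1, ..., 1/p} & three destination hashes hD: dst → {1, ..., 1/q}

[Figure: weighted edges among vertices a, b, c and d arrive between 5pm and 7pm. The hS functions are applied to each edge's source and the hD functions to its destination to update the sketches of the 5-6pm and 6-7pm graphs. A STREAMING ANOMALY DETECTOR turns the sketches into an anomaly score plotted over time.]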
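The hashing on this slide suggests that the random vertex sets never need to be stored: a hash onto {1, ..., 1/p} can decide membership on the fly. The following is a minimal sketch of that idea, assuming a vertex counts as sampled in dimension k when its k-th hash lands in bucket 1. The `bucket` helper, the `StreamingSketcher` class and the use of BLAKE2 as the hash are illustrative choices, not details from the slides.

```python
import hashlib
from collections import defaultdict


def bucket(vertex, role, k, n_buckets):
    """Deterministic hash of a vertex onto {1, ..., n_buckets} for dimension k."""
    data = f"{role}|{k}|{vertex}".encode()
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "big") % n_buckets + 1


class StreamingSketcher:
    """Maintains one K-dimensional sketch per time window of an edge stream."""

    def __init__(self, p, q, K, window):
        self.src_buckets = round(1 / p)  # hS: src -> {1, ..., 1/p}
        self.dst_buckets = round(1 / q)  # hD: dst -> {1, ..., 1/q}
        self.K = K
        self.window = window
        # A deployment with bounded memory would drop finished windows
        # after scoring them; they are kept here only for inspection.
        self.sketches = defaultdict(lambda: [0.0] * K)

    def update(self, t, src, dst, weight=1.0):
        """Add one edge arriving at time t to the sketch of its window."""
        v = self.sketches[t // self.window]
        for k in range(self.K):
            if (bucket(src, "src", k, self.src_buckets) == 1
                    and bucket(dst, "dst", k, self.dst_buckets) == 1):
                v[k] += weight
        return v


# Hypothetical usage: hourly windows, times in seconds.
sketcher = StreamingSketcher(p=0.5, q=0.5, K=3, window=3600)
sketcher.update(0, "a", "b", 1.0)
sketcher.update(1800, "b", "c", 2.0)
sketcher.update(3600, "a", "d", 2.0)
```

Each edge costs K hash evaluations per endpoint and touches at most K counters, so per-edge work does not depend on the size of the graph.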


GUARANTEES


Key intuition

Thought Experiment: Add ‘m’ edges.

[Figure: adding the m edges to G in two different ways gives GR and GB. In the K-dim SpotLight Space, v(GR) lies at distance dR from v(G) and v(GB) at distance dB.]

dR - dB > O(K m²)
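The thought experiment can be tried numerically. The sketch below assumes that GR adds the m edges in the most concentrated way (all leaving a single source) and GB adds them as m vertex-disjoint edges; the slide text itself does not spell out the two constructions, so this is an illustrative reading. Because a sketch is a sum over edges, the distance between v(G) and the sketch of the enlarged graph depends only on the added edges, so G itself is left out. All names and parameter values are hypothetical.

```python
import math
import random


def sample_queries(sources, destinations, p, q, K, rng):
    """K random (source set, destination set) pairs, as in the sketching step."""
    return [
        ({s for s in sources if rng.random() < p},
         {d for d in destinations if rng.random() < q})
        for _ in range(K)
    ]


def displacement(added_edges, queries):
    """Euclidean distance between v(G) and v(G plus unit-weight added_edges).

    By linearity this equals the norm of the sketch of the added edges.
    """
    total = 0.0
    for src_set, dst_set in queries:
        hits = sum(1 for s, d in added_edges if s in src_set and d in dst_set)
        total += hits * hits
    return math.sqrt(total)


m, K, p, q = 50, 200, 0.2, 0.2
rng = random.Random(7)
sources = [f"s{i}" for i in range(m)]
destinations = [f"d{i}" for i in range(m)]
queries = sample_queries(sources, destinations, p, q, K, rng)

concentrated = [("s0", d) for d in destinations]  # assumed GR: one source, m edges
scattered = list(zip(sources, destinations))      # assumed GB: m disjoint edges

d_r = displacement(concentrated, queries)
d_b = displacement(scattered, queries)
```

With the same number of added edges, the concentrated pattern moves the sketch noticeably further than the scattered one, which is the separation the bound above quantifies.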


Anomaly detection guarantee

sketch size, K ≥ K* ➡ Pr[dR - dB > ε] ≥ 1 - δ

False Positive Rate ≤ δ

[Figure: probability distributions of the distance for normal (dB) and anomaly (dR) graphs, with a decision threshold between them and ε marked]


EXPERIMENTS


The labeled DARPA data

‣ 4.5M edges in 87.7K time ticks
‣ 9.5K sources, 24K destinations
‣ Edges labeled as attack/not
‣ Stream of 1.5K hourly graphs (24% anomalous)


Better intrusion detection

Precision = #graphs correctly flagged / #graphs flagged

Recall = #graphs correctly flagged / #anomalous graphs

[Figure: precision and recall of SpotLight against the RHSS and STA baselines]

RHSS: (Ranshous, Harenberg, Sharma & Samatova, SDM 2016)
STA: Streaming Tensor Analysis (Sun, Tao & Faloutsos, KDD 2006)
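The two metrics defined on this slide can be computed directly from the set of flagged graphs and the set of truly anomalous graphs. A small worked example follows; the graph indices are made up for illustration and are not results from the experiments.

```python
def precision_recall(flagged, anomalous):
    """Precision and recall over graph indices.

    precision = #graphs correctly flagged / #graphs flagged
    recall    = #graphs correctly flagged / #anomalous graphs
    """
    flagged, anomalous = set(flagged), set(anomalous)
    correct = len(flagged & anomalous)
    precision = correct / len(flagged) if flagged else 0.0
    recall = correct / len(anomalous) if anomalous else 0.0
    return precision, recall


# Hypothetical example: 4 graphs flagged, 3 graphs truly anomalous, 2 in common.
precision, recall = precision_recall(flagged={2, 5, 7, 9}, anomalous={2, 5, 8})
```

In this example two of the four flagged graphs are correct (precision 0.5) and two of the three anomalous graphs are found (recall 2/3).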


CONCLUSION


Summary

PROBLEM

[Figure: a stream of graphs over time, each labeled "Ok!" or "anomaly!"]

SOLUTION

SpotLight sketching

‣ Real-time
‣ Memory efficient
‣ Theoretical guarantees


Many open challenges

‣ Identify the vertices responsible for the anomaly

‣ Side-information (attributes) about vertices and edges

‣ Identify anomalies as soon as a new edge (interaction) occurs

‣ Leverage labeled data where available

Thank you!

http://www.cs.cmu.edu/~deswaran/