Anomaly Detection in Streaming Graphs
Joint work with Christos Faloutsos, Sudipto Guha and Nina Mishra (initially presented at SIGKDD 2018)
CyLab Partners Conference September 25, 2019
CYLAB PARTNERS CONFERENCE 2019
ANOMALY DETECTION IN STREAMING GRAPHS
ESWARAN ET AL., KDD 2018
INTRODUCTION
Graphs are being created everywhere
[Figure: an interaction between "You" and "Alice", timestamped 25 Sep 2019, 2.20pm]
Many other settings…
‣ IM/e-mail networks
‣ Computer networks
‣ Transportation networks
‣ Edit networks
As a sequence of graph snapshots
[Figure: graph snapshots along a time axis, grouped into mornings (Monday AM, Tuesday AM, Wednesday AM) and nights (Monday PM, Tuesday PM)]
But sometimes unusual events happen
[Figure: three graph snapshots labeled "Normal", "Tax scam" and "Network failure"]
Unusual events in other settings
‣ Computer networks (e.g., port scans, denial-of-service)
‣ Transportation networks (events/weather)
[Figure: transportation example, with a location labeled "stadium"]
How do we detect such anomalies in streaming graphs?
How do we even characterize these anomalies?
Outline: Insight, Problem, Algorithm, Guarantees, Experiments
INSIGHT
Anomalies tend to involve…
sudden (dis)appearance of a large dense directed subgraph
[Figure: two panels, each showing a set of sources and a set of destinations]
[Figure: sources and destinations, annotated "many vertices" and "many many edges"]
[Figure: an initial and a final graph state, contrasting a steady evolution with a sudden change]
[Figure: two panels labeled "appearance" and "disappearance"]
PROBLEM
[Figure: a stream of graphs along a time axis, each marked "Ok!" or "anomaly!"]
STREAMING MODEL
• (Un)directed weighted edges
• Time-evolving vertex set
CONSTRAINTS
• Real-time and fast detection
• Bounded working memory
ALGORITHM
Overview of SpotLight
[Figure: SpotLight pipeline. Graph sketching maps each graph G1, G2, G3, G4 to a sketch v(G1), v(G2), v(G3), v(G4); anomaly detection on the sketches flags v(G3) as an anomaly]
Many off-the-shelf methods for anomaly detection:
‣ Robust Random Cut Forests [Guha, Mishra, Roy & Schrijvers; ICML 2016]
‣ Light-weight Online Detector of Anomalies [Pevny; ML 2016]
‣ Randomized Space Forests [Wu, Zhang, Fan, Edwards & Yu; ICDM 2014]
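Once each graph is reduced to a sketch vector, any point-anomaly detector can score it. The snippet below is a deliberately simple stand-in, not one of the three methods listed above: it scores each sketch by its Euclidean distance to the nearest other sketch. The function name `nearest_neighbor_scores` and the toy vectors are hypothetical, chosen only for illustration.

```python
import math

def nearest_neighbor_scores(sketches):
    """Score each sketch by Euclidean distance to its nearest other sketch.

    Assumes at least two sketches of equal dimension.
    """
    scores = []
    for i, v in enumerate(sketches):
        dists = [math.dist(v, w) for j, w in enumerate(sketches) if j != i]
        scores.append(min(dists))
    return scores

# Toy sketches standing in for v(G1)..v(G4); the third is far from the rest.
toy_sketches = [(1, 0, 2), (1, 1, 2), (10, 9, 12), (0, 0, 2)]
toy_scores = nearest_neighbor_scores(toy_sketches)
most_anomalous = max(range(len(toy_scores)), key=toy_scores.__getitem__)
```

This version looks at all sketches at once; a streaming detector would compare each new sketch only against earlier ones.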
SpotLight randomized graph sketching
THREE PARAMETERS:
‣ Probability of sampling source ‘p’
‣ Probability of sampling destination ‘q’
‣ Number of sketching dimensions ‘K’
SpotLight at work on a stream
Hashes: three source hashes hS: src → {1, ..., 1/p} and three destination hashes hD: dst → {1, ..., 1/q}
[Figure: a stream of weighted edges among vertices a, b, c and d arriving between 5pm and 7pm is hashed into sketches for the 5-6pm and 6-7pm windows; a streaming anomaly detector turns the sketches into an anomaly score over time]
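A minimal sketch of the sketching step on a stream, assuming one source hash and one destination hash per sketch dimension, and assuming a vertex counts as "sampled" when it hashes to bucket 1 (the slide gives only the hash ranges). The names `bucket` and `spotlight_sketches`, the SHA-256 hashing, and the toy edge stream are illustrative choices, not the authors' implementation.

```python
import hashlib
from collections import defaultdict

def bucket(key, seed, n_buckets):
    """Deterministically hash (seed, key) to a bucket in {1, ..., n_buckets}."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets + 1

def spotlight_sketches(edges, p, q, K, window):
    """Return {window_index: K-dim sketch} for (time, src, dst, weight) edges.

    Dimension k adds an edge's weight only if both its source and its
    destination hash to bucket 1 under that dimension's hashes, so they are
    sampled with probability about p and q respectively (assumed rule).
    """
    src_buckets = round(1 / p)
    dst_buckets = round(1 / q)
    sketches = defaultdict(lambda: [0.0] * K)
    for t, src, dst, w in edges:
        sketch = sketches[int(t // window)]
        for k in range(K):
            if (bucket(src, f"S{k}", src_buckets) == 1
                    and bucket(dst, f"D{k}", dst_buckets) == 1):
                sketch[k] += w
    return dict(sketches)

# Hypothetical stream: (hour of day, source, destination, weight).
stream = [(17.2, "a", "b", 1), (17.8, "c", "b", 2),
          (18.1, "a", "d", 2), (18.5, "b", "c", 1)]
hourly = spotlight_sketches(stream, p=0.5, q=0.5, K=3, window=1.0)
everything = spotlight_sketches(stream, p=1, q=1, K=3, window=1.0)
```

Each window keeps only K numbers regardless of how many vertices or edges arrive, which is in line with the bounded-memory constraint; with p = q = 1 every dimension simply totals the window's edge weight.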
GUARANTEES
Key intuition
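Thought Experiment: Add ‘m’ edges.
[Figure: a graph G and two modified versions, GB and GR, mapped to points v(G), v(GB) and v(GR) in the K-dim SpotLight space; dB and dR are the distances from v(G) to v(GB) and to v(GR)]
dR - dB > O(K m²)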
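To make the thought experiment concrete, here is a small worked computation under assumptions that are not stated on the slide: GR adds the m unit-weight edges as a star (one shared source, m distinct destinations), GB adds them with no shared endpoints, distances are squared Euclidean, and each dimension samples sources and destinations independently with probabilities p and q. Under those assumptions the expected gap works out to K·p·(1-p)·q²·m·(m-1), which grows like K m². This illustrates the intuition only; it is not the paper's proof.

```python
from fractions import Fraction

def expected_sq_shift_star(m, p, q, K):
    """E[squared sketch shift] when m unit edges share one source (a star)."""
    # Per dimension: source kept w.p. p, then Binomial(m, q) destinations kept.
    per_dim = p * (m * q * (1 - q) + (m * q) ** 2)
    return K * per_dim

def expected_sq_shift_scattered(m, p, q, K):
    """E[squared sketch shift] when the m unit edges share no endpoints."""
    # Per dimension: each edge kept independently w.p. p*q, so Binomial(m, p*q).
    r = p * q
    per_dim = m * r * (1 - r) + (m * r) ** 2
    return K * per_dim

# Hypothetical parameter values, kept as exact fractions.
p, q, K, m = Fraction(1, 5), Fraction(1, 5), 50, 20
star = expected_sq_shift_star(m, p, q, K)
scattered = expected_sq_shift_scattered(m, p, q, K)
gap = star - scattered
```

With these values the star moves the sketch by 192 in expected squared distance versus 70.4 for the scattered edges, so the concentrated change stands out.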
Anomaly detection guarantee
[Figure: probability distributions over distance for normal (dB) and anomalous (dR) cases, separated by a gap 𝛜, with a decision threshold between them]
sketch size, K ≥ K* ➡ Pr[dR - dB > 𝛜] ≥ 1 - 𝛅
False Positive Rate ≤ 𝛅
EXPERIMENTS
The labeled DARPA data
‣ 4.5M edges in 87.7K time ticks
‣ 9.5K sources, 24K destinations
‣ Edges labeled as attack/not
‣ Stream of 1.5K hourly graphs (24% anomalous)
Better intrusion detection
Precision = #graphs correctly flagged / #graphs flagged
Recall = #graphs correctly flagged / #anomalous graphs
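The two metrics above can be computed directly from the set of flagged graphs and the set of truly anomalous graphs. A minimal sketch, where the function name, the graph identifiers, and the convention of returning 0.0 for an empty denominator are all assumptions made for illustration:

```python
def precision_recall(flagged, anomalous):
    """Precision and recall over sets of graph (time-window) identifiers."""
    flagged, anomalous = set(flagged), set(anomalous)
    correct = len(flagged & anomalous)  # graphs correctly flagged
    precision = correct / len(flagged) if flagged else 0.0
    recall = correct / len(anomalous) if anomalous else 0.0
    return precision, recall

# Hypothetical example: 4 graphs flagged, 3 truly anomalous, 2 in common.
example = precision_recall(flagged={3, 7, 9, 12}, anomalous={3, 9, 10})
```

Here two of the four flagged graphs are truly anomalous (precision 0.5), and two of the three anomalous graphs were flagged (recall 2/3).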
‣ RHSS: (Ranshous, Harenburg, Sharma & Samatova, SDM 2016)
‣ STA: Streaming Tensor Analysis (Sun, Tao & Faloutsos, KDD 2006)
CONCLUSION
Summary
PROBLEM
[Figure: a stream of graphs along a time axis, each marked "Ok!" or "anomaly!"]
SOLUTION
SpotLight sketching
‣ Real-time
‣ Memory efficient
‣ Theoretical guarantees
Many open challenges
‣ Identify the vertices responsible for the anomaly
‣ Side-information (attributes) about vertices and edges
‣ Identify anomalies as soon as a new edge (interaction) occurs
‣ Leverage labeled data where available
Thank you! [email protected]
http://www.cs.cmu.edu/~deswaran/