Anomaly detection in large graphs · 2021. 5. 20. · ICWE 2021 · Christos Faloutsos, CMU SCS

Page 1: Title

Anomaly detection in large graphs

Christos Faloutsos, CMU

Page 2: Thank you!

• Prof. Richard Chbeir
• Prof. Flavius Frasincar

Page 3: Roadmap

• Introduction
  – Motivation
  – Why study (big) graphs?
• Part#1: Patterns in graphs
• Part#2: time-evolving graphs; tensors
• Conclusions

Page 4: Graphs - why should we care?

>$10B; ~1B users

Page 5: Graphs - why should we care?

[Figures: Internet Map [lumeta.com]; Food Web [Martinez ’91]]

Page 6: Graphs - why should we care?

• web-log (‘blog’) news propagation
• computer network security: email/IP traffic and anomaly detection
• Recommendation systems
• ....
• Many-to-many db relationship -> graph

Pages 7-8: Motivating problems

• P1: patterns? Fraud detection?
• P2: patterns in time-evolving graphs / tensors

[Figure: tensor with axes source, destination, time]

Patterns anomalies

Page 9: Roadmap

• Introduction
  – Motivation
  – Why study (big) graphs?
• Part#1: Patterns & fraud detection
• Part#2: time-evolving graphs; tensors
• Conclusions

Page 10: Part 1: Patterns & fraud detection

Pages 11-12: Laws and patterns

• Q1: Are real graphs random?
• A1: NO!!
  – Diameter (‘6 degrees’; ‘Kevin Bacon’)
  – in- and out- degree distributions
  – other (surprising) patterns
• So, let’s look at the data

Pages 13-14: Solution# S.1

• Power law in the degree distribution [Faloutsos x 3 SIGCOMM99]

[Figure: log(degree) vs. log(rank) for internet domains, with att.com and ibm.com marked; slope -0.82]
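
A minimal sketch of how a slope like the -0.82 above could be estimated: sort the degrees in decreasing order and fit a least-squares line to log(degree) vs. log(rank). The function name and the toy degree list are illustrative assumptions, not taken from the talk, and plain least squares on a log-log plot is only a rough estimator.

```python
import math

def rank_degree_slope(degrees):
    """Least-squares slope of log(degree) vs. log(rank); rank 1 = highest degree."""
    ordered = sorted((d for d in degrees if d > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(d) for d in ordered]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Toy data (hypothetical): degree = 1000 * rank^-0.82, an exact power law.
toy_degrees = [1000.0 * rank ** -0.82 for rank in range(1, 201)]
slope = rank_degree_slope(toy_degrees)
```

On real degree data the points only roughly follow a line, so the fitted slope depends on which ranks are included.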

Pages 15-20: S2: connected component sizes

• Connected Components – 4 observations (1.4B nodes, 6B edges):
  1) 10K x larger than next
  2) ~0.7B singleton nodes
  3) SLOPE!
  4) Spikes!
    – 300-size cmpt X 500. Why?
    – 1100-size cmpt X 65. Why?
    – suspicious financial-advice sites (not existing now)

[Figure: Count vs. Size of connected components]
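
A minimal sketch of how the Count vs. Size data behind these observations could be computed on a small graph, using union-find. The function name and the toy edge list are illustrative assumptions; a graph with billions of edges would need a distributed or out-of-core approach instead.

```python
from collections import Counter

def component_size_counts(num_nodes, edges):
    """Return {component size: number of components of that size}."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    # Nodes per component root, then how many components share each size.
    sizes = Counter(find(node) for node in range(num_nodes))
    return Counter(sizes.values())

# Toy graph (hypothetical): a 4-node path, a triangle, and 3 singleton nodes.
toy_edges = [(0, 1), (1, 2), (2, 3), (4, 5), (5, 6), (6, 4)]
size_counts = component_size_counts(10, toy_edges)
```

Plotting count against size on log-log axes would show the slope; a size whose count sits far above the trend is a spike worth inspecting.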

Page 21: Roadmap

• Introduction
  – Motivation
• Part#1: Patterns in graphs
  – P1.1: Patterns: Degree; Triangles
  – P1.2: Anomaly/fraud detection
• Part#2: time-evolving graphs; tensors
• Conclusions

Pages 22-23: Solution# S.3: Triangle ‘Laws’

• Real social networks have a lot of triangles
  – Friends of friends are friends
• Any patterns?
  – 2x the friends, 2x the triangles ?

Page 24: Triangle Law: #S.3 [Tsourakakis ICDM 2008]

[Figure panels: SN, Reuters, Epinions; X-axis: degree; Y-axis: mean # triangles]

n friends -> ~n^1.6 triangles
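
A minimal sketch of the quantities on those axes: triangles per node, then the mean number of triangles for each degree. This is exact neighbor-set intersection on a small in-memory graph, not the method of the cited paper; the function names and toy graph are illustrative assumptions.

```python
from collections import defaultdict

def triangles_per_node(edges):
    """Return (adjacency sets, {node: number of triangles through that node})."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    counts = {}
    for node, nbrs in adj.items():
        # A triangle at `node` is a pair of its neighbors that are linked;
        # summing the common neighbors counts every such pair twice.
        linked = sum(len(nbrs & adj[nbr]) for nbr in nbrs)
        counts[node] = linked // 2
    return adj, counts

def mean_triangles_by_degree(adj, counts):
    """Return {degree: mean number of triangles over nodes with that degree}."""
    by_degree = defaultdict(list)
    for node, nbrs in adj.items():
        by_degree[len(nbrs)].append(counts[node])
    return {deg: sum(ts) / len(ts) for deg, ts in sorted(by_degree.items())}

# Toy graph (hypothetical): a 4-clique on nodes 0..3 plus a pendant node 4.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
adj, counts = triangles_per_node(toy_edges)
means = mean_triangles_by_degree(adj, counts)
```

Fitting a line to log(mean triangles) vs. log(degree) on a real social network is how an exponent such as ~1.6 would be read off.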

Pages 25-29: Triangle counting for large graphs?

Anomalous nodes in Twitter (~3 billion edges) [U Kang, Brendan Meeder, +, PAKDD’11]

Page 30: MORE Graph Patterns

• Leman Akoglu and Christos Faloutsos. RTG: A Recursive Realistic Graph Generator using Random Typing. PKDD’09.

Page 31: MORE Graph Patterns

• Mary McGlohon, Leman Akoglu, Christos Faloutsos. Statistical Properties of Social Networks. In “Social Network Data Analytics” (Ed.: Charu Aggarwal)
• Deepayan Chakrabarti and Christos Faloutsos, Graph Mining: Laws, Tools, and Case Studies, Oct. 2012, Morgan Claypool.

Page 32: Roadmap

• Introduction
  – Motivation
• Part#1: Patterns in graphs
  – P1.1: Patterns
  – P1.2: Anomaly / fraud detection
    • No labels – spectral
    • With labels: Belief Propagation
• Part#2: time-evolving graphs; tensors
• Conclusions

Patterns anomalies

Page 33: How to find ‘suspicious’ groups?

• ‘blocks’ are normal, right?

[Figure: fans vs. idols]

Page 34: Except that:

• ‘blocks’ are normal, right?
• ‘hyperbolic’ communities are more realistic [Araujo+, PKDD’14]

Pages 35-36: Except that:

• ‘blocks’ are usually suspicious
• ‘hyperbolic’ communities are more realistic [Araujo+, PKDD’14]

Q: Can we spot blocks, easily?
A: Silver bullet: SVD!

Pages 37-43: Crash intro to SVD (DETAILS)

• Recall: (SVD) matrix factorization: finds blocks
• Even if shuffled!

Each N x M matrix ~ sum of three blocks:
• N fans x M idols: ‘music lovers’ x ‘singers’ + ‘sports lovers’ x ‘athletes’ + ‘citizens’ x ‘politicians’
• N users x M products: ‘meat-eaters’ x ‘steaks’ + ‘vegetarians’ x ‘plants’ + ‘kids’ x ‘cookies’
• N genes x M timestamps: ‘cancer’ + ‘alzheimer’ + ‘Parkinson’
• N locations x M timestamps: ‘hurricane’ + ‘cold-spell’ + ‘heat-wave’
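
A minimal sketch of the "finds blocks, even if shuffled" claim, assuming NumPy is available. The 12 x 9 toy matrix, the block sizes, and the 1e-8 threshold are illustrative assumptions; with three disjoint blocks of different sizes, each of the top three singular-vector pairs is non-zero only on the rows and columns of one block.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 3 disjoint fan x idol blocks (1 = "fan follows idol").
A = np.zeros((12, 9))
A[0:5, 0:3] = 1    # 5 'music lovers' x 3 'singers'
A[5:9, 3:6] = 1    # 4 'sports lovers' x 3 'athletes'
A[9:12, 6:9] = 1   # 3 'citizens' x 3 'politicians'

# Shuffle rows and columns so the blocks are no longer visible by eye.
row_perm = rng.permutation(12)
col_perm = rng.permutation(9)
shuffled = A[row_perm][:, col_perm]

U, s, Vt = np.linalg.svd(shuffled, full_matrices=False)

# Read one block off each of the top-3 singular vector pairs.
blocks = []
for k in range(3):
    rows = set(np.flatnonzero(np.abs(U[:, k]) > 1e-8))
    cols = set(np.flatnonzero(np.abs(Vt[k]) > 1e-8))
    blocks.append((rows, cols))
```

Mapping the recovered positions back through `row_perm` and `col_perm` gives the original groups. Real data has noise and overlapping groups, so the vectors would need thresholding or clustering rather than an exact zero test.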

Page 44: Inferring Strange Behavior from Connectivity Pattern in Social Networks

PAKDD’14

Meng Jiang, Peng Cui, Shiqiang Yang (Tsinghua); Alex Beutel, Christos Faloutsos (CMU)

Page 45: Dataset

• Tencent Weibo
• 117 million nodes (with profile and UGC data)
• 3.33 billion directed edges

Page 46: Real Data

[Figure panels: “Pearls”, “Staircase”, “Rays”, “Block”]

Page 47: Real Data

• Spikes on the out-degree distribution

Page 48: Roadmap

• Introduction
  – Motivation
• Part#1: Patterns in graphs
  – P1.1: Patterns
  – P1.2: Anomaly / fraud detection
    • No labels – spectral methods
    • With labels: Belief Propagation
• Part#2: time-evolving graphs; tensors
• Conclusions

Pages 49-51: E-bay Fraud detection

w/ Polo Chau & Shashank Pandit, CMU [WWW’07]

Page 52: E-bay Fraud detection - NetProbe

Page 53: Popular press

And less desirable attention:
• E-mail from ‘Belgium police’ (‘copy of your code?’)

Page 54: Summary of Part#1

• *many* patterns in real graphs
  – Power-laws everywhere
  – Long (and growing) list of tools for anomaly/fraud detection

Patterns anomalies

Page 55: Roadmap

• Introduction
  – Motivation
• Part#1: Patterns in graphs
• Part#2: time-evolving graphs
  – P2.1: tools/tensors
  – P2.2: other patterns
• Conclusions

Page 56: Part 2: Time evolving graphs; tensors

Pages 57-60: Graphs over time -> tensors!

• Problem #2.1:
  – Given who calls whom, and when
  – Find patterns / anomalies

[Figure: who-calls-whom matrices (labels: smith, johnson), one per day (Mon, Tue), stacked into a tensor with axes caller, callee, time]

Page 61: Graphs over time -> tensors!

• Problem #2.1’:
  – Given author-keyword-date
  – Find patterns / anomalies

[Figure: tensor with axes author, keyword, date]

MANY more settings, with >2 ‘modes’

Page 62: Graphs over time -> tensors!

• Problem #2.1’’:
  – Given subject – verb – object facts
  – Find patterns / anomalies

[Figure: tensor with axes subject, object, verb]

MANY more settings, with >2 ‘modes’

Page 63: Graphs over time -> tensors!

• Problem #2.1’’’:
  – Given <triplets>
  – Find patterns / anomalies

[Figure: tensor with axes mode1, mode2, mode3]

MANY more settings, with >2 ‘modes’ (and 4, 5, etc modes)
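
A minimal sketch of turning a list of triplets into a sparse 3-mode tensor: each mode's labels get integer indices, and repeated triplets are counted. The function name, the dictionary-of-cells representation, and the toy call records are illustrative assumptions.

```python
from collections import Counter

def build_sparse_tensor(triplets):
    """Return (cells, shape, index) for a 3-mode count tensor.

    cells: {(i, j, k): count}; index: one {label: position} map per mode.
    """
    index = [{}, {}, {}]
    cells = Counter()
    for triplet in triplets:
        # setdefault assigns the next free position the first time a label is seen.
        key = tuple(index[mode].setdefault(label, len(index[mode]))
                    for mode, label in enumerate(triplet))
        cells[key] += 1
    shape = tuple(len(mapping) for mapping in index)
    return cells, shape, index

# Toy who-calls-whom-when records (hypothetical): (caller, callee, day).
calls = [
    ("smith", "johnson", "Mon"),
    ("smith", "johnson", "Mon"),
    ("smith", "johnson", "Tue"),
    ("johnson", "smith", "Tue"),
]
cells, shape, index = build_sparse_tensor(calls)
```

The same function works unchanged for author-keyword-date or subject-verb-object triplets; only the meaning of the three modes differs.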

Page 64: Answer: tensor factorization

• Recall: (SVD) matrix factorization: finds blocks

N users x M products ~ ‘meat-eaters’ x ‘steaks’ + ‘vegetarians’ x ‘plants’ + ‘kids’ x ‘cookies’

Page 65: Crash intro to SVD

• Recall: (SVD) matrix factorization: finds blocks

N fans x M idols ~ ‘music lovers’ x ‘singers’ + ‘sports lovers’ x ‘athletes’ + ‘citizens’ x ‘politicians’

Page 66: Answer: tensor factorization

• PARAFAC decomposition

[Figure: subject x object x verb tensor = sum of three components: politicians + artists + athletes]

Page 67: Answer: tensor factorization

• PARAFAC decomposition
• Results for who-calls-whom-when
  – 4M x 15 days

[Figure: caller x callee x time tensor = sum of three components: ?? + ?? + ??]
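
A minimal sketch of a PARAFAC (CP) decomposition by alternating least squares, assuming NumPy is available. This is a textbook dense implementation for a small array, not the tool used for the 4M-client data; the function name, the toy tensor, and the choice of rank 1 are illustrative assumptions.

```python
import numpy as np

def cp_als(X, rank, n_iter=50, seed=0):
    """Rank-`rank` PARAFAC of a dense 3-mode array via alternating least squares.

    Returns factor matrices A, B, C with X[i, j, k] ~ sum_r A[i, r] * B[j, r] * C[k, r].
    """
    rng = np.random.default_rng(seed)
    A, B, C = (rng.random((dim, rank)) for dim in X.shape)
    for _ in range(n_iter):
        # Each update solves a linear least-squares problem with the other two factors fixed.
        A = np.einsum('ijk,jr,kr->ir', X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum('ijk,ir,kr->jr', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum('ijk,ir,jr->kr', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Toy caller x callee x day tensor (hypothetical): one dense block and nothing else.
X = np.zeros((6, 8, 10))
X[2, 1:6, 3:7] = 200.0   # caller 2, callees 1..5, days 3..6

A, B, C = cp_als(X, rank=1)
approx = np.einsum('ir,jr,kr->ijk', A, B, C)
top_caller = int(np.argmax(np.abs(A[:, 0])))
```

Each component is one (caller vector, callee vector, time vector) triple; the entries with large magnitude in each vector name the members of that group.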

Pages 68-70: Anomaly detection in time-evolving graphs

• Anomalous communities in phone call data:
  – European country, 4M clients, data over 2 weeks
  – 1 caller, 5 receivers, 4 days of activity
  – ~200 calls to EACH receiver on EACH day!

Miguel Araujo, Spiros Papadimitriou, Stephan Günnemann, Christos Faloutsos, Prithwish Basu, Ananthram Swami, Evangelos Papalexakis, Danai Koutra. Com2: Fast Automatic Discovery of Temporal (Comet) Communities. PAKDD 2014, Tainan, Taiwan.

Page 71: Roadmap

• Introduction
  – Motivation
• Part#1: Patterns in graphs
• Part#2: time-evolving graphs
  – P2.1: tools/tensors
  – P2.2: other patterns – inter-arrival time
• Conclusions

Page 72: RSC: Mining and Modeling Temporal Activity in Social Media

Alceu F. Costa, Yuto Yamaguchi, Agma J. M. Traina, Caetano Traina Jr., Christos Faloutsos

Universidade de São Paulo

KDD 2015 – Sydney, Australia

Page 73: Pattern Mining: Datasets

Reddit Dataset
• Time-stamp from comments
• 21,198 users
• 20 Million time-stamps

Twitter Dataset
• Time-stamp from tweets
• 6,790 users
• 16 Million time-stamps

For each user we have:
• Sequence of postings time-stamps: T = (t1, t2, t3, …)
• Inter-arrival times (IAT) of postings: (∆1, ∆2, ∆3, …)

[Figure: time line with postings t1, t2, t3, t4 and gaps ∆1, ∆2, ∆3]
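
A minimal sketch of deriving inter-arrival times from one user's posting time-stamps and counting them in logarithmic bins. The function names, the power-of-two bins, and the toy time-stamps (in arbitrary time units) are illustrative assumptions, not the binning used in the RSC paper.

```python
import math
from collections import Counter

def inter_arrival_times(timestamps):
    """Gaps between consecutive postings: delta_i = t_(i+1) - t_i."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]

def log2_histogram(iats):
    """Count IATs per power-of-two bin: bin b holds values in [2**b, 2**(b+1))."""
    bins = Counter()
    for delta in iats:
        if delta > 0:  # zero gaps have no logarithm and are skipped
            bins[math.floor(math.log2(delta))] += 1
    return dict(sorted(bins.items()))

# Toy posting times for one user (hypothetical).
posting_times = [0, 1, 3, 4, 20, 21, 150]
iats = inter_arrival_times(posting_times)   # [1, 2, 1, 16, 1, 129]
histogram = log2_histogram(iats)
```

Logarithmic bins are a natural choice here because gaps between postings can range from seconds to days.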

Pages 74-75: Human? Robots?

[Figure: two panels, log scale and linear scale; marks at 2’, 3h, 1 day]

Page 76: Experiments: Can RSC-Spotter Detect Bots?

Precision vs. Sensitivity Curves. Good performance: curve close to the top.

[Figure: Precision vs. Sensitivity (Recall), both axes from 0 to 1; curves: RSC-Spotter, IAT Hist., Entropy [6], Weekday Hist., All Features]

Twitter: Precision > 94%, Sensitivity > 70%, with strongly imbalanced datasets (# humans >> # bots)

Page 77: Experiments: Can RSC-Spotter Detect Bots?

Precision vs. Sensitivity Curves. Good performance: curve close to the top.

[Figure: Precision vs. Sensitivity (Recall), both axes from 0 to 1; curves: RSC-Spotter, IAT Hist., Entropy [6], Weekday Hist., All Features]

Reddit: Precision > 96%, Sensitivity > 47%, with strongly imbalanced datasets (# humans >> # bots)
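
A minimal sketch of the two metrics on those axes, with "bot" as the positive class. The function name and the toy labels are illustrative assumptions; they are not data from the experiments.

```python
def precision_and_sensitivity(true_labels, predicted_labels, positive="bot"):
    """Precision = TP / (TP + FP); sensitivity (recall) = TP / (TP + FN)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return precision, sensitivity

# Toy imbalanced data (hypothetical): 4 bots among 20 users.
truth = ["bot"] * 4 + ["human"] * 16
predicted = ["bot", "bot", "bot", "human"] + ["human"] * 15 + ["bot"]
precision, sensitivity = precision_and_sensitivity(truth, predicted)
```

With # humans >> # bots, accuracy alone would look good even for a detector that never flags anyone, which is why the curves report precision and sensitivity instead.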

Page 78: Part 2: Conclusions

• Time-evolving / heterogeneous graphs -> tensors
• PARAFAC finds patterns
• Surprising temporal patterns

Page 79: Roadmap

• Introduction
  – Motivation
  – Why study (big) graphs?
• Part#1: Patterns in graphs
• Part#2: time-evolving graphs; tensors
• Acknowledgements and Conclusions

Page 80: Thanks

Thanks to: NSF IIS-0705359, IIS-0534205, CTA-INARC; Yahoo (M45), LLNL, IBM, SPRINT, Google, INTEL, HP, iLab

Disclaimer: All opinions are mine; not necessarily reflecting the opinions of the funding agencies

Page 81: Cast

• Akoglu, Leman
• Chau, Polo
• Kang, U
• Hooi, Bryan
• Koutra, Danai
• Beutel, Alex
• Papalexakis, Vagelis
• Shah, Neil
• Araujo, Miguel
• Song, Hyun Ah
• Eswaran, Dhivya
• Shin, Kijung

Page 82: CONCLUSION#1 – Big data

• Patterns Anomalies
• Large datasets reveal patterns/outliers that are invisible otherwise

Page 83: CONCLUSION#2 – tensors

• powerful tool

[Figure: 1 caller, 5 receivers, 4 days of activity]

Page 84: References

• D. Chakrabarti, C. Faloutsos: Graph Mining – Laws, Tools and Case Studies, Morgan Claypool 2012
• http://www.morganclaypool.com/doi/abs/10.2200/S00449ED1V01Y201209DMK006

Page 85: References

• Danai Koutra and Christos Faloutsos, Individual and Collective Graph Mining: Principles, Algorithms, and Applications, Morgan Claypool 2017 (https://doi.org/10.2200/S00796ED1V01Y201708DMK014)

Pages 86-87: TAKE HOME MESSAGE:

Cross-disciplinarity

Thank you!