Anomaly detection in large graphs - Carnegie Mellon...
Transcript of Anomaly detection in large graphs - Carnegie Mellon...
![Page 1: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/1.jpg)
CMU SCS
Anomaly detection in large graphs
Christos Faloutsos CMU
www.cs.cmu.edu/~christos/TALKS/17-06-21-alibaba/faloutsos_alibaba_2017.pdf
![Page 2: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/2.jpg)
CMU SCS
Thank you!
• Annette Jiang (IEEE)
• Evan Butterfield (IEEE)
Alibaba (c) C. Faloutsos, 2017 2
![Page 3: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/3.jpg)
CMU SCS
(c) C. Faloutsos, 2017 3
Roadmap
• Introduction – Motivation – Why study (big) graphs?
• Part#1: Patterns in graphs • Part#2: time-evolving graphs; tensors • Conclusions
Alibaba
![Page 4: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/4.jpg)
CMU SCS
(c) C. Faloutsos, 2017 4
Graphs - why should we care?
>$10B; ~1B users
Alibaba
![Page 5: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/5.jpg)
CMU SCS
(c) C. Faloutsos, 2017 5
Graphs - why should we care?
Internet Map [lumeta.com]
Food Web [Martinez ’91]
Alibaba
![Page 6: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/6.jpg)
CMU SCS
(c) C. Faloutsos, 2017 6
Graphs - why should we care? • web-log (‘blog’) news propagation • computer network security: email/IP traffic and
anomaly detection • Recommendation systems • Who-bought-from-whom (ebay, Alibaba) • ....
Many-to-many db relationship -> graph Alibaba
![Page 7: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/7.jpg)
CMU SCS
Motivating problems • P1: patterns? Fraud detection?
• P2: patterns in time-evolving graphs / tensors
Alibaba (c) C. Faloutsos, 2017 7
time
destination
![Page 8: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/8.jpg)
CMU SCS
Motivating problems • P1: patterns? Fraud detection?
• P2: patterns in time-evolving graphs / tensors
Alibaba (c) C. Faloutsos, 2017 8
time
destination
Patterns anomalies
![Page 9: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/9.jpg)
CMU SCS
(c) C. Faloutsos, 2017 10
Roadmap
• Introduction – Motivation – Why study (big) graphs?
• Part#1: Patterns & fraud detection • Part#2: time-evolving graphs; tensors • Conclusions
Alibaba
![Page 10: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/10.jpg)
CMU SCS
Alibaba (c) C. Faloutsos, 2017 11
Part 1: Patterns, &
fraud detection
![Page 11: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/11.jpg)
CMU SCS
(c) C. Faloutsos, 2017 12
Laws and patterns • Q1: Are real graphs random?
Alibaba
![Page 12: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/12.jpg)
CMU SCS
(c) C. Faloutsos, 2017 13
Laws and patterns • Q1: Are real graphs random? • A1: NO!!
– Diameter (‘6 degrees’; ‘Kevin Bacon’) – in- and out- degree distributions – other (surprising) patterns
• So, let’s look at the data
Alibaba
![Page 13: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/13.jpg)
CMU SCS
(c) C. Faloutsos, 2017 14
Solution# S.1 • Power law in the degree distribution [Faloutsos x 3
SIGCOMM99]
log(rank)
log(degree)
internet domains
att.com
ibm.com
Alibaba
![Page 14: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/14.jpg)
CMU SCS
(c) C. Faloutsos, 2017 15
Solution# S.1 • Power law in the degree distribution [Faloutsos x 3
SIGCOMM99]
log(rank)
log(degree)
-0.82
internet domains
att.com
ibm.com
Alibaba
![Page 15: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/15.jpg)
CMU SCS
16
S2: connected component sizes • Connected Components – 4 observations:
Size
Count
(c) C. Faloutsos, 2017 Alibaba
1.4B nodes 6B edges
![Page 16: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/16.jpg)
CMU SCS
17
S2: connected component sizes • Connected Components
Size
Count
(c) C. Faloutsos, 2017 Alibaba
1) 10K x larger than next
![Page 17: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/17.jpg)
CMU SCS
18
S2: connected component sizes • Connected Components
Size
Count
(c) C. Faloutsos, 2017 Alibaba
2) ~0.7B singleton nodes
![Page 18: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/18.jpg)
CMU SCS
19
S2: connected component sizes • Connected Components
Size
Count
(c) C. Faloutsos, 2017 Alibaba
3) SLOPE!
![Page 19: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/19.jpg)
CMU SCS
20
S2: connected component sizes • Connected Components
Size
Count 300-size
cmpt X 500. Why? 1100-size cmpt
X 65. Why?
(c) C. Faloutsos, 2017 Alibaba
4) Spikes!
![Page 20: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/20.jpg)
CMU SCS
21
S2: connected component sizes • Connected Components
Size
Count
suspicious financial-advice sites
(not existing now)
(c) C. Faloutsos, 2017 Alibaba
![Page 21: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/21.jpg)
CMU SCS
MORE Graph Patterns
Alibaba (c) C. Faloutsos, 2017 35
✔
✔
RTG: A Recursive Realistic Graph Generator using Random Typing Leman Akoglu and Christos Faloutsos. PKDD’09.
![Page 22: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/22.jpg)
CMU SCS
MORE Graph Patterns
Alibaba (c) C. Faloutsos, 2017 36
• Mary McGlohon, Leman Akoglu, Christos Faloutsos. Statistical Properties of Social Networks. in "Social Network Data Analytics” (Ed.: Charu Aggarwal)
• Deepayan Chakrabarti and Christos Faloutsos, Graph Mining: Laws, Tools, and Case Studies Oct. 2012, Morgan Claypool.
![Page 23: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/23.jpg)
CMU SCS
(c) C. Faloutsos, 2017 37
Roadmap
• Introduction – Motivation • Part#1: Patterns in graphs
– P1.1: Patterns – P1.2: Anomaly / fraud detection
• No labels – spectral • With labels: Belief Propagation
• Part#2: time-evolving graphs; tensors • Conclusions
Alibaba
Patterns anomalies
![Page 24: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/24.jpg)
CMU SCS
How to find ‘suspicious’ groups? • ‘blocks’ are normal, right?
Alibaba (c) C. Faloutsos, 2017 38
fans
idols
![Page 25: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/25.jpg)
CMU SCS
Except that: • ‘blocks’ are normal, right? • ‘hyperbolic’ communities are more realistic
[Araujo+, PKDD’14]
Alibaba (c) C. Faloutsos, 2017 39
![Page 26: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/26.jpg)
CMU SCS
Except that: • ‘blocks’ are usually suspicious • ‘hyperbolic’ communities are more realistic
[Araujo+, PKDD’14]
Alibaba (c) C. Faloutsos, 2017 40
Q: Can we spot blocks, easily?
![Page 27: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/27.jpg)
CMU SCS
Except that: • ‘blocks’ are usually suspicious • ‘hyperbolic’ communities are more realistic
[Araujo+, PKDD’14]
Alibaba (c) C. Faloutsos, 2017 41
Q: Can we spot blocks, easily? A: Silver bullet: SVD!
![Page 28: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/28.jpg)
CMU SCS
Crush intro to SVD • Recall: (SVD) matrix factorization: finds
blocks
Alibaba (c) C. Faloutsos, 2017 42
N users
M products
‘meat-eaters’ ‘steaks’
‘vegetarians’ ‘plants’
‘kids’ ‘cookies’
~ + +
DETAILS
![Page 29: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/29.jpg)
CMU SCS
Crush intro to SVD • Recall: (SVD) matrix factorization: finds
blocks
Alibaba (c) C. Faloutsos, 2017 43
N fans
M idols
‘music lovers’ ‘singers’
‘sports lovers’ ‘athletes’
‘citizens’ ‘politicians’
~ + +
DETAILS
![Page 30: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/30.jpg)
CMU SCS
Crush intro to SVD • Recall: (SVD) matrix factorization: finds
blocks
Alibaba (c) C. Faloutsos, 2017 44
N fans
M idols
‘music lovers’ ‘singers’
‘sports lovers’ ‘athletes’
‘citizens’ ‘politicians’
~ + +
DETAILS
![Page 31: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/31.jpg)
CMU SCS
Inferring Strange Behavior from Connectivity Pattern in Social Networks
PAKDD’14
Meng Jiang, Peng Cui, Shiqiang Yang (Tsinghua) Alex Beutel, Christos Faloutsos (CMU)
![Page 32: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/32.jpg)
CMU SCS
Lockstep and Spectral Subspace Plot • Case #0: No lockstep behavior in random
power law graph of 1M nodes, 3M edges • Random “Scatter”
Adjacency Matrix Spectral Subspace Plot
Alibaba 46 (c) C. Faloutsos, 2017 + +
![Page 33: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/33.jpg)
CMU SCS
Lockstep and Spectral Subspace Plot • Case #1: non-overlapping lockstep • “Blocks” “Rays”
Adjacency Matrix Spectral Subspace Plot
Alibaba 47 (c) C. Faloutsos, 2017
![Page 34: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/34.jpg)
CMU SCS
Lockstep and Spectral Subspace Plot • Case #2: non-overlapping lockstep • “Blocks; low density” Elongation
Adjacency Matrix Spectral Subspace Plot
Alibaba 48 (c) C. Faloutsos, 2017
![Page 35: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/35.jpg)
CMU SCS
Lockstep and Spectral Subspace Plot • Case #3: non-overlapping lockstep • “Camouflage” (or “Fame”) Tilting
“Rays” Adjacency Matrix Spectral Subspace Plot
Alibaba 49 (c) C. Faloutsos, 2017
![Page 36: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/36.jpg)
CMU SCS
Lockstep and Spectral Subspace Plot • Case #3: non-overlapping lockstep • “Camouflage” (or “Fame”) Tilting
“Rays” Adjacency Matrix Spectral Subspace Plot
Alibaba 50 (c) C. Faloutsos, 2017
![Page 37: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/37.jpg)
CMU SCS
Lockstep and Spectral Subspace Plot • Case #4: ? lockstep • “?” “Pearls”
Adjacency Matrix Spectral Subspace Plot
?
Alibaba 51 (c) C. Faloutsos, 2017
![Page 38: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/38.jpg)
CMU SCS
Lockstep and Spectral Subspace Plot • Case #4: overlapping lockstep • “Staircase” “Pearls”
Adjacency Matrix Spectral Subspace Plot
Alibaba 52 (c) C. Faloutsos, 2017
![Page 39: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/39.jpg)
CMU SCS
Dataset
• Tencent Weibo • 117 million nodes (with profile and UGC
data) • 3.33 billion directed edges
Alibaba 53 (c) C. Faloutsos, 2017
![Page 40: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/40.jpg)
CMU SCS
Real Data
“Pearls” “Staircase”
“Rays” “Block”
Alibaba 54 (c) C. Faloutsos, 2017
![Page 41: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/41.jpg)
CMU SCS
Real Data • Spikes on the out-degree distribution
×
×Alibaba 55 (c) C. Faloutsos, 2017
![Page 42: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/42.jpg)
CMU SCS
(c) C. Faloutsos, 2017 62
Roadmap
• Introduction – Motivation • Part#1: Patterns in graphs
– P1.1: Patterns – P1.2: Anomaly / fraud detection
• No labels – spectral methods • With labels: Belief Propagation
• Part#2: time-evolving graphs; tensors • Conclusions
Alibaba
![Page 43: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/43.jpg)
CMU SCS
Alibaba (c) C. Faloutsos, 2017 63
E-bay Fraud detection
w/ Polo Chau & Shashank Pandit, CMU [www’07]
![Page 44: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/44.jpg)
CMU SCS
Alibaba (c) C. Faloutsos, 2017 64
E-bay Fraud detection
![Page 45: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/45.jpg)
CMU SCS
Alibaba (c) C. Faloutsos, 2017 65
E-bay Fraud detection
![Page 46: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/46.jpg)
CMU SCS
Alibaba (c) C. Faloutsos, 2017 66
E-bay Fraud detection - NetProbe
![Page 47: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/47.jpg)
CMU SCS
Popular press
And less desirable attention: • E-mail from ‘Belgium police’ (‘copy of
your code?’) Alibaba (c) C. Faloutsos, 2017 67
![Page 48: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/48.jpg)
CMU SCS
(c) C. Faloutsos, 2017 68
Roadmap
• Introduction – Motivation • Part#1: Patterns in graphs
– Patterns – Anomaly / fraud detection
• No labels - Spectral methods • w/ labels: Belief Propagation – closed formulas
• Part#2: time-evolving graphs; tensors • Conclusions
Alibaba
![Page 49: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/49.jpg)
CMU SCS
UnifyingGuilt-by-AssociationApproaches:TheoremsandFastAlgorithms
Danai Koutra U Kang
Hsing-Kuo Kenneth Pao
Tai-You Ke Duen Horng (Polo) Chau
Christos Faloutsos
ECML PKDD, 5-9 September 2011, Athens, Greece
![Page 50: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/50.jpg)
CMU SCS Problem Definition:
GBA techniques
(c) C. Faloutsos, 2017 70
Given: Graph; & few labeled nodes Find: labels of rest (assuming network effects)
Alibaba
![Page 51: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/51.jpg)
CMU SCS
Are they related? • RWR (Random Walk with Restarts)
– google’s pageRank (‘if my friends are important, I’m important, too’)
• SSL (Semi-supervised learning) – minimize the differences among neighbors
• BP (Belief propagation) – send messages to neighbors, on what you
believe about them
Alibaba (c) C. Faloutsos, 2017 71
![Page 52: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/52.jpg)
CMU SCS
Are they related? • RWR (Random Walk with Restarts)
– google’s pageRank (‘if my friends are important, I’m important, too’)
• SSL (Semi-supervised learning) – minimize the differences among neighbors
• BP (Belief propagation) – send messages to neighbors, on what you
believe about them
Alibaba (c) C. Faloutsos, 2017 72
YES!
![Page 53: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/53.jpg)
CMU SCS
Correspondence of Methods
(c) C. Faloutsos, 2017 73
Method Matrix Unknown known RWR [I – c AD-1] × x = (1-c)ySSL [I + a(D - A)] × x = y
FABP [I + a D - c’A] × bh = φh
0 1 0 1 0 1 0 1 0
? 0 1 1
1 1 1
d1 d2
d3 final
labels/ beliefs
prior labels/ beliefs
adjacency matrix
Alibaba
![Page 54: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/54.jpg)
CMU SCS Results: Scalability
(c) C. Faloutsos, 2017 74
FABP is linear on the number of edges.
# of edges (Kronecker graphs)
runt
ime
(min
)
Alibaba
![Page 55: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/55.jpg)
CMU SCS
Problem: e-commerce ratings fraud • Given a heterogeneous
graph on users, products, sellers and positive/negative ratings with “seed labels”
• Find the top k most fraudulent users, products and sellers
Alibaba (c) C. Faloutsos, 2017 75
![Page 56: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/56.jpg)
CMU SCS
Problem: e-commerce ratings fraud • Given a heterogeneous
graph on users, products, sellers and positive/negative ratings with “seed labels”
• Find the top k most fraudulent users, products and sellers
Alibaba (c) C. Faloutsos, 2017 76
Dhivya Eswaran, Stephan Günnemann, Christos Faloutsos, “ZooBP: Belief Propagation for Heterogeneous Networks”, appearing in VLDB 2017
![Page 57: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/57.jpg)
CMU SCS
Problem: e-commerce ratings fraud
Alibaba (c) C. Faloutsos, 2017 77
Dhivya Eswaran, Stephan Günnemann, Christos Faloutsos, “ZooBP: Belief Propagation for Heterogeneous Networks”, appearing in VLDB 2017
![Page 58: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/58.jpg)
CMU SCS
ZooBP: features
(c) C. Faloutsos, 2017 78
Fast; convergence guarantees.
Near-perfect accuracy linear in graph size
Alibaba
Dhivya Eswaran, Stephan Günnemann, Christos Faloutsos, “ZooBP: Belief Propagation for Heterogeneous Networks”, appearing in VLDB 2017
ideal
600x (matlab) 3x (C++)
![Page 59: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/59.jpg)
CMU SCS
ZooBP in the real world
• Near 100% precision on top 300 users (Flipkart)
• Flagged users: suspicious • 400 ratings in 1 sec • 5000 good ratings and no
bad ratings
Alibaba 79 (c) C. Faloutsos, 2017
Dhivya Eswaran, Stephan Günnemann, Christos Faloutsos, “ZooBP: Belief Propagation for Heterogeneous Networks”, appearing in VLDB 2017
![Page 60: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/60.jpg)
CMU SCS
Summary of Part#1 • *many* patterns in real graphs
– Power-laws everywhere – Long (and growing) list of tools for anomaly/
fraud detection
Alibaba (c) C. Faloutsos, 2017 80
Patterns anomalies
![Page 61: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/61.jpg)
CMU SCS
(c) C. Faloutsos, 2017 81
Roadmap
• Introduction – Motivation • Part#1: Patterns in graphs • Part#2: time-evolving graphs
– P2.1: tools/tensors – P2.2: other patterns
• Conclusions
Alibaba
![Page 62: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/62.jpg)
CMU SCS
Alibaba (c) C. Faloutsos, 2017 82
Part 2: Time evolving
graphs; tensors
![Page 63: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/63.jpg)
CMU SCS
Graphs over time -> tensors! • Problem #2.1:
– Given who calls whom, and when – Find patterns / anomalies
Alibaba (c) C. Faloutsos, 2017 83
smith
![Page 64: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/64.jpg)
CMU SCS
Graphs over time -> tensors! • Problem #2.1:
– Given who calls whom, and when – Find patterns / anomalies
Alibaba (c) C. Faloutsos, 2017 84
![Page 65: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/65.jpg)
CMU SCS
Graphs over time -> tensors! • Problem #2.1:
– Given who calls whom, and when – Find patterns / anomalies
Alibaba (c) C. Faloutsos, 2017 85
Mon Tue
![Page 66: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/66.jpg)
CMU SCS
Graphs over time -> tensors! • Problem #2.1:
– Given who calls whom, and when – Find patterns / anomalies
Alibaba (c) C. Faloutsos, 2017 86 callee
caller
![Page 67: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/67.jpg)
CMU SCS
Graphs over time -> tensors! • Problem #2.1’:
– Given author-keyword-date – Find patterns / anomalies
Alibaba (c) C. Faloutsos, 2017 87 keyword
author
MANY more settings, with >2 ‘modes’
![Page 68: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/68.jpg)
CMU SCS
Graphs over time -> tensors! • Problem #2.1’’:
– Given subject – verb – object facts – Find patterns / anomalies
Alibaba (c) C. Faloutsos, 2017 88 object
subject
MANY more settings, with >2 ‘modes’
![Page 69: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/69.jpg)
CMU SCS
Graphs over time -> tensors! • Problem #2.1’’’:
– Given <triplets> – Find patterns / anomalies
Alibaba (c) C. Faloutsos, 2017 89 mode2
mode1
MANY more settings, with >2 ‘modes’ (and 4, 5, etc modes)
![Page 70: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/70.jpg)
CMU SCS
Answer : tensor factorization • Recall: (SVD) matrix factorization: finds
blocks
Alibaba (c) C. Faloutsos, 2017 90
N users
M products
‘meat-eaters’ ‘steaks’
‘vegetarians’ ‘plants’
‘kids’ ‘cookies’
~ + +
![Page 71: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/71.jpg)
CMU SCS
Answer: tensor factorization • PARAFAC decomposition
Alibaba (c) C. Faloutsos, 2017 92
= + + users
products
Meat-eaters vegetarians kids
![Page 72: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/72.jpg)
CMU SCS
Answer: tensor factorization • PARAFAC decomposition
Alibaba (c) C. Faloutsos, 2017 93
= + + users
products
user-model#1 User-model#2
User-model#3
![Page 73: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/73.jpg)
CMU SCS
Answer: tensor factorization • PARAFAC decomposition • Results for who-calls-whom-when
– 4M x 15 days
Alibaba (c) C. Faloutsos, 2017 95
= + + caller
callee
?? ?? ??
![Page 74: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/74.jpg)
CMU SCS Anomaly detection in time-
evolving graphs
• Anomalous communities in phone call data: – European country, 4M clients, data over 2 weeks
~200 calls to EACH receiver on EACH day!
1 caller 5 receivers 4 days of activity
Alibaba 96 (c) C. Faloutsos, 2017
=
![Page 75: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/75.jpg)
CMU SCS Anomaly detection in time-
evolving graphs
• Anomalous communities in phone call data: – European country, 4M clients, data over 2 weeks
~200 calls to EACH receiver on EACH day!
1 caller 5 receivers 4 days of activity
Alibaba 97 (c) C. Faloutsos, 2017
=
![Page 76: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/76.jpg)
CMU SCS Anomaly detection in time-
evolving graphs
• Anomalous communities in phone call data: – European country, 4M clients, data over 2 weeks
~200 calls to EACH receiver on EACH day!
1 caller 5 receivers 4 days of activity
Alibaba 98 (c) C. Faloutsos, 2017
=
![Page 77: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/77.jpg)
CMU SCS Anomaly detection in time-
evolving graphs
• Anomalous communities in phone call data: – European country, 4M clients, data over 2 weeks
~200 calls to EACH receiver on EACH day! Alibaba 99 (c) C. Faloutsos, 2017
=
Miguel Araujo, Spiros Papadimitriou, Stephan Günnemann, Christos Faloutsos, Prithwish Basu, Ananthram Swami, Evangelos Papalexakis, Danai Koutra. Com2: Fast Automatic Discovery of Temporal (Comet) Communities. PAKDD 2014, Tainan, Taiwan.
![Page 78: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/78.jpg)
CMU SCS
(c) C. Faloutsos, 2017 100
Roadmap
• Introduction – Motivation • Part#1: Patterns in graphs • Part#2: time-evolving graphs
– P2.1: tools/tensors – P2.2: other patterns – inter-arrival time
• Conclusions
Alibaba
![Page 79: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/79.jpg)
CMU SCS
RSC: Mining and Modeling Temporal Activity in Social Media
Alceu F. Costa* Yuto Yamaguchi Agma J. M. Traina
Caetano Traina Jr. Christos Faloutsos
Universidade de São Paulo
KDD 2015 – Sydney, Australia
![Page 80: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/80.jpg)
CMU SCS
Reddit Dataset Time-stamp from comments 21,198 users 20 Million time-stamps
Twitter Dataset Time-stamp from tweets 6,790 users 16 Million time-stamps
Pattern Mining: Datasets
For each user we have: Sequence of postings time-stamps: T = (t1, t2, t3, …) Inter-arrival times (IAT) of postings: (∆1, ∆2, ∆3, …)
102 t1 t2 t3 t4
∆1 ∆2 ∆3
time Alibaba (c) C. Faloutsos, 2017
![Page 81: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/81.jpg)
CMU SCS
Pattern Mining Pattern 1: Distribution of IAT is heavy-tailed
Users can be inactive for long periods of time before making new postings
IAT Complementary Cumulative Distribution Function (CCDF) (log-log axis)
103
105 106 10710−5
100
6, IAT (seconds)
CC
DF,
P(x
* 6
)
1 day 10 days105 106 107
10−5
100
6, IAT (seconds)
CC
DF,
P(x
* 6
)
1 day10 days
Reddit Users Twitter Users Alibaba (c) C. Faloutsos, 2017
![Page 82: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/82.jpg)
CMU SCS
Pattern Mining Pattern 1: Distribution of IAT is heavy-tailed
Users can be inactive for long periods of time before making new postings
IAT Complementary Cumulative Distribution Function (CCDF) (log-log axis)
104
105 106 10710−5
100
6, IAT (seconds)
CC
DF,
P(x
* 6
)
1 day 10 days105 106 107
10−5
100
6, IAT (seconds)
CC
DF,
P(x
* 6
)
1 day10 days
Reddit Users Twitter Users Alibaba (c) C. Faloutsos, 2017
No surprises – Should we give up?
![Page 83: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/83.jpg)
CMU SCS
Human? Robots?
log
linear
Alibaba (c) C. Faloutsos, 2017 105
![Page 84: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/84.jpg)
CMU SCS
Human? Robots?
log
linear 2’ 3h 1day
Alibaba (c) C. Faloutsos, 2017 106
![Page 85: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/85.jpg)
CMU SCS Experiments: Can RSC-Spotter Detect
Bots? Precision vs. Sensitivity Curves
Good performance: curve close to the top
107
0 0.2 0.4 0.6 0.8 10
0.20.40.60.8
1
Sensitivity (Recall)
Prec
isio
n
0 0.2 0.4 0.6 0.8 10
0.20.40.60.8
1
Sensitivity (Recall)
Prec
isio
n
RSC−Spotter
IAT Hist.Entropy [6]Weekday Hist.
All Features
Precision > 94% Sensitivity > 70%
With strongly imbalanced datasets # humans >> # bots
Alibaba (c) C. Faloutsos, 2017
![Page 86: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/86.jpg)
CMU SCS Experiments: Can RSC-Spotter Detect
Bots? Precision vs. Sensitivity Curves
Good performance: curve close to the top
108
0 0.2 0.4 0.6 0.8 10
0.20.40.60.8
1
Sensitivity (Recall)
Prec
isio
n
RSC−Spotter
IAT Hist.Entropy [6]Weekday Hist.
All Features
Precision > 96% Sensitivity > 47% With strongly imbalanced datasets # humans >> # bots
Alibaba (c) C. Faloutsos, 2017
![Page 87: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/87.jpg)
CMU SCS
(c) C. Faloutsos, 2017 109
Roadmap
• Introduction – Motivation • Part#1: Patterns in graphs • Part#2: time-evolving graphs
– P2.1: tools/tensors – P2.2: other patterns
• inter-arrival time • Network growth
• Conclusions
Alibaba
![Page 88: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/88.jpg)
Beyond Sigmoids: the NetTide Model for Social Network
Growth and its Applications
Chengxi Zang 臧承熙, Peng Cui, CF
110
![Page 89: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/89.jpg)
PROBLEM: n(t) and e(t), over time?
• n(t): the number of nodes. • e(t): the number of edges. • E.g.:
– How many members will have next month? – How many friendship links will have next year?
C
C/2
0
• Linear? • Exponential? • Sigmoid?
![Page 90: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/90.jpg)
Datasets • WeChat 2011/1-2013/1 300M nodes, 4.75B links • ArXiv 1992/3-2002/3 17k nodes, 2.4M links
• Enron 1998/1-2002/7 86K nodes, 600K links
• Weibo 2006 165K nodes, 331K links
![Page 91: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/91.jpg)
A: Power Law Growth
Cumulative growth(Log-Log scale)
![Page 92: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/92.jpg)
Proposed: NetTide Model
• Nodes n(t)
• Links e(t)
Details
![Page 93: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/93.jpg)
NetTide-Node Model
• Intuition: • Rich-get-richer • Limitation • Fizzling nature
115
Details
= SI; ~Bass
#nodes(t) Total population
dn(t)
dt=
�
t✓n(t) (N � n(t))
![Page 94: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/94.jpg)
NetTide-Node Model
• Intuition: • Rich-get-richer • Limitation • Fizzling nature
116
Details
= SI; ~Bass
#nodes(t) Total population
dn(t)
dt=
�
t✓n(t) (N � n(t))
![Page 95: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/95.jpg)
NetTide-Node Model
• Intuition: • Rich-get-richer • Limitation • Fizzling nature
117
Details
= SI; ~Bass
#nodes(t) Total population
dn(t)
dt=
�
t✓n(t) (N � n(t))
![Page 96: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/96.jpg)
Results: Accuracy
![Page 97: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/97.jpg)
Results: Accuracy
![Page 98: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/98.jpg)
Results: Accuracy
120
![Page 99: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/99.jpg)
Results: Accuracy
121
![Page 100: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/100.jpg)
Results: Accuracy
122
![Page 101: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/101.jpg)
Results: Forecast WeChat from 100 million
to 300 million 730 days ahead
123
![Page 102: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/102.jpg)
CMU SCS
Part 2: Conclusions
• Time-evolving / heterogeneous graphs -> tensors
• PARAFAC finds patterns • Surprising temporal patterns (P.L. growth)
Alibaba 124 (c) C. Faloutsos, 2017
=
![Page 103: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/103.jpg)
CMU SCS
(c) C. Faloutsos, 2017 125
Roadmap
• Introduction – Motivation – Why study (big) graphs?
• Part#1: Patterns in graphs • Part#2: time-evolving graphs; tensors • Acknowledgements and Conclusions
Alibaba
![Page 104: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/104.jpg)
CMU SCS
(c) C. Faloutsos, 2017 126
Thanks
Alibaba
Thanks to: NSF IIS-0705359, IIS-0534205, CTA-INARC; Yahoo (M45), LLNL, IBM, SPRINT, Google, INTEL, HP, iLab
Disclaimer: All opinions are mine; not necessarily reflecting the opinions of the funding agencies
![Page 105: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/105.jpg)
CMU SCS
(c) C. Faloutsos, 2017 127
Cast
Akoglu, Leman
Chau, Polo
Kang, U
Hooi, Bryan
Alibaba
Koutra, Danai
Beutel, Alex
Papalexakis, Vagelis
Shah, Neil
Araujo, Miguel
Song, Hyun Ah
Eswaran, Dhivya
Shin, Kijung
![Page 106: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/106.jpg)
CMU SCS
(c) C. Faloutsos, 2017 128
CONCLUSION#1 – Big data • Patterns Anomalies
• Large datasets reveal patterns/outliers that are invisible otherwise
Alibaba
![Page 107: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/107.jpg)
CMU SCS
(c) C. Faloutsos, 2017 130
CONCLUSION#2 – tensors
• powerful tool
Alibaba
=
1 caller 5 receivers 4 days of activity
![Page 108: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/108.jpg)
CMU SCS
(c) C. Faloutsos, 2017 131
References • D. Chakrabarti, C. Faloutsos: Graph Mining – Laws,
Tools and Case Studies, Morgan Claypool 2012 • http://www.morganclaypool.com/doi/abs/10.2200/
S00449ED1V01Y201209DMK006
Alibaba
![Page 109: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/109.jpg)
CMU SCS
TAKE HOME MESSAGE: Cross-disciplinarity
Alibaba (c) C. Faloutsos, 2017 132
=
![Page 110: Anomaly detection in large graphs - Carnegie Mellon …christos/TALKS/17-06-21-alibaba/faloutsos_alibaba... · Anomaly detection in large graphs Christos Faloutsos ... Alibaba (c)](https://reader031.fdocuments.in/reader031/viewer/2022022513/5aed6ded7f8b9a585f9031e4/html5/thumbnails/110.jpg)
CMU SCS
Cross-disciplinarity
Alibaba (c) C. Faloutsos, 2017 133
=
Thank you!
www.cs.cmu.edu/~christos/TALKS/17-06-21-alibaba/faloutsos_alibaba_2017.pdf