ANOMALY DETECTION ON STREAMING DATA · DeepAR (AWS) UNIVARIATE APPROACH Prophet (Facebook)...
ANOMALY DETECTION ON STREAMING DATA
WHEN BIG DATA CHALLENGES MEET MACHINE LEARNING
PARIS BIG DATA MARCH 12TH, 2019
SPORT, CINEMA, SERIES & MORE
16M SUBSCRIBERS
€5.2B REVENUE
40 data experts: within a group of 500 digital experts (Ekino Group)
Founded in 2009: more than 100 projects delivered
International: Paris - London - Singapore
Mathematics DNA: founded by 2 mathematicians incl. a Fields Medal holder
« Since 2009, we have helped our clients to model both their strategic and operational challenges, and to solve them with tailored solutions using data and AI. »
EXPERTISES: Strategy, Solutions, Foundations, Scale
VALUES: Excellence, Integrity, Enthusiasm, Agility
MFG Labs / CANAL+
STREAM PROCESSING
EVENTS
event type: start, stop, ...
service: SVoD, live TV, ...
timestamp
app version
platform: iOS, Android, PC, ...
140M EVENTS / DAY
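To make the event fields above concrete, here is a minimal Python sketch of one event record and of the hourly counting that turns raw events into time series. The class name, field names and the choice of grouping key are assumptions for illustration only; the slides do not show the real schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PlayerEvent:
    """One tracking event; field names are illustrative, not the real schema."""
    event_type: str      # e.g. "start", "stop"
    service: str         # e.g. "SVoD", "live TV"
    timestamp: datetime
    app_version: str
    platform: str        # e.g. "iOS", "Android", "PC"


def hourly_counts(events):
    """Count events per (platform, service, event type, app version, hour)."""
    counts = Counter()
    for e in events:
        # Truncate the timestamp to the start of its hour.
        hour = e.timestamp.replace(minute=0, second=0, microsecond=0)
        counts[(e.platform, e.service, e.event_type, e.app_version, hour)] += 1
    return counts


if __name__ == "__main__":
    t = datetime(2019, 3, 12, 9, 15, tzinfo=timezone.utc)
    sample = [PlayerEvent("start", "SVoD", t, "2.3.5", "iOS")]
    print(hourly_counts(sample))
```

Each key of the returned counter identifies one hourly time series, which is the granularity used in the plots later in the deck.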
ANOMALY DETECTION IN HOSTILE ENVIRONMENT POORLY DEFINED LOGS SEMANTICS
STREAMING ENVIRONMENT WHAT IS NORMALITY?
DATA ISSUES
- Regressions
  - Missing events
  - Repeated events
  - Changed event data
- Incidents
  - Partial data loss
  - Total data loss
PROGRESSIVE DEPLOYMENT DOESN'T REVEAL ISSUES
MVP & ITERATE
EXPERIMENT
?
EXPERIMENT
1. Random Cut Forest - AWS Kinesis
2. DeepAR - AWS SageMaker
3. Prophet - Facebook
METHODOLOGY
KPI DEFINITION & MODELLING
PROTOTYPING
DRY RUN
DATA COLLECTION & ANALYSIS
BUSINESS UNDERSTANDING
INSIGHT #1 – FREQUENT RELEASES & MULTIPLE VERSIONS
INSIGHT #2 – VERSION-SPECIFIC ANOMALIES
[Figure: two panels (log scale, linear scale), one series per app version; iOS, live TV, forwardButtonPressed, hour resolution]
INSIGHT #3 – #RECORDS / #DEVICES
[Figure: iOS, playerError, 2.3.5, hour resolution]
ANALYSIS – KEY LEARNINGS
Compare time series with similarity tests
Create a complete data dictionary
Explore the data and learn patterns
Define an indicator to identify anomalies
Try different aggregation levels
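One possible reading of "define an indicator", following Insight #3 (#records / #devices), is a per-hour ratio of logged records to distinct devices. The sketch below is only an illustration: the `device_id` field and the `(hour, device_id)` input layout are assumptions, since the slides do not show how devices are identified.

```python
from collections import defaultdict


def records_per_device(rows):
    """rows: iterable of (hour, device_id) pairs, one pair per logged record.

    Returns {hour: number of records / number of distinct devices}.
    """
    records = defaultdict(int)
    devices = defaultdict(set)
    for hour, device_id in rows:
        records[hour] += 1
        devices[hour].add(device_id)
    return {hour: records[hour] / len(devices[hour]) for hour in records}


if __name__ == "__main__":
    rows = [(0, "a"), (0, "a"), (0, "b"), (1, "a")]
    print(records_per_device(rows))
```

A ratio like this is less sensitive to audience size than a raw count, so a burst of repeated events from a few devices stands out even when overall traffic is normal.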
ANOMALY SCORING – ROBUST RANDOM CUT FOREST
1. Build a forest of binary trees, each on a subset of time series
2. Use the forest to compute an anomaly score for a new point
BUILDING A TREE
• Pick a dimension and do a random cut
• Repeat until all points are isolated
SCORING A POINT
• Inject a new point in each tree
• Measure how much the forest changes, i.e. how shallow the new cuts are
[Figure: a random cut separating subsets S1 and S2, with point A]
http://proceedings.mlr.press/v48/guha16.pdf
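The two steps of the slide can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of the paper linked above nor the AWS implementation: it scores a point by how shallow the random cuts that isolate it are, averaged over many random subsamples, rather than by the displacement measure defined in the paper. The function names, `n_trees`, `sample_size` and the `1 / (1 + depth)` score are all assumptions.

```python
import random


def isolation_depth(points, target, rng):
    """Number of random cuts needed to separate `target` from `points`.

    Exact duplicates of `target` are dropped, so this sketch assumes
    real-valued, mostly distinct data.
    """
    pts = [p for p in points if p != target] + [target]
    dims = list(range(len(target)))
    depth = 0
    while len(pts) > 1:
        lo = [min(p[d] for p in pts) for d in dims]
        hi = [max(p[d] for p in pts) for d in dims]
        spans = [h - l for l, h in zip(lo, hi)]
        # Pick a dimension with probability proportional to its range,
        # then cut it at a uniformly random position.
        d = rng.choices(dims, weights=spans)[0]
        cut = rng.uniform(lo[d], hi[d])
        kept = [p for p in pts if (p[d] <= cut) == (target[d] <= cut)]
        if len(kept) < len(pts):  # count only cuts that split something off
            depth += 1
        pts = kept
    return depth


def anomaly_score(history, point, n_trees=50, sample_size=64, seed=0):
    """Higher score means the point is isolated by shallower cuts."""
    rng = random.Random(seed)
    depths = []
    for _ in range(n_trees):
        sample = rng.sample(history, min(sample_size, len(history)))
        depths.append(isolation_depth(sample, point, rng))
    return 1.0 / (1.0 + sum(depths) / n_trees)


if __name__ == "__main__":
    grid = [(float(x), float(y)) for x in range(10) for y in range(10)]
    print(anomaly_score(grid, (4.5, 4.5)), anomaly_score(grid, (100.0, 100.0)))
```

A point far from the bulk of the data is usually cut off within the first one or two cuts, so its average depth is small and its score is high.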
RESULTS
[Figure: Android, VoD, hour resolution]
• Peak is correctly detected as an anomaly
• Drops are also detected
FORECASTING – DEEPAR, PROPHET
1. Train a model to predict the time series accurately
2. Compare the real value with the prediction to decide whether it is an anomaly
MULTIVARIATE APPROACH DeepAR (AWS)
• Use all time series as inputs
• Add exogenous variables
• Train an LSTM network
UNIVARIATE APPROACH Prophet (Facebook)
• Build a model for each time series
• Predict with an additive regression
• Handles seasonality and trends
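The forecast-then-compare idea can be illustrated without either library. The sketch below swaps in a deliberately naive forecaster, the mean value per hour of day, as a stand-in for DeepAR or Prophet, and flags a value when it deviates from the forecast by more than `k` standard deviations. The function names and the `k` and `min_std` parameters are assumptions, not settings from the talk.

```python
from statistics import mean, pstdev


def fit_hourly_profile(history):
    """history: list of (hour_of_day, value) pairs.

    Returns {hour_of_day: (mean, standard deviation)} as a naive forecast.
    """
    by_hour = {}
    for hour, value in history:
        by_hour.setdefault(hour, []).append(value)
    return {h: (mean(v), pstdev(v)) for h, v in by_hour.items()}


def is_anomaly(profile, hour, actual, k=3.0, min_std=1.0):
    """Flag `actual` if it is more than k standard deviations from the forecast.

    `min_std` is a floor that avoids flagging tiny deviations on very
    regular series.
    """
    forecast, std = profile[hour]
    return abs(actual - forecast) > k * max(std, min_std)


if __name__ == "__main__":
    # Three synthetic days with a linear daily pattern and small noise.
    history = [(h, 100 + 10 * h + d) for d in (-1, 0, 1) for h in range(24)]
    profile = fit_hourly_profile(history)
    print(is_anomaly(profile, 5, 151), is_anomaly(profile, 5, 20))
```

With a real forecaster the comparison step stays the same: only the source of the forecast and of its uncertainty band changes.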
RESULTS
[Figure: Android, VoD, hour resolution]
• Correct prediction of daily patterns
• Better accuracy and smoothness with Prophet
• More difficult to implement than RCF on AWS
DEPLOYMENT & COLD START
• Just a SQL function!
• Can be called from Java (Flink)
• Limited memory – sliding window
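Because the scorer only sees a limited sliding window, it has little to compare against right after a restart. One way to anticipate this cold start, sketched here in Python rather than SQL or Java, is to withhold scores until the window is full. The `WarmupGate` class, its `window_size` parameter and the `score_fn` callback are hypothetical names, not part of the deployed system.

```python
from collections import deque


class WarmupGate:
    """Keep a bounded sliding window and withhold scores until it is full."""

    def __init__(self, window_size, score_fn):
        self.window = deque(maxlen=window_size)  # bounded memory
        self.score_fn = score_fn                 # called as score_fn(window, value)

    def update(self, value):
        """Return a score for `value`, or None while the window is still filling."""
        ready = len(self.window) == self.window.maxlen
        score = self.score_fn(list(self.window), value) if ready else None
        self.window.append(value)  # the oldest value is evicted automatically
        return score


if __name__ == "__main__":
    # Toy score: deviation from the mean of the current window.
    gate = WarmupGate(3, lambda w, v: v - sum(w) / len(w))
    print([gate.update(x) for x in [1, 2, 3, 10, 3]])
```

Returning `None` during warm-up lets the alerting layer distinguish "no anomaly" from "not enough history yet".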
KEY TAKEAWAYS
1. ENGINEER A FEATURE TO IDENTIFY ANOMALIES
2. ANTICIPATE COLD START IN PRODUCTION
3. ORGANIZE ANOMALY MANAGEMENT
THANK YOU!
QUESTIONS?