Multi-scale streaming anomalies detection for time series
B Ravi Kiran
Université Lille 3, CRIStAL, Lille, France
Streaming anomaly detection
- $x(t)$ are observations that arrive over time $t$.
- Problem: detect a window of $x(t)$ over $[t - p + 1 : t]$ which "deviates" from normal behavior.
- Motivation: how to become invariant to the window size/scale of pseudo-periodic structure in $x(t)$?
Streaming multi-scale Lag-matrix construction
$X_t^p = [x_t, x_{t-1}, \ldots, x_{t-p+1}]^T \in \mathbb{R}^p$
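The lag-vector construction above can be sketched in a few lines of plain Python. The function names `lag_vector` and `lag_matrix` are illustrative and do not come from the poster, and the sketch assumes a 0-indexed sequence and keeps only the time steps with a full window, so it yields $T - p + 1$ rows rather than $T$.

```python
def lag_vector(x, t, p):
    """Return [x[t], x[t-1], ..., x[t-p+1]] for a 0-indexed sequence x."""
    if t - p + 1 < 0:
        raise ValueError("need at least p samples up to time t")
    return [x[t - k] for k in range(p)]


def lag_matrix(x, p):
    """Stack the lag vectors of every t that has a full window, one per row."""
    return [lag_vector(x, t, p) for t in range(p - 1, len(x))]


x = [0.0, 1.0, 2.0, 3.0, 4.0]
rows = lag_matrix(x, p=2)  # first row is [x[1], x[0]]
```

In a streaming setting only `lag_vector` is needed, since each new sample produces one new row.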
Streaming PCA
- Dimensionality reduction for time series lag embedding
- Recursive update for principal subspace
- Linear Principal Component Analysis criterion: given $X^p \in \mathbb{R}^{T \times p}$, $w_p$ is defined as the 1-D projection capturing most of the energy of the samples:
  $w_p = \arg\min_{\|w\|=1} \sum_{t=1}^{T} \| X_t^p - (w w^T) X_t^p \|^2$
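A short worked step shows why this criterion picks out the direction of largest energy. The symbol $C$ for the unnormalized covariance (scatter) matrix is introduced here for illustration only and does not appear on the poster.

```latex
\begin{aligned}
\| X_t^p - w w^T X_t^p \|^2
  &= \| X_t^p \|^2 - 2\,(w^T X_t^p)^2 + (w^T X_t^p)^2 \, \|w\|^2
   = \| X_t^p \|^2 - (w^T X_t^p)^2
   \quad (\|w\| = 1), \\
\sum_{t=1}^{T} \| X_t^p - w w^T X_t^p \|^2
  &= \sum_{t=1}^{T} \| X_t^p \|^2 - w^T C\, w,
   \qquad C = \sum_{t=1}^{T} X_t^p (X_t^p)^T, \\
w_p &= \arg\max_{\|w\|=1} w^T C\, w .
\end{aligned}
```

The first sum does not depend on $w$, so minimizing the reconstruction error is equivalent to maximizing $w^T C w$, whose maximizer over unit vectors is the leading eigenvector of $C$ (up to sign).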
Multiscale streaming PCA
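Algorithm 1: Streaming PCA
Initialization: $w_j \leftarrow 0$, $\sigma_j^2 \leftarrow \epsilon$ with $\epsilon \ll 1$
for $t = 1, \ldots, T$ do
    for $j = 1, \ldots, J$ do
        $Z_t^j \leftarrow H_{2^j}^T X_t^j$
        $y_t^j \leftarrow w_j^T Z_t^j$
        $\sigma_j^2 \leftarrow \sigma_j^2 + (y_t^j)^2$
        $e_t^j \leftarrow Z_t^j - y_t^j w_j$
        $w_j \leftarrow w_j + \sigma_j^{-2} y_t^j e_t^j$
        $\tilde{Z}_t^j \leftarrow w_j^T Z_t^j w_j$
        $\alpha_t^j \leftarrow \| \tilde{Z}_t^j - Z_t^j \|^2$
    end for
end for
return $\boldsymbol{\alpha} \in \mathbb{R}^{T \times J}$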
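The inner loop of Algorithm 1 can be sketched for a single scale as follows. This is a minimal illustration rather than the author's implementation: the function name `streaming_pca_errors` is hypothetical, the input vectors are assumed to be already transformed ($Z_t^j$), and $w$ is started from a normalized constant vector, because a literal zero initialization gives $y = 0$ at every step and the update would never move $w$.

```python
import math


def streaming_pca_errors(vectors, eps=1e-6):
    """Track one principal direction and return (w, per-step squared errors)."""
    p = len(vectors[0])
    # Assumption: nonzero start (unit-norm constant vector), not w = 0.
    w = [1.0 / math.sqrt(p)] * p
    sigma2 = eps
    errors = []
    for z in vectors:
        y = sum(wi * zi for wi, zi in zip(w, z))       # projection y = w^T z
        sigma2 += y * y                                # accumulated energy
        e = [zi - y * wi for zi, wi in zip(z, w)]      # residual e = z - y w
        w = [wi + (y / sigma2) * ei for wi, ei in zip(w, e)]
        y_new = sum(wi * zi for wi, zi in zip(w, z))
        z_hat = [y_new * wi for wi in w]               # reconstruction (w^T z) w
        errors.append(sum((a - b) ** 2 for a, b in zip(z_hat, z)))
    return w, errors


# Fifty samples along (3, 4), then one sample orthogonal to that direction.
data = [[3.0, 4.0]] * 50 + [[-4.0, 3.0]]
w, errors = streaming_pca_errors(data)
```

On this toy input the reconstruction error stays near zero while the samples follow the learned direction and jumps for the final orthogonal sample, which is the behavior the anomaly score relies on.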
Given $x(t) \in \mathbb{R}$ for $t = 1 : T$
- We evaluate the lag-matrix $X \in \mathbb{R}^{T \times p}$ where $p = 2^j$.
- For each $X_t \in \mathbb{R}^p$ we perform a change of basis $Z_t := \Phi^T X_t$.
- $H$ refers to a unitary transform that can:
  - localize a deviation from the local mean and variations;
  - preserve the variance.
- Haar transform $\Phi = H$:
  $H_{2N} = \frac{1}{\sqrt{2}} \begin{bmatrix} H_N \otimes [1, 1] \\ I_N \otimes [1, -1] \end{bmatrix}$
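The Haar recursion above can be unrolled into a small pure-Python sketch. The base case $H_1 = [1]$ and the names `haar_matrix` and `haar_transform` are assumptions made for this example. The rows of the returned matrix are the basis vectors, so the sketch applies $H x$; whether that matches $H^T X_t$ on the poster depends on the row/column convention, and both preserve the norm because $H$ is orthogonal.

```python
import math


def haar_matrix(n):
    """Orthonormal n x n Haar matrix for n a power of two (rows are basis vectors)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]  # assumed base case H_1 = [1]
    s = 1.0 / math.sqrt(2.0)
    while len(h) < n:
        m = len(h)
        # Top block: H_m kron [1, 1], i.e. every entry repeated twice.
        top = [[s * v for v in row for _ in (0, 1)] for row in h]
        # Bottom block: I_m kron [1, -1], i.e. local pairwise differences.
        bottom = [[0.0] * (2 * m) for _ in range(m)]
        for i in range(m):
            bottom[i][2 * i] = s
            bottom[i][2 * i + 1] = -s
        h = top + bottom
    return h


def haar_transform(x):
    """Apply the Haar matrix of matching size to the vector x."""
    h = haar_matrix(len(x))
    return [sum(hij * xj for hij, xj in zip(row, x)) for row in h]


z = haar_transform([1.0, 2.0, 3.0, 4.0])  # z[0] is the scaled local mean
```

The first coefficient carries the (scaled) mean of the window and the remaining ones carry differences at successively finer scales, which is what lets the transform localize a deviation.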
Hierarchical (Approximation) PCA
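[Figure: hierarchical projection. Four blocks of size $2^{j-1}$ are each projected with $w_{j-1}^T$, the results are combined pairwise and projected with $w_j^T$, and the two outputs are projected with $w_{j+1}^T$. The diagram also carries the label "Future".]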
For scale $p_{j+1} = 2 p_j$, a reduced lag matrix $Z_t^{j+1}$ is obtained by projection of each component of size $p_j$ on the principal direction obtained at this scale, i.e. $Z_t^{j+1} = [w_j^T Z_t^j, \; w_j^T Z_{t-2^j}^j]^T$ with $Z_t^1 = X_t^1$. The principal direction is updated after this step.
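One step of this hierarchical construction can be sketched as below. The helper name `next_scale_vector` and the idea of keeping the scale-$j$ vectors in a time-indexed list are assumptions for the example; the poster does not specify how the history is stored.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def next_scale_vector(z_hist, w, t, j):
    """Build Z_t^{j+1} = [w^T Z_t^j, w^T Z_{t-2^j}^j] from scale-j history.

    z_hist[t] is assumed to hold the scale-j vector at time t (0-indexed),
    and w is the principal direction currently tracked at scale j.
    """
    lag = 2 ** j
    if t - lag < 0:
        raise ValueError("not enough history at this scale")
    return [dot(w, z_hist[t]), dot(w, z_hist[t - lag])]


# Scale j = 1: combine the projections at times t = 2 and t - 2 = 0.
history = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
z_next = next_scale_vector(history, w=[1.0, 0.0], t=2, j=1)
```

Each coarser scale therefore works on a 2-vector of projections, so the cost per scale stays constant as the window length doubles.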
Global flow
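[Flow diagram: the input $x_t$ feeds the lag vectors $X_t^{2^j}$ for $j = 1, \ldots, J$. Each scale updates its own $w_j$ and emits the reconstruction error $\alpha_t^j = \| X_t^j - \tilde{X}_t^j \|$. The per-scale errors $\boldsymbol{\alpha}_t$ are then aggregated by one of:]
- Norm: $\| \boldsymbol{\alpha}_t \|^2$
- 2nd iteration: $\| \tilde{\boldsymbol{\alpha}}_t - \boldsymbol{\alpha}_t \|^2$
- Least corr. scale: $\arg\min_j \sum_i (\boldsymbol{\alpha}^T \boldsymbol{\alpha})_{ji}$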
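Two of the aggregation rules in the flow, the norm and the least correlated scale, can be sketched directly from their formulas. The function names are illustrative, `alpha` is assumed to be a $T \times J$ list of per-scale errors, and the 2nd-iteration rule is left out because it needs a second pass of streaming PCA over the error vectors.

```python
def norm_score(alpha):
    """Per-time score ||alpha_t||^2 from a T x J list of per-scale errors."""
    return [sum(a * a for a in row) for row in alpha]


def least_correlated_scale(alpha):
    """Index j minimizing sum_i (alpha^T alpha)_{ji}."""
    n_scales = len(alpha[0])
    # Gram matrix of the error columns: gram[j][i] = sum_t alpha[t][j] * alpha[t][i]
    gram = [[sum(row[j] * row[i] for row in alpha) for i in range(n_scales)]
            for j in range(n_scales)]
    row_sums = [sum(g) for g in gram]
    return min(range(n_scales), key=row_sums.__getitem__)


alpha = [[1.0, 1.0, 0.0],
         [1.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
scores = norm_score(alpha)
best = least_correlated_scale(alpha)
```

In this toy input the first two scales produce identical errors, so the third scale, which shares nothing with them, has the smallest row sum and is selected.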
AUC performances across different aggregation methods
Performance evaluation: area under the receiver operating characteristic curve (AUC), which plots TPR against FPR; 0 (worst) and 1 (perfect).
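For reference, the AUC of a set of anomaly scores can be computed without tracing the ROC curve, using the equivalent pairwise-ranking form. The function name `auc` and the label convention (1 for anomalous, 0 for normal) are assumptions for this sketch.

```python
def auc(scores, labels):
    """Probability that a random anomaly outscores a random normal point.

    Ties count one half; this equals the area under the ROC curve.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


value = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 3 of 4 pairs ranked correctly
```

The pairwise form is quadratic in the number of samples, which is fine for a small check but would be replaced by a sort-based computation on long series.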
Effect of 2nd iteration of Streaming PCA
Least correlated scale vs. 2nd iteration of Streaming PCA:
- Both the least correlated scale and the 2nd iteration of streaming PCA decorrelate the reconstruction error across scales.
- The least correlated scale performs better when there is a single-scale structure across the time series. The second iteration performs better when there are multiple scales.
Future work
- Understand bounds on the reconstruction error $\boldsymbol{\alpha}(t)$ for Streaming PCA.
- Better baseline by comparing with streaming covariance estimation.
- Probabilistic anomaly score using a Gaussian negative-log score.
- Recursively calculable multi-scale time series representation $H^T X_t$ to model long-range dependencies.
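As one possible reading of the Gaussian negative-log score item, the sketch below scores each new value under a Gaussian fitted to the values seen so far. This is an assumption about what such a score could look like, not something specified on the poster: the class name `RunningGaussianScore`, the use of Welford's running mean and variance, and returning `None` until a variance estimate exists are all choices made for the example.

```python
import math


class RunningGaussianScore:
    """Negative log-density of each value under a Gaussian fitted to past values."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def score_and_update(self, value):
        """Score `value` against the past, then absorb it into the estimate."""
        score = None  # undefined until a positive variance estimate exists
        if self.n >= 2 and self.m2 > 0.0:
            var = self.m2 / (self.n - 1)
            score = (0.5 * math.log(2.0 * math.pi * var)
                     + (value - self.mean) ** 2 / (2.0 * var))
        # Welford's update of the running mean and squared deviations.
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)
        return score


scorer = RunningGaussianScore()
scores = [scorer.score_and_update(v) for v in [1.0, 2.0, 3.0, 10.0]]
```

Fed with a stream of reconstruction errors, a large score would flag a value that is unlikely under the running estimate.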
https://beedotkiran.github.io/anomaly.html