Transcript of Xu anomaly UMN 20130220 final.pptx
Anomaly detection under multiple criteria
Kevin S. Xu, 3M February 20, 2013
Software, Electronic, and Mechanical Systems Laboratory
Outline • Overview of 3M Computational Intelligence laboratory
• Technical focus: anomaly detection under multiple criteria – Traditional anomaly detection – Challenges introduced by multiple criteria – Experiment on pedestrian trajectories
Computational Intelligence laboratory • Mission: build and grow 3M businesses through innovative computational algorithms that learn from data
• Lab members: – Brian Stankiewicz, PhD, UCLA, Cognitive Science – Eric Lobner, PhD candidate, Minnesota, Computer Science
– Jennifer Schumacher, PhD, Minnesota, Neuroscience – Ravi Sivalingam, PhD, Minnesota, Electrical Engineering – Guru Somasundaram, PhD, Minnesota, Computer Science – Kevin Xu, PhD, Michigan, Electrical Engineering: Systems – Anthony Sabelli, PhD, Cornell, Applied Math
Visual Attention Model
Probability of getting attention during first 3 to 5 seconds
Comparison with eye tracking
Figure panels: 3M VAM prediction; eye tracking data
B. J. Stankiewicz, N. J. Anderson, and R. J. Moore (2011). Using performance efficiency for testing and optimization of visual attention models. Proceedings of SPIE 7867, Image Quality and System Performance VIII, pp. 78670Y.
https://vas.3m.com/ http://www.youtube.com/3MVisualAttention
Traffic sign management
Other technologies • We work on a variety of problems involving – Classification – Anomaly detection – Time series models – Computer vision – Inverse problems – Ubiquitous sensing – …
ANOMALY DETECTION UNDER MULTIPLE CRITERIA
Joint work with K.-J. Hsiao, J. Calder, and A. O. Hero III (University of Michigan)
Anomaly detection
• Anomaly detection: automatically detecting significant deviations from nominal behavior
• Example: which one of these groups of pedestrian trajectories is anomalous?
Anomalous trajectories Nominal trajectories
Traditional anomaly detection • Many approaches: – Nearest neighbor-based – Clustering-based – Statistical modeling – …
• Many application settings: – Fraud detection – Medical informatics – Detecting device failures or malfunctions – …
• We focus on unsupervised anomaly detection – Unlabeled training set of mostly nominal data
Nearest-neighbor anomaly detection • Typically a variant of the following algorithm: – Training phase: • Obtain a training set of mostly nominal data samples • For each training sample, compute dissimilarity with k nearest neighboring samples
– Test phase: • For each test sample, compute dissimilarity with k nearest training samples • If dissimilarity exceeds some threshold, declare test sample to be anomalous
• Requires user to pick a single dissimilarity measure
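The two phases above can be sketched as follows. This is a minimal illustration, not the speaker's implementation: averaging the k nearest dissimilarities, using Euclidean distance, and setting the threshold to the largest training score are all assumptions made for the example.

```python
import math

def knn_score(x, reference, k, dissimilarity):
    """Mean dissimilarity between x and its k nearest samples in reference."""
    dists = sorted(dissimilarity(x, r) for r in reference)
    return sum(dists[:k]) / k

def euclidean(a, b):
    # Assumed dissimilarity measure; the user must pick a single one.
    return math.dist(a, b)

# Training phase: score each training sample against the other training samples.
train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]
k = 2
train_scores = [
    knn_score(x, train[:i] + train[i + 1:], k, euclidean)
    for i, x in enumerate(train)
]

# Hypothetical threshold rule: the largest score seen on the training set.
threshold = max(train_scores)

def is_anomalous(x):
    """Test phase: compare the score against the k nearest training samples."""
    return knn_score(x, train, k, euclidean) > threshold

print(is_anomalous((0.04, 0.06)))  # sample close to the training data
print(is_anomalous((1.0, 1.0)))    # sample far from the training data
```

In practice the threshold would be chosen to trade off false alarms against missed detections.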
Multi-criteria anomaly detection • Complex data sets may require multiple dissimilarity measures corresponding to multiple criteria
• Example: pedestrian trajectories
• 2 possible criteria: – Dissimilarity in shapes of trajectories – Dissimilarity in walking speeds
First attempt: convex combinations • First attempt at multi-criteria anomaly detection: – Scalarization: take convex combination of dissimilarity measures
– How do we choose weight in unsupervised setting?
– Sweep over entire range and perform traditional (single-criterion) anomaly detection for each choice of weight
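The weight sweep can be sketched as below for the two-criteria case. The function names and the example dissimilarity values are hypothetical. Each combined value would then feed a traditional single-criterion detector, one detector per weight.

```python
def convex_combination(d_shape, d_speed, w):
    """Scalarize two dissimilarities with weight w in [0, 1]."""
    return w * d_shape + (1.0 - w) * d_speed

def sweep_weights(num_weights):
    """Uniformly spaced weights covering [0, 1], end points included."""
    return [i / (num_weights - 1) for i in range(num_weights)]

# Hypothetical dissimilarities between one test sample and one training sample.
d_shape, d_speed = 0.02, 0.05

for w in sweep_weights(5):
    combined = convex_combination(d_shape, d_speed, w)
    print(f"w={w:.2f} combined dissimilarity={combined:.4f}")
```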
Scalarization and Pareto fronts • Alternative approach: examine Pareto fronts • A multi-criteria optimization problem: – Given items and functions (criteria), select the item that minimizes all of the functions
– Typically cannot simultaneously optimize all functions (criteria) → no single optimizer
– An item is Pareto-optimal if no other item is superior in every criterion • No other item that is better for all criteria • Pareto front: set of all Pareto-optimal items
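A Pareto front for a minimization problem can be computed with a direct pairwise check, as in the sketch below. It assumes the standard dominance relation (no worse in every criterion and strictly better in at least one), which is slightly stricter than the wording on the slide. The example items are made up.

```python
def dominates(a, b):
    """True if a is no worse than b in every criterion and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points that no other point dominates (smaller is better)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical items, each described by two criteria to be minimized.
items = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
print(pareto_front(items))  # (3, 3) and (5, 5) are dominated by (2, 2)
```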
Scalarization and Pareto fronts • Properties of Pareto front: – Contains all optimizers found by scalarization (taking convex combinations)
– Contains other items that cannot be found by scalarization
– Scalarization only identifies items on the convex portion of the Pareto front
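A small made-up example illustrates the last property. The three points below are mutually non-dominated, so all of them are Pareto-optimal, but the middle point lies on a non-convex portion of the front. A sweep over convex combinations never selects it, because its combined value is always 0.6 while one of the end points always scores at most 0.5.

```python
def scalarization_minimizers(points, num_weights=101):
    """Collect the minimizer of w*f1 + (1-w)*f2 for each weight on a grid."""
    found = set()
    for i in range(num_weights):
        w = i / (num_weights - 1)
        best = min(points, key=lambda p: w * p[0] + (1 - w) * p[1])
        found.add(best)
    return found

# Hypothetical two-criteria items; none dominates another.
points = [(0.0, 1.0), (0.6, 0.6), (1.0, 0.0)]

found = scalarization_minimizers(points)
missed = [p for p in points if p not in found]
print("found by scalarization:", sorted(found))
print("Pareto-optimal but missed:", missed)
```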
Properties of Pareto fronts • : Pareto front (set of Pareto-optimal points) • : optimal points identified by scalarization • How large is ? – Assume i.i.d. samples with density that is zero outside of a bounded set
If is non-convex and satisfies the conditions of Thm. 1, then for large , scalarization fails to identify on the order of points
Even if is convex, the Pareto front can still be non-convex. For large , scalarization fails to identify on the order of points
Pareto depth analysis (PDA) • So far we have looked at the first Pareto front • We can compute deeper Pareto fronts – Remove all points from first front – Find Pareto front on remaining points – Remove all points from second front – Find Pareto front on remaining points
– … • Pareto depth analysis (PDA) first proposed by Hero and Fleury (2004)
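The peeling procedure above can be sketched as a loop that repeatedly extracts the front of the remaining points. This is a naive illustration under the standard dominance relation (an assumption), not an efficient or authoritative PDA implementation. The example points are made up.

```python
def dominates(a, b):
    """True if a is no worse than b in every criterion and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_depths(points):
    """Return the front number of each point (1 = first Pareto front)."""
    remaining = set(range(len(points)))
    depths = {}
    depth = 0
    while remaining:
        depth += 1
        # Points of the remaining set that no other remaining point dominates.
        front = {
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining)
        }
        for i in front:
            depths[i] = depth
        remaining -= front  # remove this front, then repeat on what is left
    return [depths[i] for i in range(len(points))]

points = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5), (2, 5)]
print(pareto_depths(points))
```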
A. O. Hero III and G. Fleury (2004). Pareto-optimal methods for gene ranking. The Journal of VLSI Signal Processing 38(3):259–275.
Connection to anomaly detection • Q: What do Pareto fronts have to do with multi-criteria anomaly detection?
• A: We can use the Pareto front depth as a “combined” dissimilarity measure – Transform dissimilarities between samples into dyads in K-dimensional space
– Compute Pareto fronts on dyads
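One way to picture a dyad is as the vector of all K dissimilarities for a pair of samples, so each pair becomes a single point in K-dimensional space. The sketch below builds dyads for every pair. The sample summaries and the two stand-in dissimilarity functions are hypothetical, not the measures used in the talk.

```python
def dyads(samples, criteria):
    """Return {(i, j): (d_1(i, j), ..., d_K(i, j))} for all pairs i < j."""
    return {
        (i, j): tuple(d(samples[i], samples[j]) for d in criteria)
        for i in range(len(samples))
        for j in range(i + 1, len(samples))
    }

# Hypothetical samples: (path_length, mean_speed) summaries of trajectories.
samples = [(10.0, 1.2), (11.0, 1.3), (10.5, 3.0)]

criteria = [
    lambda a, b: abs(a[0] - b[0]),  # stand-in for shape dissimilarity
    lambda a, b: abs(a[1] - b[1]),  # stand-in for walking speed dissimilarity
]

for pair, dyad in dyads(samples, criteria).items():
    print(pair, dyad)
```

Pareto fronts are then computed on the dyads rather than on the samples themselves.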
Distributions of Pareto front depths • Pareto front depths of nominal samples are shallower than those of anomalous samples
• Compute anomaly score for each test sample – Anomaly score = average depth of dyads corresponding to test sample
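Given the depths of the dyads associated with a test sample, the score is a plain average. The depth values and the threshold below are invented for illustration. How each test dyad is assigned a depth is not covered by this sketch.

```python
def anomaly_score(dyad_depths):
    """Average Pareto front depth of the dyads belonging to one test sample."""
    return sum(dyad_depths) / len(dyad_depths)

# Hypothetical depths of the dyads for two test samples.
nominal_depths = [1, 2, 2, 3]
anomalous_depths = [7, 9, 8, 10]

threshold = 5.0  # hypothetical decision threshold

for name, depths in [("nominal", nominal_depths), ("anomalous", anomalous_depths)]:
    score = anomaly_score(depths)
    print(name, score, "anomalous" if score > threshold else "nominal")
```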
Figure: 40 training samples and 2 test samples. Panels: dyads of nominal test sample (o); dyads of anomalous test sample (Δ)
Experiment: pedestrian trajectories • 500 training trajectories, 200 test trajectories • Trajectories have differing lengths • Use 2 criteria: – Dissimilarities in shapes of trajectories – Dissimilarities in walking speeds
Comparison of results • We compare PDA to traditional anomaly detection with 100 uniformly spaced weights
• PDA performs slightly better than best weight
• PDA performs much better than median weight
• Best weight is unknown in practice – Median weight is a better representation of traditional anomaly detection performance
More results • Pareto fronts of dyads are highly non-convex
• Recall: scalarization can only identify convex portion of Pareto front
Figure axes: walking speed dissimilarity; shape dissimilarity. Labels: anomalous trajectories; nominal trajectories
Summary • 3M Computational Intelligence lab works on many problems that involve learning from data (including anomaly detection)
• Anomaly detection with multiple criteria can be performed using Pareto depth analysis (PDA) – Better performance than taking convex combinations of the multiple criteria
• Future work – Can we create a faster PDA algorithm by approximating the Pareto fronts?
– Can we modify the PDA algorithm so it can be efficiently updated as more training data is received?
K.-J. Hsiao, K. S. Xu, J. Calder, and A. O. Hero III (2012). Multi-criteria anomaly detection using Pareto depth analysis. In Advances in Neural Information Processing Systems 25, pp. 854-862.
ADDITIONAL SLIDES
Simulation experiment • 300 training samples, 100 test samples • Nominal distribution: Uniform on the hypercube
• Anomalous distribution: Differs in one anomalous dimension (uniform on )
• 4 criteria: squared differences in each dimension
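With one criterion per dimension, the dyad for a pair of samples is the vector of per-dimension squared differences. The sketch below assumes a unit hypercube for the nominal samples, since the exact ranges are not preserved in this transcript. The function names and the random seed are also illustrative only.

```python
import random

def squared_difference_dyad(a, b):
    """One criterion per dimension: the squared difference in that dimension."""
    return tuple((x - y) ** 2 for x, y in zip(a, b))

def sample_nominal(rng, dims=4):
    # Assumption: nominal samples are uniform on the unit hypercube [0, 1]^dims.
    return tuple(rng.random() for _ in range(dims))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
a, b = sample_nominal(rng), sample_nominal(rng)
print(squared_difference_dyad(a, b))  # a point in 4-dimensional dyad space
```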
Figure labels: nominal region; anomalous region (two regions)
Advantages and disadvantages of PDA • Advantages – Utilizes Pareto fronts, which are superior to scalarization for multi-criteria optimization
– Scales linearly in the number of criteria • Sweeping over linear combinations for scalarization using a grid search is exponential in the number of criteria
• Disadvantages – Computing all Pareto fronts in training phase requires comparisons and floating-point operations (worst-case)
– No known efficient method to update anomaly detector as more training data is received