ICDM'07 1 Depth-Based Novelty Detection Yixin Chen Dept. of Computer and Information Science University of Mississippi http://www.cs.olemiss.edu/~ychen Joint work with Henry Bart, Xin Dang, and Hanxiang Peng

ICDM'07 1

Depth-Based Novelty Detection

Yixin Chen
Dept. of Computer and Information Science
University of Mississippi
http://www.cs.olemiss.edu/~ychen

Joint work with Henry Bart, Xin Dang, and Hanxiang Peng

Outline

Novelty detection
Motivations
Kernelized spatial depth (KSD)
Bounds on the false alarm probability
Empirical studies
Discussions

Outlier Detection

Missing label problem

One-class learning

A Simple Outlier Detector

1-d example

Sensitivity

Threshold

Structure of the data

[Figure: 1-d example showing observations (X), the mean, the median, and a query point (?)]
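
The sensitivity issue on this slide can be illustrated with a small sketch. It assumes a detector that flags a point when its distance from a center exceeds a threshold; the function name `flag_outliers`, the sample values, and the threshold of 5.0 are made up for illustration and are not from the talk.

```python
import numpy as np

def flag_outliers(values, center, threshold):
    """Flag points whose distance from `center` exceeds `threshold`."""
    values = np.asarray(values, dtype=float)
    return np.abs(values - center) > threshold

# Made-up 1-d sample: five tightly clustered points and one extreme value.
data = np.array([1.0, 1.2, 0.9, 1.1, 1.0, 50.0])

# Same rule and threshold, two choices of center.
flags_mean = flag_outliers(data, data.mean(), threshold=5.0)
flags_median = flag_outliers(data, np.median(data), threshold=5.0)

print("center = mean:  ", flags_mean)
print("center = median:", flags_median)
```

In this sample the extreme value pulls the mean to about 9.2, so every point ends up farther than 5.0 from the center and is flagged. The median stays at 1.05 and only the value 50.0 is flagged.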

Median

The sign function

Median is
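
For reference, a standard way to write these two definitions is given below. The notation is an assumption and may differ from the slide; the characterization of the median is stated for a continuous distribution F.

```latex
\operatorname{sgn}(x) =
\begin{cases}
  1,  & x > 0, \\
  0,  & x = 0, \\
  -1, & x < 0,
\end{cases}
\qquad
\text{a median } m \text{ of } X \sim F \text{ satisfies }
\mathbb{E}\,\operatorname{sgn}(m - X) = 0 .
```

The condition says that the probability mass to the left of m balances the mass to its right.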

Spatial Median

The spatial sign function

The spatial median is
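
A standard multivariate analogue, again in assumed notation that may differ from the slide, replaces the sign by a unit vector:

```latex
S(x) =
\begin{cases}
  \dfrac{x}{\lVert x \rVert}, & x \neq 0, \\[1ex]
  0, & x = 0,
\end{cases}
\qquad
\text{a spatial median } m \text{ of } X \sim F \text{ satisfies }
\mathbb{E}\, S(m - X) = 0 .
```

In one dimension S reduces to the sign function, so the spatial median reduces to the ordinary median.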

Spatial Depth

Spatial Depth

Sample version

The expectation of the unit vector starting from x
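
A minimal sketch of the sample version follows. It assumes the common definition in which depth is one minus the norm of the averaged unit vectors pointing from x to the sample points; the function name `spatial_depth` and the choice to average over sample points distinct from x are assumptions, not taken from the slides.

```python
import numpy as np

def spatial_depth(x, X):
    """Sample spatial depth of x with respect to the rows of X.

    Assumed form: 1 - || mean of unit vectors from x to each sample point ||,
    where sample points equal to x contribute nothing and are not counted.
    """
    x = np.asarray(x, dtype=float)
    X = np.asarray(X, dtype=float)
    diffs = X - x                          # vectors starting from x
    norms = np.linalg.norm(diffs, axis=1)
    keep = norms > 0                       # skip points that coincide with x
    if not keep.any():
        return 1.0                         # degenerate case: every point equals x
    units = diffs[keep] / norms[keep, None]
    return 1.0 - np.linalg.norm(units.sum(axis=0)) / keep.sum()

# Toy sample: four points placed symmetrically around the origin.
X = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
print(spatial_depth([0.0, 0.0], X))    # unit vectors cancel: deepest point
print(spatial_depth([100.0, 0.0], X))  # unit vectors nearly parallel: depth near 0
```

A point in the middle of the data sees unit vectors in all directions, which cancel and give a depth near 1. A point far outside sees nearly parallel unit vectors, which give a depth near 0.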

Spatial Depth and Outlier Detection

outlier

Example: Half-Moon Data

FAR = 70%

Example: Ring Data

FAR = 100%

Kernelized Spatial Depth (KSD)

σ→∞, KSD converges to SD
σ→0, KSD → 0.293
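
The two limits can be checked numerically with a sketch like the one below. It assumes that KSD is obtained from the sample spatial depth by replacing inner products with kernel evaluations, and that the kernel is Gaussian of the form exp(-||a - b||^2 / σ^2). Both the exact formula and the function names are assumptions, since the definition on the slide is not reproduced here.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Assumed kernel: k(a, b) = exp(-||a - b||^2 / sigma^2)."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / sigma**2)

def kernelized_spatial_depth(x, X, sigma):
    """Spatial depth of x computed in the kernel-induced feature space."""
    x = np.asarray(x, dtype=float).reshape(1, -1)
    X = np.asarray(X, dtype=float)
    k_xX = gaussian_kernel(x, X, sigma)[0]   # k(x, y) for every sample y
    K = gaussian_kernel(X, X, sigma)         # k(y, z) for all sample pairs
    # Feature-space distance from x to each y; k(x, x) = k(y, y) = 1 here.
    delta = np.sqrt(np.maximum(2.0 - 2.0 * k_xX, 0.0))
    keep = delta > 0                         # skip points that coincide with x
    if not keep.any():
        return 1.0
    inv = np.zeros_like(delta)
    inv[keep] = 1.0 / delta[keep]
    # <phi(x) - phi(y), phi(x) - phi(z)> for all pairs (y, z).
    inner = 1.0 + K - k_xX[:, None] - k_xX[None, :]
    total = (inner * np.outer(inv, inv)).sum()
    return 1.0 - np.sqrt(max(total, 0.0)) / keep.sum()

# Toy data: a 10 x 10 integer grid and a query point between grid nodes.
grid = np.array([[i, j] for i in range(10) for j in range(10)], dtype=float)
print(kernelized_spatial_depth([0.5, 0.5], grid, sigma=100.0))  # large sigma
print(kernelized_spatial_depth([0.5, 0.5], grid, sigma=0.01))   # small sigma
```

Under these assumptions, a very small σ makes all distinct points nearly orthogonal in feature space. The depth then approaches 1 - 1/√2 ≈ 0.293 for large samples, which is consistent with the limit stated on the slide.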

Example: Half-Moon Data

[Figure: half-moon data; plot label 0.2495]

Example: Ring Data

[Figure: ring data; plot label 0.2651]

KSD Outlier Detector

outliers

normal observations

b is margin

How should we decide the threshold t?

Threshold Selection

Largest threshold such that upper bound on FAP ≤ desired level
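
The selection rule can be sketched as follows. The bounds from the talk are not reproduced here, so this sketch substitutes a generic uniform bound based on the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality, applied to depth values of held-out normal observations. It also assumes that an observation is flagged when its depth is at most t. The function names and the default confidence parameter `delta` are hypothetical.

```python
import math
import numpy as np

def dkw_margin(m, delta):
    """Uniform deviation of an empirical CDF from m samples (DKW inequality)."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * m))

def select_threshold(holdout_depths, level, delta=0.05):
    """Largest observed depth t whose upper bound on the FAP is <= level.

    Assumes a point is flagged as an outlier when its depth is <= t, and that
    `holdout_depths` are depths of normal observations not used for fitting.
    Returns None when no candidate threshold meets the level.
    """
    depths = np.asarray(holdout_depths, dtype=float)
    margin = dkw_margin(depths.size, delta)
    best = None
    for t in np.unique(depths):               # candidates in increasing order
        empirical_far = np.mean(depths <= t)  # fraction of normal points flagged
        if empirical_far + margin <= level:
            best = t
        else:
            break                             # the rate only grows with t
    return best

# Made-up held-out depths, evenly spread for illustration only.
depths = np.linspace(0.01, 1.0, 1000)
t = select_threshold(depths, level=0.10, delta=0.05)
print(t, np.mean(depths <= t))
```

Because the DKW margin holds for all thresholds simultaneously, choosing t after looking at the held-out depths does not invalidate the bound. A bound that holds only for a single fixed threshold would not support this kind of selection.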

Bounds on the False Alarm Probability

A training set bound

A test set bound

Empirical Study 1

10 species under the order Cypriniformes

989 specimens from Tulane University Museum of Natural History

Empirical Study 1

Masking Effect

Empirical Study 2

Discussions

KSD outlier detection and density based approaches

[Figure: two plots over the observations (x-axis from 0 to 12); y-axes: kernelized spatial depth and estimated probability density]

Acknowledgment

Kory P. Northrop, Tulane University
Huimin Chen, University of New Orleans
University of Mississippi
National Science Foundation