Daniel Keim

University of Konstanz

Visual Data Mining: Problems and Applications

Summer School of the Graduiertenkolleg Exploration and Visualization of Large Data Sets

St. Vigil, 16. September 2004

[Figure: map of research areas, topics, and application domains]

Research areas: Databases, Visual Data Exploration, Data Mining

Topics and systems: Multimedia Sim. Search, High-dim. Indexing, Interactive Queries, NN-Queries, Circle Seg., X-Tree, Tree Stripping, Cost Models, Denclue, OptiGrid, HD-Eye, InfoFusion, Rec. Pattern, Pixel Bar Charts, VisualPoints, VisDB, CartoDraw, GRADI, Image SimSearch, 2D Partial SimSearch, Geometric 3D SimSearch, VisOpt, Clustering, NN-Search in High-dim. DBs, Visual Interfaces

Application domains: Car Industry, Molecular Biology, Medicine, Image, 3D-Objects, CAD, Telcom, Marketing, Monitoring, Cartography, Fraud Detection, Finance

Daniel A. Keim: Visual Data Mining

Overview

1. Introduction

2. Visualization of Bibliographic Data
   1. InfoVis Contest
   2. DBLP (Trier Bibliographic Database)

3. Pixel Bar Charts
   1. Problem
   2. Solution
   3. Application

4. Other Problems and Applications
   1. SOMs of 3D Feature Vectors
   2. Route Visualization
   3. E-mail / SPAM Visualization

5. Conclusions

InfoVis Contest

Coauthors of G. Robertson

Coauthors of B. Shneiderman

Coauthors of D. Keim

DBLP – Trier Bibliographic Database

All Profs

Keim, Saupe, Reiterer, Berthold, Scholl, Waldvogel, Deussen, Kuhlen, Leue, Brandes

Prof. U. Brandes

Prof. M. Berthold

Prof. R. Kuhlen

Prof. O. Deussen

Prof. D. Keim

Prof. S. Leue

Prof. H. Reiterer

Prof. D. Saupe

Prof. M. Scholl

Prof. M. Waldvogel

All Profs

Keim, Saupe, Reiterer, Berthold, Scholl, Waldvogel, Deussen, Kuhlen, Leue, Brandes

Overview

1. Introduction

2. Visualization of Bibliographic Data
   1. InfoVis Contest
   2. DBLP (Trier)

3. Pixel Bar Charts
   1. Problem
   2. Solution
   3. Application

4. Other Problems and Applications
   1. SOMs of 3D Feature Vectors
   2. Route Visualization
   3. E-mail / SPAM Visualization

5. Conclusions

From Bar Charts to Pixel Bar Charts

Equal-Width Pixel Bar Chart

From Bar Charts to Pixel Bar Charts

Equal-Height Pixel Bar Chart

[Figure: two charts with Product Type on the x-axis.]

From Bar Charts to Pixel Bar Charts

Multi Pixel Bar Charts

[Figure: three pixel bar charts with Product Type on the x-axis and $ Amount on the y-axis, colored by Color = dollar amount, Color = number of visits, and Color = quantity; color scale from low to high.]

Definition of Pixel Bar Charts

A pixel bar chart is defined by a five-tuple

<Dx, Dy, Ox, Oy, C>

where Dx, Dy, Ox, Oy, C ∈ {A1, …, Ak} and

• Dx / Dy are the dividing attributes on the x-/y-axes

• Ox / Oy are the ordering attributes on the x-/y-axes

• C defines the coloring attribute(s)
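
As a minimal sketch of how the five-tuple could be represented in code, the following Python fragment stores the five attribute names and groups records into bars by their dividing attributes. The class name `PixelBarChartSpec`, the function `split_into_bars`, the sample records, and the use of `None` for an unused dividing attribute are all assumptions made for illustration; they do not come from the slides.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PixelBarChartSpec:
    """The five-tuple <Dx, Dy, Ox, Oy, C>; every field names an attribute."""
    dx: Optional[str]  # dividing attribute on the x-axis (None: no split, an assumption)
    dy: Optional[str]  # dividing attribute on the y-axis (None: no split, an assumption)
    ox: str            # ordering attribute on the x-axis
    oy: str            # ordering attribute on the y-axis
    c: str             # coloring attribute


def split_into_bars(records, spec):
    """Group records into bars keyed by their (Dx, Dy) values."""
    bars = defaultdict(list)
    for rec in records:
        key = (
            rec[spec.dx] if spec.dx is not None else None,
            rec[spec.dy] if spec.dy is not None else None,
        )
        bars[key].append(rec)
    return dict(bars)


# Hypothetical sample data, only to exercise the sketch.
records = [
    {"product_type": 1, "region": 3, "dollar_amount": 120.0, "visits": 4},
    {"product_type": 1, "region": 1, "dollar_amount": 15.5, "visits": 1},
    {"product_type": 2, "region": 3, "dollar_amount": 980.0, "visits": 9},
]
spec = PixelBarChartSpec(
    dx="product_type", dy=None, ox="dollar_amount", oy="region", c="visits"
)
bars = split_into_bars(records, spec)
```

Each key of `bars` identifies one bar; the ordering attributes and the coloring attribute then decide where each record's pixel goes inside that bar and which color it gets.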

Definition of Pixel Bar Charts

Basic Idea of Pixel Bar Charts:

Dividing Attribute on x-Axis (e.g. Product Type): Dx

Definition of Pixel Bar Charts

Basic Idea of Pixel Bar Charts:

Dividing Attribute on y-Axis (e.g. Region): Dy

[Figure labels: Dx, Dy]

Definition of Pixel Bar Charts

Basic Idea of Pixel Bar Charts:

Ordering Attributes on x- and y-Axes: Ox, Oy

[Figure labels: Ox, Oy, C]

Definition of Pixel Bar Charts

Basic Idea of Multi Pixel Bar Charts:

Multi-Pixel Bar Charts

[Figure: four pixel bar charts in a 2 × 2 arrangement, labeled with the coloring attributes C1, C2, C3, C4.]

Definition of Pixel Bar Charts

Formalization of the Pixel Positioning Problem:

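1. Dense Display Constraint

\[ \forall\, i = 1..w,\; \forall\, j = 1..\lfloor p/w \rfloor:\; \exists\, d_k \text{ with } f(d_k) = (i, j) \]

(equal-width Pixel Bar Chart)

\[ \forall\, i = 1..\lfloor p/h \rfloor,\; \forall\, j = 1..h:\; \exists\, d_k \text{ with } f(d_k) = (i, j) \]

(equal-height Pixel Bar Chart)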
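
The dense display constraint says that every pixel of the complete rows (equal-width case) or complete columns (equal-height case) of a bar must be occupied by some data item. A small Python check along these lines could look as follows. The function names, the representation of the placement f as a dictionary from item id to 1-based (i, j) coordinates, and the sample data are assumptions of this sketch.

```python
def is_dense_equal_width(placement, p, w):
    """True if every pixel (i, j), i = 1..w, j = 1..floor(p / w), is occupied.

    placement maps an item id to its 1-based pixel position (i, j);
    p is the number of items in the bar and w the fixed bar width.
    """
    occupied = set(placement.values())
    full_rows = p // w
    return all(
        (i, j) in occupied
        for i in range(1, w + 1)
        for j in range(1, full_rows + 1)
    )


def is_dense_equal_height(placement, p, h):
    """True if every pixel (i, j), i = 1..floor(p / h), j = 1..h, is occupied."""
    occupied = set(placement.values())
    full_cols = p // h
    return all(
        (i, j) in occupied
        for i in range(1, full_cols + 1)
        for j in range(1, h + 1)
    )


# Hypothetical bar with p = 7 items and width w = 3: two full rows plus one pixel.
positions = [(1, 1), (2, 1), (3, 1), (1, 2), (2, 2), (3, 2), (1, 3)]
placement = {f"d{k}": pos for k, pos in enumerate(positions, start=1)}
with_gap = dict(placement, d5=(2, 3))  # moving d5 leaves pixel (2, 2) empty
```

Only the complete rows or columns are checked; the last, partially filled row or column is left unconstrained, which matches the floor in the bound.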

Definition of Pixel Bar Charts

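2. No-Overlap Constraint

\[ \forall\, d_i, d_j \in DB:\; i \neq j \;\Rightarrow\; f(d_i) \neq f(d_j) \]

3. Locality Constraint

\[ \sum_{x=1}^{w} \sum_{y=1}^{h-1} sim\big(f^{-1}(x,y),\, f^{-1}(x,y+1)\big) \;+\; \sum_{x=1}^{w-1} \sum_{y=1}^{h} sim\big(f^{-1}(x,y),\, f^{-1}(x+1,y)\big) \;\rightarrow\; \min \]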
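
The no-overlap constraint requires that two different data items never share a pixel, and the locality constraint asks for a placement that minimizes the summed sim value over all pairs of vertically and horizontally adjacent pixels. The following Python sketch evaluates both for a given placement. The function names and the sample grids are assumptions, and because the slides do not define sim, the absolute difference of two numbers is used here purely as a stand-in.

```python
def has_no_overlap(placement):
    """True if no two items are mapped to the same pixel position."""
    positions = list(placement.values())
    return len(positions) == len(set(positions))


def locality_cost(grid, sim):
    """Sum sim over all pairs of vertically and horizontally adjacent pixels.

    grid maps a pixel position (x, y) to the item placed there, i.e. it
    plays the role of the inverse placement function. Only neighbor
    pairs that lie inside the grid contribute to the sum.
    """
    total = 0.0
    for (x, y), item in grid.items():
        for neighbor in ((x, y + 1), (x + 1, y)):
            if neighbor in grid:
                total += sim(item, grid[neighbor])
    return total


def abs_diff(a, b):
    """Stand-in for sim (an assumption): distance between two numeric items."""
    return abs(a - b)


# Hypothetical 2 x 2 bars holding the values 1..4 in two different arrangements.
ordered = {(1, 1): 1.0, (2, 1): 2.0, (1, 2): 3.0, (2, 2): 4.0}
shuffled = {(1, 1): 1.0, (2, 1): 4.0, (1, 2): 3.0, (2, 2): 2.0}
cost_ordered = locality_cost(ordered, abs_diff)
cost_shuffled = locality_cost(shuffled, abs_diff)
```

With this stand-in, the arrangement that keeps similar values next to each other yields the lower cost, which is the behavior the locality constraint rewards.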

Definition of Pixel Bar Charts

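4. Ordering Constraint

\[ \forall\, i, j \in 1..n:\; d_i.a_1 > d_j.a_1 \;\Rightarrow\; f(d_i).x > f(d_j).x \]

\[ \forall\, i, j \in 1..n:\; d_i.a_2 > d_j.a_2 \;\Rightarrow\; f(d_i).y > f(d_j).y \]

[Equation not fully recoverable: a sum over all pixels x = 1..w, y = 1..h of terms built from the differences f^{-1}(x,y).a_k − f^{-1}(x,y+1).a_k and f^{-1}(x,y).a_k − f^{-1}(x+1,y).a_k for the ordering attributes a_1 and a_2, to be minimized (→ min).]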
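
Read as strict implications, the ordering constraint says that an item with a larger a_1 value must lie further right and an item with a larger a_2 value must lie further up. The Python sketch below counts how many of these implications a placement violates. The function name, the dictionary representation, and the sample data are assumptions made for illustration.

```python
from itertools import permutations


def ordering_violations(items, placement):
    """Count violated implications of the strict ordering constraint.

    items maps an item id to its ordering attribute values (a1, a2);
    placement maps an item id to its pixel position (x, y).
    """
    violations = 0
    for i, j in permutations(items, 2):  # all ordered pairs (i, j), i != j
        a1_i, a2_i = items[i]
        a1_j, a2_j = items[j]
        x_i, y_i = placement[i]
        x_j, y_j = placement[j]
        if a1_i > a1_j and not x_i > x_j:
            violations += 1
        if a2_i > a2_j and not y_i > y_j:
            violations += 1
    return violations


# Hypothetical data: three items with three distinct a1 values.
items = {"a": (1, 1), "b": (2, 2), "c": (3, 1)}
wide = {"a": (1, 1), "b": (2, 2), "c": (3, 1)}    # bar of width 3
narrow = {"a": (1, 1), "b": (2, 1), "c": (2, 2)}  # bar of width 2
```

In the narrow bar there are fewer columns than distinct a_1 values, so some implication has to fail whatever the placement. This is one way to see why the strict constraint is accompanied by an objective to be minimized.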

Pixel Placement Algorithm

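1. Step:

Determine the \(\left[\tfrac{1}{w}, \dots, \tfrac{w-1}{w}\right]\)-quantiles of Ox and the \(\left[\tfrac{1}{h}, \dots, \tfrac{h-1}{h}\right]\)-quantiles of Oy.

[Figure: a bar of width w and height h, divided into the partitions X1 . . . Xw along the x-axis and Y1 . . . Yh along the y-axis.]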
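
A simple way to obtain such partitions is to sort the items by the ordering attribute and cut the sorted list into equally sized groups. The Python sketch below does this with a rank-based split, used here as a stand-in for an explicit quantile computation. The function name `quantile_partitions` and the sample records are assumptions of this sketch.

```python
def quantile_partitions(items, key, parts):
    """Split items into `parts` groups of (nearly) equal size, ordered by `key`.

    Group k (0-based) holds the items whose rank lies between the
    k/parts and (k+1)/parts quantiles of `key`.
    """
    ordered = sorted(items, key=key)
    n = len(ordered)
    return [ordered[k * n // parts:(k + 1) * n // parts] for k in range(parts)]


# Hypothetical records with an x-ordering value `ox` and a y-ordering value `oy`.
records = [{"id": k, "ox": (k * 7) % 10, "oy": k} for k in range(10)]
w, h = 3, 2
X = quantile_partitions(records, key=lambda r: r["ox"], parts=w)  # X1 .. Xw
Y = quantile_partitions(records, key=lambda r: r["oy"], parts=h)  # Y1 .. Yh
```

Every record ends up in exactly one X group and exactly one Y group, so the pair of group indices gives a first estimate of the region of the bar in which its pixel should be placed.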

Pixel Placement Algorithm

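2. Step: Position bottom left corner pixel

\[ f^{-1}(1,1) = \Big\{ d_s \;\Big|\; \min_{d_s \in X_1} \{ d_s.a_2 \} \Big\} \quad \text{or} \quad f^{-1}(1,1) = \Big\{ d_s \;\Big|\; \min_{d_s \in Y_1} \{ d_s.a_1 \} \Big\} \]

Position left and bottom pixels

\[ f^{-1}(i,1) = \Big\{ d_s \;\Big|\; \min_{d_s \in X_i} \{ d_s.a_2 \} \Big\} \quad \forall\, i = 1..w \]

\[ f^{-1}(1,j) = \Big\{ d_s \;\Big|\; \min_{d_s \in Y_j} \{ d_s.a_1 \} \Big\} \quad \forall\, j = 1..h \]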

Pixel Placement Algorithm

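3. Step: Position remaining pixels

\[ f^{-1}(i,j) = \Big\{ d_s \;\Big|\; \min_{d_s \in X_i \cap Y_j} \{ sum(d_s.a_1, d_s.a_2) \} \Big\} \quad \text{if } X_i \cap Y_j \neq \emptyset \]

If \(X_i \cap Y_j = \emptyset\), consider

\[ d_s \in (X_i \cup X_{i+1}) \cap Y_j \]

\[ d_s \in (X_i \cup X_{i+1}) \cap (Y_j \cup Y_{j+1}) \]

. . .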
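
The placement rule for the remaining pixels can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the algorithm as implemented by the author: it applies the rule of this step to every pixel and skips the special treatment of the corner, left, and bottom pixels; it assumes exactly w * h items; it builds the partitions by a rank-based split; and the way the search window keeps widening beyond the two cases shown, as well as the final fallback to any unplaced item, are guesses. All function names are hypothetical.

```python
def window_sizes(limit):
    """Yield how far the window is widened in x and y: (0,0), (1,0), (1,1), (2,1), ..."""
    yield 0, 0
    for g in range(1, limit + 1):
        yield g, g - 1
        yield g, g


def place_pixels(items, w, h):
    """Greedy placement sketch. items: id -> (a1, a2). Returns (i, j) -> id, 1-based."""
    if len(items) != w * h:
        raise ValueError("this sketch expects exactly w * h items")
    n = len(items)
    by_a1 = sorted(items, key=lambda k: items[k][0])
    by_a2 = sorted(items, key=lambda k: items[k][1])
    # Rank-based partitions X_1..X_w and Y_1..Y_h, stored as sets of item ids.
    X = [set(by_a1[k * n // w:(k + 1) * n // w]) for k in range(w)]
    Y = [set(by_a2[k * n // h:(k + 1) * n // h]) for k in range(h)]
    unplaced = set(items)
    grid = {}
    for j in range(h):
        for i in range(w):
            candidates = set()
            for gx, gy in window_sizes(max(w, h)):
                xs = set().union(*X[i:i + gx + 1])
                ys = set().union(*Y[j:j + gy + 1])
                candidates = xs & ys & unplaced
                if candidates:
                    break
            if not candidates:
                candidates = set(unplaced)  # fallback, an assumption of this sketch
            # Pick the candidate with the smallest a1 + a2; ties broken by id.
            best = min(candidates, key=lambda k: (items[k][0] + items[k][1], k))
            grid[(i + 1, j + 1)] = best
            unplaced.remove(best)
    return grid


# Hypothetical 3 x 2 bar whose items already carry their ideal coordinates.
items = {}
for y in range(1, 3):
    for x in range(1, 4):
        items[len(items)] = (x, y)
grid = place_pixels(items, w=3, h=2)
```

In this toy input each intersection X_i ∩ Y_j contains exactly one item, so the sketch places every item at the position given by its own attribute values. On less regular data the widening window decides which nearby item fills a pixel whose intersection is empty.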

Application

Multi-Pixel Bar Chart for Mining 106,199 Customer Buying Transactions
(Dx = product type, Dy = ⊥, Ox = dollar amount, Oy = region, C)

[Figure: four pixel bar charts with Product Type on the x-axis, colored by C = Dollar Amount, C = No. of Visits, C = Quantity, and C = Region; color scale from low to high, plus a legend with entries 1 to 10.]

Application

Multi-Pixel Bar Chart for Mining 405,000 Sales Transaction Records
(Dx = product type, Dy = ⊥, Ox = no. of visits, Oy = dollar amount, C)

[Figure: three pixel bar charts with Product Type on the x-axis, colored by C = Dollar Amount, C = No. of Visits, and C = Quantity; color scale from low to high. Customer A is marked in each chart: $345,000, 25 visits, 500 items.]

Application

Pixel Bar Charts for Mining over 150,000 E-Customer Purchasing Activities (in 1999)
(Dx = month, Dy = ⊥, Ox = no. of visits, Oy = purchase amount, C)

[Figure: four pixel bar charts with the months 1 to 12 on the x-axis, colored by C = Month, C = Purchase Amount, C = # of Visits, and C = Quantity; color scale from low to high. Annotations: highest number of customers; top dollar amount customers; high visits; high quantity. Customer A is marked in each chart: in September, purchase of $1,500,000, 125 visits, purchase of 200 items.]

Application

Pixel Bar Charts for Mining 106,199 IT Resource Center Customer Activities
IT Resource Center – Search http://itrc.hp.com
(Dx = Search Criteria, Dy = ⊥, Ox = No. of Keywords, Oy = Search Type, C)

[Figure: three pixel bar charts with Search Criteria (1 to 4) on the x-axis and Type on the y-axis, colored by C = Search Criteria, C = Search Type, and C = No. of Keywords; color scale from low to high. Annotations: fix/solve problem; patch search; search criteria = boolean; type = product search.]

Application

Parallel Coordinate Visualization of 150,000 E-Customer Purchasing Activities

[Figure: two parallel coordinate plots over the axes Month, Purchase Amount, Quantity, # of Orders, Region, and State Code; one with a focus on one Month (June), one with a focus on one Purchase Amount (Medium Price).]

Overview

1. Introduction

2. Visualization of Bibliographic Data
   1. InfoVis Contest
   2. DBLP (Trier)

3. Pixel Bar Charts
   1. Problem
   2. Solution
   3. Application

4. Other Problems and Applications
   1. SOMs of 3D Feature Vectors
   2. Route Visualization
   3. E-mail / SPAM Visualization

5. Conclusions

SOMs of 3D Feature Vectors

3D-Retrieval on CAD Database

• Upper Rows: Combination of 4 Feature Vectors
• Lower Rows: Best Single FV

3D-Retrieval on Internet Database

• Upper Row: Depth Buffer
• Middle Row: Silhouette
• Lower Row: Linear Combination of Depth Buffer and Silhouette

3D-Retrieval on Internet Database

Clustering of CAD Database

Clustering of Internet Database

Route Visualization

Route Visualization

Route Visualization

E-mail / SPAM Visualization

E-Mail Visualization

Powerwall

The End

Department of Computer and Information Science (Fachbereich Informatik und Informationswissenschaften)

Research groups:

• Databases & Information Systems
• Algorithms & Data Structures
• Information Science
• Multimedia Signal Processing
• Human-Computer Interaction
• Databases & Visualization
• Computer Graphics & Media Informatics
• Bioinformatics & Information Mining
• Software Engineering
• Computer Networks

Professors:

• Prof. Dr. D. A. Keim
• Prof. Dr. M. Scholl
• Prof. Dr. O. Deussen
• Prof. Dr. R. Kuhlen
• Prof. Dr. U. Brandes
• Prof. Dr. D. Saupe
• Prof. Dr. H. Reiterer
• Prof. Dr. M. R. Berthold
• Prof. Dr. S. Leue
• Prof. Dr. M. Waldvogel