Dynamic Visualization of Transient Data Streams

Dynamic Visualization of Transient Data Streams. P. Wong, et al., The Pacific Northwest National Laboratory. Presented by John Sharko. Visualization of Massive Datasets.


Transcript of Dynamic Visualization of Transient Data Streams

Page 1: Dynamic Visualization of Transient Data Streams

Dynamic Visualization of Transient Data Streams

P. Wong, et al.

The Pacific Northwest National Laboratory

Presented by John Sharko

Visualization of Massive Datasets

Page 2: Dynamic Visualization of Transient Data Streams

Characteristics of Data Streams

• Arrives continuously

• Arrives unpredictably

• Arrives unboundedly

• Arrives without persistent patterns

Page 3: Dynamic Visualization of Transient Data Streams

Examples of Data Streams

• Newswires

• Internet click streams

• Network resource management

• Phone call records

• Remote sensing imagery

Page 4: Dynamic Visualization of Transient Data Streams

Visualization Problem

• Fusing a large amount of previously analyzed information with a small amount of new information

• Reprocessing the whole dataset in full detail

Page 5: Dynamic Visualization of Transient Data Streams

First Objective

• Achieve the best understanding of transient data when the influx rate exceeds the processing rate

Approach: Data stratification to reduce data size

Page 6: Dynamic Visualization of Transient Data Streams

Second Objective

• Incremental visualization technique

Approach: Project new information incrementally onto previous data

Page 7: Dynamic Visualization of Transient Data Streams

Primary Visualization Output: Multidimensional Scaling

OJ Simpson trial

French elections

Oklahoma bombing

Page 8: Dynamic Visualization of Transient Data Streams

Adaptive Visualization Using Stratification

Page 9: Dynamic Visualization of Transient Data Streams

Methods for Adaptive Visualization

• Vector dimension reduction

• Vector sampling

Page 10: Dynamic Visualization of Transient Data Streams

Vector Dimension Reduction

Approach: dyadic wavelets (Haar)

200 terms

100 terms

50 terms
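The halving from 200 to 100 to 50 terms can be sketched with one level of the Haar transform per step. This is a minimal illustration, assuming that only the approximation (pairwise average) coefficients are kept and that the orthonormal 1/sqrt(2) scaling is used; the slide does not state either detail. The function names and the sample vector are made up for the example.

```python
import math


def haar_approximation(vector):
    """One Haar level: keep the scaled pairwise sums, halving the length."""
    if len(vector) % 2:
        raise ValueError("vector length must be even")
    scale = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling (assumption)
    return [(vector[i] + vector[i + 1]) * scale
            for i in range(0, len(vector), 2)]


def reduce_dimensions(vector, levels):
    """Apply the Haar approximation step `levels` times."""
    for _ in range(levels):
        vector = haar_approximation(vector)
    return vector


# Made-up 200-term document vector, used only to show the sizes.
doc = [float(i % 7) for i in range(200)]
half = reduce_dimensions(doc, 1)     # 100 terms
quarter = reduce_dimensions(doc, 2)  # 50 terms
```

Each level costs one pass over the vector, so the reduction is cheap compared with recomputing the layout.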

Page 11: Dynamic Visualization of Transient Data Streams

Results of Vector Dimension Reduction

Dimensions: 200, 100, 50

Page 12: Dynamic Visualization of Transient Data Streams

Results of Vector Sampling

Number of Documents: 3298, 1649, 824
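The document counts above (3298, 1649, 824) correspond to halving the set twice. A minimal sketch of that stratification follows, assuming uniform random sampling without replacement; the slide does not say which sampling scheme was used, and `sample_half` and the placeholder vectors are hypothetical.

```python
import random


def sample_half(vectors, rng):
    """Keep a random half of the vectors, preserving their original order."""
    keep = sorted(rng.sample(range(len(vectors)), len(vectors) // 2))
    return [vectors[i] for i in keep]


rng = random.Random(0)                     # fixed seed so the run is repeatable
docs = [[float(i)] for i in range(3298)]   # placeholder document vectors
half = sample_half(docs, rng)              # 1649 documents
quarter = sample_half(half, rng)           # 824 documents
```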

Page 13: Dynamic Visualization of Transient Data Streams

Scatterplot Similarity Matching

Page 14: Dynamic Visualization of Transient Data Streams

Scatterplot Similarity Matching

Procrustes Analysis Results

Documents   200 dimensions   100 dimensions   50 dimensions
All         0.0 (self)       0.022            0.084
1/2         0.016            0.051            0.111
1/4         0.033            0.062            0.141
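To make the error measure concrete, here is a small sketch of a Procrustes disparity between two 2-D scatterplots with matched points. It assumes the common definition: center both point sets, scale each to unit norm, then find the best rotation or reflection and scaling, and report the remaining sum of squared differences. Whether the slide's numbers used exactly this variant is not stated, so the sketch is not expected to reproduce them. For 2-D data the sum of the two singular values has a closed form, which avoids a matrix library.

```python
import math


def standardize(points):
    """Center a 2-D point set and scale it to unit Frobenius norm."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    norm = math.sqrt(sum(x * x + y * y for x, y in centered))
    if norm == 0.0:
        raise ValueError("points must not all coincide")
    return [(x / norm, y / norm) for x, y in centered]


def procrustes_disparity(a, b):
    """0.0 means identical shape; larger values mean a worse match."""
    a, b = standardize(a), standardize(b)
    # Entries of the 2x2 matrix M = A^T B.
    m00 = sum(p[0] * q[0] for p, q in zip(a, b))
    m01 = sum(p[0] * q[1] for p, q in zip(a, b))
    m10 = sum(p[1] * q[0] for p, q in zip(a, b))
    m11 = sum(p[1] * q[1] for p, q in zip(a, b))
    frob = m00 ** 2 + m01 ** 2 + m10 ** 2 + m11 ** 2
    det = m00 * m11 - m01 * m10
    # For a 2x2 matrix, (s1 + s2)^2 = ||M||_F^2 + 2 * |det M|.
    return max(0.0, 1.0 - (frob + 2.0 * abs(det)))


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved = [(5, 5), (5, 8), (2, 8), (2, 5)]      # square rotated, scaled, shifted
rectangle = [(0, 0), (2, 0), (2, 1), (0, 1)]  # a genuinely different shape
same_shape = procrustes_disparity(square, moved)
different_shape = procrustes_disparity(square, rectangle)
```

Because translation, scale, and rotation are factored out, `same_shape` is zero up to rounding while `different_shape` is clearly positive.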

Page 15: Dynamic Visualization of Transient Data Streams

Incremental Visualization Using Fusion

• Reprocessing by projecting new items onto existing visualization

• Feature: reprocessing the entire dataset is often not required
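The projection idea can be sketched as follows: keep the mean and the axes from the last full analysis, and place each new item by centering it and taking dot products with those axes. This assumes an eigenvector-based layout (classical MDS on Euclidean distances, which is equivalent to principal component analysis). The `mean` and `axes` values below are hypothetical stand-ins for a stored result.

```python
def project(vector, mean, axes):
    """Place one vector in the existing layout without recomputing it."""
    centered = [v - m for v, m in zip(vector, mean)]
    return [sum(c * a for c, a in zip(centered, axis)) for axis in axes]


# Hypothetical stored results of an earlier full analysis.
mean = [1.0, 2.0, 3.0]
axes = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]   # two orthonormal directions

layout = [project(v, mean, axes) for v in ([1.0, 2.0, 3.0], [2.0, 2.0, 3.0])]

# A new item arrives: one projection, no pass over the old data.
new_point = project([1.0, 3.0, 3.0], mean, axes)
layout.append(new_point)
```

The cost per new item is proportional to the vector length times the number of axes, independent of how many items are already displayed.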

Page 16: Dynamic Visualization of Transient Data Streams

Hyperspectral Image Processing

• Apply MDS to scale pixel vectors

• K-means process to assign unique colors

• Stratify the vectors progressively
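The color assignment step can be illustrated with a small K-means run over 2-D scatterplot positions, one position per pixel. This is a sketch under assumptions: it uses plain Lloyd iterations with a fixed, evenly spaced choice of starting centers, and the points and palette are invented for the example.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; returns one cluster label per point."""
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]  # deterministic start (assumption)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


# Made-up scatterplot positions for six pixels.
positions = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
             (5.0, 5.1), (5.2, 5.0), (5.1, 5.2)]
palette = [(255, 0, 0), (0, 0, 255)]            # one RGB color per cluster
labels = kmeans(positions, k=2)
pixel_colors = [palette[lab] for lab in labels]
```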

Page 17: Dynamic Visualization of Transient Data Streams

Robust Eigenvectors

Generate three MDS scatter plots for each third of the image

Page 18: Dynamic Visualization of Transient Data Streams

Robust Eigenvectors (cont’d)

Generate MDS scatterplot for entire dataset

Page 19: Dynamic Visualization of Transient Data Streams

Robust Eigenvectors (cont’d)

Extract points from cropped areas

Page 20: Dynamic Visualization of Transient Data Streams

Using Multiple Sliding Windows

Eigenvectors are determined by the long window

New vectors are projected using the eigenvectors of the long window

Diagram labels: Data Stream, Long Window, Short Window, Sliding Direction
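A rough sketch of the two-window scheme: the long window supplies the eigenvectors (the expensive step), and vectors in the short window are only projected onto them (the cheap step). Several things here are assumptions for illustration: the eigenvectors are taken from the covariance matrix of the long window, they are found by power iteration with deflation, the long window is kept fixed while the short window slides, and the stream and window sizes are invented.

```python
import math


def fit_window(vectors, k=2, iterations=200):
    """Return (mean, top-k covariance eigenvectors) for a window of vectors."""
    d = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(d)]
    centered = [[x - m for x, m in zip(v, mean)] for v in vectors]
    cov = [[sum(c[i] * c[j] for c in centered) / len(vectors)
            for j in range(d)] for i in range(d)]
    axes = []
    for n in range(k):
        vec = [1.0 / (i + n + 1) for i in range(d)]  # fixed, arbitrary start
        for _ in range(iterations):
            vec = [sum(row[j] * vec[j] for j in range(d)) for row in cov]
            for axis in axes:  # deflation: stay orthogonal to earlier axes
                dot = sum(x * y for x, y in zip(vec, axis))
                vec = [x - dot * y for x, y in zip(vec, axis)]
            norm = math.sqrt(sum(x * x for x in vec))
            vec = [x / norm for x in vec]
        axes.append(vec)
    return mean, axes


def project(vector, mean, axes):
    """Project one vector onto the stored axes."""
    centered = [v - m for v, m in zip(vector, mean)]
    return [sum(c * a for c, a in zip(centered, axis)) for axis in axes]


# Made-up stream of 3-D vectors; most variance lies along the first coordinate.
stream = [(float(i), float(i % 2) * 2.0, 0.0) for i in range(30)]
LONG, SHORT = 20, 5

mean, axes = fit_window(stream[:LONG])   # expensive: done once per long window
layouts = []
for start in range(LONG, len(stream) - SHORT + 1, SHORT):
    short_window = stream[start:start + SHORT]   # newest vectors
    layouts.append([project(v, mean, axes) for v in short_window])
```

Refitting the long window at every step would give the same result as a full recomputation; holding it fixed trades some accuracy for speed, which is why an error check is needed afterwards.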

Page 21: Dynamic Visualization of Transient Data Streams

Dynamic Visualization Steps

1. When influx rate < processing rate, use MDS

2. When influx rate > processing rate, halt MDS

3. Use multiple sliding windows for a predefined number of steps

4. Use stratification approach for fast overview

5. Check for accumulated error using Procrustes analysis

6. If error threshold not reached, go to step 3; if error threshold reached, go to step 1
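The control flow of these steps can be sketched as a small loop that records which action is taken. Everything named here is hypothetical: `measure_error` stands in for the Procrustes check, the action strings stand in for the real computations, and the case where the two rates are equal, which the slide does not cover, is treated as overload.

```python
def dynamic_visualization(influx_rate, processing_rate, measure_error,
                          error_threshold, window_steps, max_rounds=100):
    """Return the sequence of actions taken for one burst of data."""
    log = []
    if influx_rate < processing_rate:
        log.append("full MDS")                          # step 1
        return log
    log.append("halt MDS")                              # step 2
    for _ in range(max_rounds):                         # safety bound (assumption)
        log.extend(["sliding-window projection"] * window_steps)  # step 3
        log.append("stratified overview")               # step 4
        error = measure_error()                         # step 5
        if error >= error_threshold:                    # step 6
            log.append("full MDS")                      # back to step 1
            break
    return log


# Made-up accumulated errors: the threshold is crossed on the third check.
errors = iter([0.01, 0.02, 0.2])
overloaded = dynamic_visualization(10, 5, lambda: next(errors),
                                   error_threshold=0.1, window_steps=2)
keeping_up = dynamic_visualization(1, 5, lambda: 0.0,
                                   error_threshold=0.1, window_steps=2)
```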

Page 22: Dynamic Visualization of Transient Data Streams

Conclusions

• The data stratification approach can substantially accelerate the visualization process

• The data fusion approach can provide instant updates

Page 23: Dynamic Visualization of Transient Data Streams