Tree-Based Density Clustering using Graphics Processors

25
Paradyn Project Paradyn / Dyninst Week College Park, Maryland March 26-28, 2012 Tree-Based Density Clustering using Graphics Processors Evan Samanas and Ben Welton A First Marriage of MRNet and GPUs

description

Tree-Based Density Clustering using Graphics Processors. A First Marriage of MRNet and GPUs. Evan Samanas and Ben Welton. The Tweet Stream. v. v. `. `. Source: Twitter, Map: About.com. FE. CP. CP. CP. CP. CP. CP. …. …. …. BE. BE. BE. BE. app. app. app. app. app. - PowerPoint PPT Presentation

Transcript of Tree-Based Density Clustering using Graphics Processors

Page 1: Tree-Based Density Clustering using Graphics Processors

Paradyn Project

Paradyn / Dyninst WeekCollege Park, Maryland

March 26-28, 2012

Tree-Based Density Clustering using Graphics Processors

Evan Samanas and Ben WeltonA First Marriage of MRNet and GPUs

Page 2: Tree-Based Density Clustering using Graphics Processors

The Tweet Stream

2Tree-Based Density Clustering using Graphics Processors

`

`

vv

Source: Twitter, Map: About.com

Page 3: Tree-Based Density Clustering using Graphics Processors

3Tree-Based Density Clustering using Graphics Processors

Tree-Based Overlay Networks (TBONs)

o Scalable multicast

o Scalable gather

o Scalable data aggregation

FE

…… …BE

appappappapp

BE

appappappapp

BE

appappappapp

BE

appappappapp

CP CP

CP CP CP CP

Page 4: Tree-Based Density Clustering using Graphics Processors

4Tree-Based Density Clustering using Graphics Processors

MRNet – Multicast / Reduction Networko General-purpose

TBON APIo Network: user-defined

topologyo Stream: logical data channel

o to a set of back-endso multicast, gather, and custom

reductiono Packet: collection of datao Filter: stream data operator

o synchronizationo transformation

o Widely adopted by HPC toolso CEPBA toolkit o Cray ATP & CCDBo Open|SpeedShop & CBTFo STATo TAU

FE

…… …BE

appappappapp

BE

appappappapp

BE

appappappapp

BE

appappappapp

CP CP

CP CP CP CP

F(x1,…,xn)

Page 5: Tree-Based Density Clustering using Graphics Processors

TBON Computation

5Tree-Based Density Clustering using Graphics Processors

FE

BE

appappappapp

BE

appappappapp

BE

appappappapp

CP CP

BE

appappappapp

Ideal Characteristics:o Filter output size constant or decreasing

o Computation rate similar across levels

o Adjustable for load balance

Data Size: 10MB per BE

Packet Size: ≤10 MB

Packet Size:≤10 MB

~10 sec

~40 sec

…4x

~10 sec

~10 sec

~10 sec

…Data to

process: e.g. 40 MBTotal Time: ~30

secTotal Time: ~60

sec

Page 6: Tree-Based Density Clustering using Graphics Processors

Why GPUs?

6Tree-Based Density Clustering using Graphics Processors

FE

…… …BE

appappappapp

BE

appappappapp

BE

appappappapp

CP CP

CP CP CP CP

BE

appappappapp

o Natural fito Increase compute powero Trade computation for bandwidtho Derived summarieso Compute and send ∆ data

o Compression algorithms (e.g. LZO, zlib, etc.)

F(x1,…,xn)

Page 7: Tree-Based Density Clustering using Graphics Processors

Goal: Find regions that meet minimum density and spatial

distance characteristics

The two parameters that determine if a point is in a cluster

is Epsilon (Eps), and MinPtsIf the number of points in Eps is > MinPts, the point is a core point.For every discovered point, this

same calculation is performed until the cluster is fully expanded

Clustering Example (DBSCAN[1])

7Tree-Based Density Clustering using Graphics Processors

EpsMin Points

Min Pts: 3

[1] M. Ester et. al., A density-based algorithm for discovering clusters in large spatial databases with noise, (1996)

Page 8: Tree-Based Density Clustering using Graphics Processors

Scaling DBSCANo PDBSCAN[2]

oQuality equivalent to single DBSCANo Linear speedup up to 8 nodes

o DBDC[3]

o Sacrifices qualityo~30x speedup on 15 nodes

o CUDA-Dclust[4]

oQuality equivalent to DBSCANo~15x faster on 1 node

8Tree-Based Density Clustering using Graphics Processors

[2] X. Xu et. al., A fast Parallel Clustering Algorithm for Large Spatial Databases (1999)[3] E. Januzaj et. al., DBDC: Density Based Distributed Clustering (2004)[4] C. Bohm et al., Density-based clustering using graphics processors (2009)

Page 9: Tree-Based Density Clustering using Graphics Processors

Tree-Based Clustering: Mr. Scan

9Tree-Based Density Clustering using Graphics Processors

FE

BE BE BE

CP CP

BE

DBSCAN

Algorithm Steps

SpatialDecomp: CPU(@ FE)

DBSCAN: CPU or GPU(@ BE)

DrawBoundBox: CPU or GPU

MergeCluster: CPU (x #levels)

MergeCluster

Page 10: Tree-Based Density Clustering using Graphics Processors

Spatial Decomposition

10Tree-Based Density Clustering using Graphics Processors

1. Start with an input of Spatially Referenced points

2. Partition the region into equal sized density regions across one dimension

3. Add the shadow region area of one Epsilon to all density regions

Partition #1 Partition #2 Partition #3

Eps

Page 11: Tree-Based Density Clustering using Graphics Processors

DBSCAN - CPUo Run on local slice

for each BEo R* treeo Start at random

pointo Cluster w/ respect

to Eps, MinPtso Complexity: O(n

log n)11Tree-Based Density Clustering using Graphics

Processors

A B

C D E

R* Tree Example

Page 12: Tree-Based Density Clustering using Graphics Processors

GPU DBSCAN Filter

12Tree-Based Density Clustering using Graphics Processors

Candidate #1

Candidate #2

.

.Candidate #N

Candidate Clusters are potential clusters being explored concurrently in

the GPU

Number of cluster candidates is limited by GPU Characteristics

State array stores the current state of points in the search space

States

GPU DBSCAN operates similarly to the CPU version with two exceptions

Coarse grain search space

Chain Collisions causing cluster merges

CUDA-DCLUST [09 – Böhm] 

Page 13: Tree-Based Density Clustering using Graphics Processors

DrawBoundBox – CPU | | GPU

13Tree-Based Density Clustering using Graphics Processors

Page 14: Tree-Based Density Clustering using Graphics Processors

MergeBoundBox - CPUo Checks for merge if box within

shadowo At least one point MUST be in

commono Iterate through ALL points in right

cluster

14Tree-Based Density Clustering using Graphics Processors

Match!

Page 15: Tree-Based Density Clustering using Graphics Processors

The Tweet Stream

15Tree-Based Density Clustering using Graphics Processors

`

`

Source: Twitter, Map: About.com

Page 16: Tree-Based Density Clustering using Graphics Processors

Evaluationo Dataset: 1-3 “Tweet Days”o Measuring:

o Time to completionoQuality compared to single-threaded

DBSCANo Algorithms:

o Single-Threaded DBSCANoDBDCoMRNet w/DBSCAN filteroMRNet w/DBSCAN GPU filter

16Tree-Based Density Clustering using Graphics Processors

Page 17: Tree-Based Density Clustering using Graphics Processors

Results

One Day Three Day0

5

10

15

20

25

30

ELKISingle ThreadCPU - MR. ScanGPU - MR. Scan

Tim

e (H

ours

)

17Tree-Based Density Clustering using Graphics Processors

Page 18: Tree-Based Density Clustering using Graphics Processors

Results

1x4 1x16 1x2x160

5000

10000

15000

20000

25000

DecompGPU RunCPU Run

Topology

Runn

ing

Tim

e (s

ec)

18Tree-Based Density Clustering using Graphics Processors

Page 19: Tree-Based Density Clustering using Graphics Processors

Results

19Tree-Based Density Clustering using Graphics Processors

Page 20: Tree-Based Density Clustering using Graphics Processors

Quality

20Tree-Based Density Clustering using Graphics Processors

0 10 20 30 40 50 6070%

75%

80%

85%

90%

95%

100%Quality of 1 tweet day

Mr. Scan w/CPU; 0 in-ternal nodesMr. Scan w/CPU; 2 in-ternal nodesMr. Scan w/GPU; 0 in-ternal nodesMr. Scan w/GPU; 2 in-ternal nodesDBDC

# Backend Nodes

Qua

lity

Page 21: Tree-Based Density Clustering using Graphics Processors

Future Worko Scaling Issues

o Spatial DecompositionoMerging Algorithm

21Tree-Based Density Clustering using Graphics Processors

Page 22: Tree-Based Density Clustering using Graphics Processors

2D Spatial Decomposition

22Tree-Based Density Clustering using Graphics Processors

- 1D Spatial Decomposition has some severe limitations

7pts 11pts

- Splits can have wildly differing point counts- Number of splits limited by Epsilon

Eps

- 2D Spatial Decomposition would allow for more Splits with more equal point counts

Page 23: Tree-Based Density Clustering using Graphics Processors

Merging Algorithmo Bounding Box

oNo quality degradation

o Limits iteration, but still iterates

23Tree-Based Density Clustering using Graphics Processors

oPossible Alternative: Concave HulloDBSCAN on border

pointsoAvoids iterationoConstant output

sizeoO(n2) on BEs

Page 24: Tree-Based Density Clustering using Graphics Processors

Use Caseso Twitter Data

o “Flu Tweets”oMood/Topic clusteringoRiot prediction

o Any Spatial dataoCurrently limited to

2D

24Tree-Based Density Clustering using Graphics Processors

Page 25: Tree-Based Density Clustering using Graphics Processors

Conclusiono DBSCAN performance scaleso Quality able to be maintainedo Goal: Scale O(100,000 nodes)

25Tree-Based Density Clustering using Graphics Processors