Data Management Techniques
Sung-Eui Yoon, KAIST
URL: http://jupiter.kaist.ac.kr/~sungeui/
Data Avalanche (or Data Explosions)
There is too much data out there!
www.cs.umd.edu/class/spring2001/cmsc838b/Project/Parija_Spacco/images/
Geometric Data Avalanche
● Massive geometric data
  ● Due to advances in modeling, simulation, and data capture techniques
● Time-varying data (4D data sets)
CAD Model: Double Eagle Oil Tanker
82 million triangles (4 gigabyte)
CAD Model: Boeing 777
Ray Tracing Boeing 777, 470 million triangles
Excerpted from SIGGRAPH course note on massive model rendering
Scanned Model: St. Matthew Model
372 million triangles (10GB) www.cyberware.com
Possible Solutions?
● Will hardware improvements address the data avalanche?
  ● Moore's law: the number of transistors roughly doubles every 18 months
Current Architecture Trends
[Figure: accumulated growth rate during 1999~2009 (log scale), with curves labeled "access speed" and "disk access speed"]
Data access time becomes the major computational bottleneck!
Four Orthogonal Approaches
● Cache-coherent layouts
● Random-accessible compressed meshes
● Cache-oblivious ray reordering
● Hybrid parallel continuous collision detection
Overview
● Cache-coherent layouts
● Random-accessible compressed meshes
● Cache-oblivious ray reordering
● Hybrid parallel continuous collision detection
Cache-Coherent Layouts of Meshes
● One-dimensional data layout of a mesh
  ● Reduces the number of cache misses
● Cache-aware or cache-oblivious layouts
  ● Minimize the number of cache misses for specific or various cache parameters (e.g., cache block size)
[Yoon et al. SIG05, VIS06, Euro06]
[Figure: a mesh with vertices va, vb, vc, vd and its one-dimensional layout: va vb vd vc]
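To make the layout idea concrete, the following minimal sketch compares two one-dimensional layouts of a small grid mesh. It is not the layout algorithm of Yoon et al. and not the OpenCCL API. The grid mesh, the Z-order (Morton) layout used as a stand-in for a cache-coherent layout, and the cost measure (the number of mesh edges whose endpoints land in different blocks) are all assumptions chosen for illustration.

```python
def grid_edges(n):
    """Edges of an n x n grid mesh; vertices are (x, y) tuples."""
    edges = []
    for x in range(n):
        for y in range(n):
            if x + 1 < n:
                edges.append(((x, y), (x + 1, y)))
            if y + 1 < n:
                edges.append(((x, y), (x, y + 1)))
    return edges


def morton(x, y, bits=16):
    """Interleave the bits of x and y to get a Z-order (Morton) key."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def layout_positions(vertices, key):
    """Map each vertex to its index in the 1D layout sorted by `key`."""
    ordered = sorted(vertices, key=key)
    return {v: i for i, v in enumerate(ordered)}


def cross_block_edges(edges, position, block_size):
    """Count edges whose two endpoints are stored in different blocks."""
    return sum(
        1 for a, b in edges
        if position[a] // block_size != position[b] // block_size
    )


n = 8
vertices = [(x, y) for x in range(n) for y in range(n)]
edges = grid_edges(n)
row_major = layout_positions(vertices, key=lambda v: v[1] * n + v[0])
z_order = layout_positions(vertices, key=lambda v: morton(v[0], v[1]))

# Print: block size, cross-block edges (row-major), cross-block edges (Z-order)
for block_size in (4, 16):
    print(block_size,
          cross_block_edges(edges, row_major, block_size),
          cross_block_edges(edges, z_order, block_size))
```

For this 8 x 8 grid, the Z-order layout produces fewer cross-block edges than the row-major layout for both block sizes, even though it is not tuned to either one. That is the kind of benefit a cache-oblivious layout aims for.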
Block-based I/O Model [Aggarwal and Vitter 88]
[Figure: CPU or GPU, fast memory or cache, slow memory, and disk, connected by block transfers; access times labeled in the figure: 1 sec, 10^-4 sec, 10^-6 sec]
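In a block-based I/O model, cost is measured in block transfers between the slow and fast memory rather than in individual accesses. The sketch below counts those transfers for a given access sequence. The function name, the LRU replacement policy, and the example array are assumptions for illustration and are not taken from the slides or from Aggarwal and Vitter.

```python
from collections import OrderedDict


def count_block_transfers(accesses, block_size, num_blocks):
    """Count block transfers for a sequence of element addresses.

    The fast memory holds `num_blocks` blocks of `block_size` elements
    and evicts the least recently used block (an assumed policy).
    """
    cache = OrderedDict()  # block id -> True, ordered by recency
    transfers = 0
    for addr in accesses:
        block = addr // block_size
        if block in cache:
            cache.move_to_end(block)       # hit: refresh recency
        else:
            transfers += 1                 # miss: fetch the whole block
            cache[block] = True
            if len(cache) > num_blocks:
                cache.popitem(last=False)  # evict least recently used
    return transfers


# An 8 x 8 array stored row by row, visited in two different orders.
row_order = [r * 8 + c for r in range(8) for c in range(8)]
column_order = [r * 8 + c for c in range(8) for r in range(8)]

print(count_block_transfers(row_order, block_size=8, num_blocks=4))
print(count_block_transfers(column_order, block_size=8, num_blocks=4))
```

Both traversals touch the same 64 elements, but the row-order traversal needs 8 block transfers while the column-order traversal needs 64, because consecutive accesses in the second case never fall in a block that is still cached.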
Applications
● View-dependent meshes
  ● View-dependent rendering
● Triangle meshes
  ● Isocontour extractions
● Hierarchies
  ● Ray tracing
  ● Collision detection
View-Dependent Rendering using LODs
Improving GPU vertex cache utilization
GeForce 6800 (January 2005)
Applications
● View-dependent meshes
  ● View-dependent rendering
● Triangle meshes
  ● Isocontour extractions
● Hierarchies
  ● Ray tracing
  ● Collision detection
Puget Sound, 134 M triangles
Isocontour z(x,y) = 500 m
Achieves up to 20X improvement on isocontouring
Applications
● View-dependent meshes
  ● View-dependent rendering
● Triangle meshes
  ● Isocontour extractions
● Hierarchies
  ● Ray tracing
  ● Collision detection
Achieve 30% ~ 300% performance improvement
Advantages
● General
  ● Works well for various applications
● Cache-oblivious
  ● Can benefit all levels of the memory hierarchy (e.g., CPU/GPU caches, memory, and disk)
● No modification of runtime applications
  ● Only layout computation
Source code is available as a library called OpenCCL
Overview
● Cache-coherent layouts
● Random-accessible compressed meshes
● Cache-oblivious ray reordering
● Hybrid parallel continuous collision detection
Random-Accessible Compressed Data
● Compression methods for meshes and hierarchies
  ● Reduce the memory requirements
  ● Support random access on meshes and hierarchies
  ● Can be useful to many different applications
[Kim et al. Tech. Report 09; Kim et al., TVCG 09; Yoon and Lindstrom, VIS 07]
Hierarchical-Culling oriented Compact Meshes (HCCMeshes)
● Consists of two parts:
  ● i-HCCMeshes (in-core representation)
  ● o-HCCMeshes (out-of-core representation)
Data Access Framework
[Figure: the user sends a request to a data pool held in main memory and receives the data]
Data Access Framework: Out-of-Core Technique
[Figure: the user requests data; main memory holds cached data, and the data pool of clusters c0 ... cn resides on an external drive; a cluster ID is sent to the external drive and the cluster is returned]
HCCMeshes
[Figure: the out-of-core framework extended with compression. Clusters c0 ... cm are stored as compressed data; requests by cluster ID pass through two decompression (Decomp.) stages, labeled o-HCCMesh and i-HCCMesh, between the external drive, the cached data in main memory, and the user]
Support hierarchical random access!
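The data access framework above can be sketched as a small cluster store that keeps every cluster compressed and decompresses a cluster only when it is requested by ID. This is an illustration under stated assumptions, not the HCCMesh codec or the OpenRACM API: the class name `CompressedClusterStore` is hypothetical, `zlib` stands in for the mesh compression scheme, an in-memory list stands in for the external drive, and the LRU cache policy is assumed.

```python
import zlib
from collections import OrderedDict


class CompressedClusterStore:
    """Hypothetical cluster store with on-demand decompression."""

    def __init__(self, clusters, cache_capacity):
        # Stand-in for the external drive: each cluster is compressed
        # independently, so any cluster can be decoded on its own.
        self._compressed = [zlib.compress(c) for c in clusters]
        self._cache = OrderedDict()  # cluster id -> decompressed bytes
        self._capacity = cache_capacity
        self.decompressions = 0

    def get(self, cluster_id):
        """Return the decompressed cluster, decoding it only on a miss."""
        if cluster_id in self._cache:
            self._cache.move_to_end(cluster_id)
            return self._cache[cluster_id]
        data = zlib.decompress(self._compressed[cluster_id])
        self.decompressions += 1
        self._cache[cluster_id] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict least recently used
        return data


# Eight toy clusters of 1 KB each; only two fit in the cache at once.
clusters = [bytes([i]) * 1024 for i in range(8)]
store = CompressedClusterStore(clusters, cache_capacity=2)
for cluster_id in (3, 3, 5, 3):
    store.get(cluster_id)
print(store.decompressions)
```

Compressing each cluster independently is what makes random access possible in this sketch: a request for one cluster never requires decoding the others.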
Main Benefits
● Use less memory and a smaller working set
  ● o-HCCMeshes have 20:1 compression ratios
  ● i-HCCMeshes have 6:1 compression ratios
● Improve runtime performance
Applications
● Whitted-style ray tracing
● LOD-based ray tracing
● Collision detection
● Photon mapping
● Non-photorealistic rendering
Source code is available as OpenRACM
Results
Overview
● Cache-coherent layouts
● Random-accessible compressed meshes
● Cache-oblivious ray reordering
● Hybrid parallel continuous collision detection
Challenges
● Secondary rays show low ray coherence
  ● This results in low cache utilization
● When ray tracing massive models, expensive cache misses occur (e.g., in L1/L2 caches and main memory)
Landscape (> 1000 M)
St. Matthew (372 M)
Goal
● Design an efficient algorithm for reordering incoherent secondary rays into coherent ones
● Achieve high cache coherence for these rays
● Improve the performance of ray tracing
Ray Reordering Framework
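[Figure: ray reordering framework with the stages ray generation (from camera information), ray reordering, and ray processing; a ray buffer; hit points and material information; and scene information stored across caches (L1), main memory, and disk]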
[Moon et al., under review]
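One way to picture the ray reordering stage is to sort the rays in the ray buffer so that rays likely to touch nearby scene data are processed together. The sketch below does this with a Z-order (Morton) key computed from an approximate hit point per ray. It is an illustration only, not the method of Moon et al.: the function names, the quantization to a 10-bit grid, and the assumption that approximate hit points are already available are all hypothetical.

```python
def morton3(ix, iy, iz, bits=10):
    """Interleave the bits of three grid coordinates into a Z-order key."""
    code = 0
    for i in range(bits):
        code |= ((ix >> i) & 1) << (3 * i)
        code |= ((iy >> i) & 1) << (3 * i + 1)
        code |= ((iz >> i) & 1) << (3 * i + 2)
    return code


def reorder_rays(rays, hit_points, scene_min, scene_max, bits=10):
    """Return the rays sorted by the Z-order key of their hit points.

    `hit_points[i]` is an (x, y, z) estimate of where ray i hits the
    scene; `scene_min` and `scene_max` bound the scene on each axis.
    """
    cells = (1 << bits) - 1

    def key(point):
        quantized = []
        for c, lo, hi in zip(point, scene_min, scene_max):
            t = (c - lo) / (hi - lo)  # normalize to [0, 1]
            quantized.append(min(cells, max(0, int(t * cells))))
        return morton3(quantized[0], quantized[1], quantized[2], bits)

    order = sorted(range(len(rays)), key=lambda i: key(hit_points[i]))
    return [rays[i] for i in order]


# Four rays whose hit points fall into two distant regions of the scene.
rays = ["a", "b", "c", "d"]
hits = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.95, 0.9, 0.9), (0.1, 0.12, 0.1)]
print(reorder_rays(rays, hits, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)))
```

After sorting, rays "b" and "d" (which hit near one corner) are adjacent, as are "a" and "c" (which hit near the opposite corner), so the scene data each pair needs is accessed back to back.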
Applications
● Path tracing
● Photon mapping
Result – Path Tracing (Video)
● 104 M triangles (12.8 GB)
● 512 x 512 resolution
● 100 paths
● 8 area lights
Result – Photon Mapping
● 128 M triangles (15.7 GB)
● Cache holds 19% of all the data
● 4 area lights
● 13X speedup
Overview
● Cache-coherent layouts
● Random-accessible compressed meshes
● Cache-oblivious ray reordering
● Hybrid parallel continuous collision detection
Collision Detection
● Collision detection is used in various fields
  ● Games, movies, scientific simulation, and robotics
[Figures from PIXAR, C. Lauterbach, and AION]
Discrete VS Continuous
Discrete collision detection (DCD)
[Figure: objects at time step (i-1) and time step (i)]
Discrete VS Continuous
Continuous collision detection (CCD)
[Figure: objects at time step (i-1) and time step (i)]
Discrete VS Continuous
Discrete collision detection (DCD)
[Figure: objects at time step (i-1) and time step (i), marked with "?"]
Discrete VS Continuous
                    Continuous CD    Discrete CD
Accuracy            Accurate         May miss collisions
Computation time    Expensive        Very fast
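The accuracy difference in the comparison above can be shown with a one-dimensional toy case: a point moving linearly during one time step toward a thin wall. The function names and the setup are assumptions for illustration, not code from the talk or from OpenCCD. The discrete test only inspects the position at the end of the step, while the continuous test checks the whole swept interval and reports the time of first contact.

```python
def discrete_hit(x_end, wall_lo, wall_hi):
    """Discrete test: is the point inside the wall at the end of the step?"""
    return wall_lo <= x_end <= wall_hi


def continuous_hit(x_start, x_end, wall_lo, wall_hi):
    """Continuous test for linear motion x(t) = x_start + t * (x_end - x_start).

    Returns the first contact time t in [0, 1], or None if the point
    never touches the wall during the step.
    """
    if wall_lo <= x_start <= wall_hi:
        return 0.0                      # already in contact
    if x_start == x_end:
        return None                     # not moving and not in contact
    # The first face the point can reach depends on which side it starts.
    target = wall_lo if x_start < wall_lo else wall_hi
    t = (target - x_start) / (x_end - x_start)
    return t if 0.0 <= t <= 1.0 else None


# A fast point moves from 0 to 10 in one step; the wall spans [4.0, 4.5].
print(discrete_hit(10.0, 4.0, 4.5))
print(continuous_hit(0.0, 10.0, 4.0, 4.5))
```

Here the discrete test reports no collision because the point has already passed through the wall by the end of the step, whereas the continuous test finds the contact partway through the step.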
Motivation
● Continuous collision detection
  ● Accurate, but slow for complex models
● Hardware trend
  ● CPUs and GPUs are increasing the number of cores
  ● Heterogeneous architectures
    ● Intel Larrabee architecture
● Previous approaches
  ● Utilize either multi-core CPUs or GPUs
  ● Not enough performance for interactive applications
Hybrid Parallel CCD [Kim et al. PG 09]
● Takes advantage of both:
  ● Multi-core CPU architectures
  ● GPU architectures
● Achieves interactive performance for various deforming models consisting of tens or hundreds of thousands of triangles
[Figure: CCD distributed across multiple multi-core CPUs and multiple GPUs]
Results
● Performance of HPCCD utilizing both CPUs and GPUs
Source code is available as a library called OpenCCD
Results
Conclusions
● Data explosion and the lower growth rate of data access speed
● Discussed four different techniques as data management methods
  ● Cache-coherent layouts
  ● Random-accessible compressed data
  ● Cache-oblivious ray reordering
  ● Hybrid continuous collision detection
● Applied to rendering and collision detection
  ● Observed meaningful performance improvement
Acknowledgements
● Research collaborators
  ● TaeJoon Kim, DukSu Kim, Pio Claudio, BooChang Moon, YongYoung Byun, JaePil Heo, SeungYong Lee, YongJin Kim, JaeHyuk Heo, John Kim, Peter Lindstrom, Valerio Pascucci, Dinesh Manocha
● Funding sources
  ● Microsoft Research Asia
  ● KAIST seed grant
  ● Ministry of Knowledge Economy
  ● Samsung
  ● Korea Research Foundation