Fast Computation of Database Operations using Graphics Processors
Naga K. Govindaraju
Univ. of North Carolina
Modified by Mahendra Chavan for CS632

Goal
• Utilize graphics processors for fast computation of common database operations

Motivation: Fast operations
• Increasing database sizes
• Faster processor speeds but little improvement in query execution time
– Memory stalls
– Branch mispredictions
– Resource stalls, e.g. instruction dependencies
• Utilize the available architectural features and exploit parallel execution possibilities

Graphics Processors
• Present in almost every PC
• Have multiple vertex and pixel processing engines running in parallel
• Can process tens of millions of geometric primitives per second
• Peak GPU performance is increasing at a rate of 2.5-3 times a year
• Programmable: fragment programs are executed on the pixel processing engines

Main Contributions
• Algorithms for predicates, boolean combinations and aggregations
• Utilize SIMD capabilities of pixel processing engines
• The authors apply these algorithms to selection queries on one or more attributes and to aggregate queries

Related Work
• Hardware acceleration for DB operations
– Vector processors for relational DB operations [Meki and Kambayashi 2000]
– SIMD instructions for relational DB operations [Zhou and Ross 2002]
– GPUs for spatial selections and joins [Sun et al. 2003]

Graphics Processors: Design Issues
• Programming model is limited due to lack of random-access writes
– Design algorithms that avoid data rearrangements
• Programmable pipeline has poor branching
– Design algorithms without branching in the programmable pipeline; evaluate branches using fixed-function tests

Frame Buffer
• Pixels are stored on the graphics card in a frame buffer
• The frame buffer is conceptually divided into:
• Color Buffer
– Stores the color components of each pixel in the frame buffer
• Depth Buffer
– Stores the depth value associated with each pixel; the depth is used to determine surface visibility
• Stencil Buffer
– Stores a stencil value for each pixel; called a stencil because it is typically used for enabling/disabling writes to the frame buffer

Graphics Pipeline
[Figure: pipeline stages, in order: Vertices, Vertex Processing Engine, Setup Engine, Pixel Processing Engine, Alpha Test, Stencil Test, Depth Test]

Graphics Pipeline
• Vertex Processing Engine
– Transforms vertices to points on the screen
• Setup Engine
– Generates information (color, depth, etc.) associated with the primitive's vertices
• Pixel Processing Engines
– Fragment processors; perform a series of tests before writing the fragments to the frame buffer

Pixel Processing Engines
• Alpha Test
– Compares a fragment's alpha value to a user-specified reference value
• Stencil Test
– Compares the stencil value of a fragment's pixel to a user-specified reference value
• Depth Test
– Compares the depth value of the fragment to the reference depth value

Operators
• =
• <
• >
• <=
• >=
• Never
• Always

Fragment Programs
• Users can supply custom fragment programs that are executed on each fragment

Occlusion Query
• Returns the number of fragments that pass the different tests

[Figure: Radeon R770 GPU by AMD Graphics Product Group]

Data Representation on GPUs
• Textures: 2D arrays that may have multiple channels
• Data is stored in textures in floating-point formats
• To perform computations on the values: render a quadrilateral to generate fragments, run fragment programs on them, and perform the tests

Stencil Tests
• Fragments failing the stencil test are rejected from the rasterization pipeline
• Stencil operations
– KEEP: keep the stencil value in the stencil buffer
– INCR: stencil value++
– DECR: stencil value--
– ZERO: stencil value = 0
– REPLACE: stencil value = reference value
– INVERT: bitwise invert (stencil value)

Stencil and Depth Tests
• The StencilOp routine can be set up with three operations: StencilOp(Op1, Op2, Op3)
• Each fragment has three possible outcomes; the stencil operation corresponding to the outcome is executed
– Op1: when a fragment fails the stencil test
– Op2: when a fragment passes the stencil test but fails the depth test
– Op3: when a fragment passes both the stencil and depth tests
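The three-outcome rule can be sketched on the CPU as a small Python simulation. This is only an illustration, not GPU code: the function name `apply_stencil_ops`, the 8-bit stencil range with clamping, and the use of an equality stencil test are assumptions made for the example.

```python
# Assumption: an 8-bit stencil buffer whose INCR/DECR clamp at 255 and 0.
STENCIL_MAX = 255

# Each operation maps (current stencil value, reference value) to a new value.
STENCIL_OPS = {
    "KEEP": lambda s, ref: s,
    "INCR": lambda s, ref: min(s + 1, STENCIL_MAX),
    "DECR": lambda s, ref: max(s - 1, 0),
    "ZERO": lambda s, ref: 0,
    "REPLACE": lambda s, ref: ref,
    "INVERT": lambda s, ref: ~s & STENCIL_MAX,
}


def apply_stencil_ops(stencil, ref, depth_pass, ops):
    """Return the new stencil value for one fragment.

    ops is (op1, op2, op3): op1 applies when the stencil test fails,
    op2 when it passes but the depth test fails, op3 when both pass.
    The stencil test here is "stencil value equals ref" (an assumption).
    """
    op1, op2, op3 = ops
    if stencil != ref:
        chosen = op1
    elif not depth_pass:
        chosen = op2
    else:
        chosen = op3
    return STENCIL_OPS[chosen](stencil, ref)


if __name__ == "__main__":
    ops = ("KEEP", "KEEP", "INCR")
    print(apply_stencil_ops(1, 1, True, ops))   # both tests pass
    print(apply_stencil_ops(1, 1, False, ops))  # depth test fails
    print(apply_stencil_ops(0, 1, True, ops))   # stencil test fails
```

With `("KEEP", "KEEP", "INCR")`, only fragments that pass both tests change their stencil value, which is the pattern the Boolean-combination slides rely on.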

Outline
• Database Operations on GPUs
• Implementation & Results
• Analysis
• Conclusions

Overview
• Database operations require comparisons
• Utilize depth test functionality of GPUs for performing comparisons
– Implements all possible comparisons: <, <=, >=, >, ==, !=, ALWAYS, NEVER
• Utilize stencil test for data validation and storing results of comparison operations

Basic Operations
Basic SQL query:
SELECT A
FROM T
WHERE C
A = attributes or aggregations (SUM, COUNT, MAX, etc.)
T = relational table
C = Boolean combination of predicates (using operators AND, OR, NOT)

Outline: Database Operations
• Predicate Evaluation
– (a op constant): depth test and stencil test
– (a op b) = (a - b op 0): can be executed on GPUs
• Boolean Combinations of Predicates
– Express as CNF and repeatedly use stencil tests
• Aggregations
– Occlusion queries

Basic Operations
• Predicates: a_i op constant, or a_i op a_j
– op is one of <, >, <=, >=, !=, =, TRUE, FALSE
• Boolean combinations: Conjunctive Normal Form (CNF) expression evaluation
• Aggregations: COUNT, SUM, MAX, MEDIAN, AVG

Predicate Evaluation
• a_i op constant (d)
– Copy the attribute values a_i into the depth buffer
– Define the comparison operation using the depth test
– Draw a screen-filling quad at depth d
[Figure: a screen-filling quad at depth d is compared against the value a_i stored at each pixel P of the screen]
If (a_i op d)
pass fragment
Else
reject fragment
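A minimal CPU-side sketch of this predicate evaluation, assuming the depth buffer can be modelled as a plain Python list holding one attribute value per pixel. The names `DEPTH_FUNCS` and `evaluate_predicate` are hypothetical, and the comparison is written in the slide's orientation (a_i op d) rather than in the argument order a real graphics API would use.

```python
import operator

# Comparison operators a depth test can be configured with (slide notation).
DEPTH_FUNCS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
    "!=": operator.ne,
}


def evaluate_predicate(depth_buffer, op, d):
    """Simulate drawing a screen-filling quad at depth d.

    depth_buffer holds one attribute value a_i per pixel. The result has
    True where the fragment passes (a_i op d) and False where it is rejected.
    """
    compare = DEPTH_FUNCS[op]
    return [compare(a_i, d) for a_i in depth_buffer]


if __name__ == "__main__":
    passed = evaluate_predicate([3, 7, 10, 15], ">=", 10)
    print(passed)
    print(sum(passed))  # number of records satisfying the predicate
```

On the GPU the per-pixel comparisons run in parallel on the pixel processing engines; the list comprehension here only mirrors the logic.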

Predicate Evaluation
• a_i op a_j
– Treat as (a_i - a_j) op 0
• Semi-linear queries
– Defined as a linear combination of attribute values compared against a constant
– The linear combination is computed as a dot product of two vectors
– Utilize the vector processing capabilities of GPUs
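A semi-linear query can be sketched in Python as a dot product followed by a comparison. The function name `semi_linear_query`, the coefficient vector `coeffs`, and the sample rows are assumptions for illustration; on a GPU the dot product would be evaluated in a fragment program.

```python
import operator


def semi_linear_query(rows, coeffs, op, b):
    """Evaluate (coeffs . row) op b for every row.

    rows   : list of attribute tuples, one per record
    coeffs : coefficient vector of the linear combination
    op     : comparison function, e.g. operator.gt
    b      : constant to compare against
    """
    results = []
    for row in rows:
        value = sum(s * a for s, a in zip(coeffs, row))  # dot product
        results.append(op(value, b))
    return results


if __name__ == "__main__":
    rows = [(1, 2), (3, 1), (0, 5)]
    # 2*a_1 - a_2 > 0
    print(semi_linear_query(rows, (2, -1), operator.gt, 0))
    # a_1 < a_2 written as (a_1 - a_2) < 0
    print(semi_linear_query(rows, (1, -1), operator.lt, 0))
```

The second call shows that a_i op a_j is the special case with coefficients (1, -1) and constant 0.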

Data Validation
• Performed using the stencil test
• Stencil values of valid data values are set to a given value "s"
• Stencil values of data values that fail predicate evaluation are set to zero

Boolean Combinations
• Expression provided as a CNF
• CNF is of the form (A_1 AND A_2 AND … AND A_k), where A_i = (B^i_1 OR B^i_2 OR … OR B^i_{m_i})
• The CNF is assumed to contain no NOT operator
– If a predicate is negated, invert its comparison operation to eliminate the NOT
– E.g. NOT (a_i < d) => (a_i >= d)

Boolean Combination
• We will focus on (A_1 AND A_2)
• This covers all cases
– A_1 = (TRUE AND A_1)
– If E_i = (A_1 AND A_2 AND … AND A_{i-1} AND A_i), then E_i = (E_{i-1} AND A_i)

• Clear stencil value to 1
• For each A_i, i = 1, …, k
• do
– if (mod(i, 2)) /* valid stencil value is 1 */
• Stencil test to pass if stencil value is equal to 1
• StencilOp(KEEP, KEEP, INCR)
– else /* valid stencil value is 2 */
• Stencil test to pass if stencil value is equal to 2
• StencilOp(KEEP, KEEP, DECR)
– endif
– For each B^i_j, j = 1, …, m_i
– do
• Perform B^i_j using COMPARE /* depth test */
– end for
– if (mod(i, 2)) /* valid stencil value is 2 */
• If stencil value on screen is 1, REPLACE with 0
– else /* valid stencil value is 1 */
• If stencil value on screen is 2, REPLACE with 0
– endif
• End For
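The routine above can be simulated on the CPU to check the stencil bookkeeping. This is a sketch under stated assumptions, not the GPU implementation: records are plain dicts, each clause is a list of hypothetical `(attribute, comparison, constant)` triples, and the inner loop over records stands in for the per-pixel parallelism.

```python
import operator


def eval_cnf(records, clauses):
    """Simulate the stencil-based CNF evaluation.

    records : list of dicts mapping attribute name to value
    clauses : list of clauses A_i; each clause is a list of predicates
              (attr, compare, const) that are ORed together
    Returns one boolean per record: True if the whole CNF holds.
    """
    stencil = [1] * len(records)                 # clear stencil to 1
    for i, clause in enumerate(clauses, start=1):
        if i % 2:                                # valid value 1, INCR on pass
            valid, step = 1, +1
        else:                                    # valid value 2, DECR on pass
            valid, step = 2, -1
        for attr, compare, const in clause:
            for p, rec in enumerate(records):
                # stencil test: only still-valid pixels are considered;
                # depth test: the predicate itself
                if stencil[p] == valid and compare(rec[attr], const):
                    stencil[p] += step
        # pixels that satisfied no predicate of this clause become invalid
        for p in range(len(records)):
            if stencil[p] == valid:
                stencil[p] = 0
    return [s != 0 for s in stencil]


if __name__ == "__main__":
    records = [
        {"a": 5, "b": 1},
        {"a": 15, "b": 1},
        {"a": 15, "b": 9},
        {"a": 25, "b": 9},
    ]
    # (a >= 10) AND (a <= 20 OR b > 5)
    cnf = [
        [("a", operator.ge, 10)],
        [("a", operator.le, 20), ("b", operator.gt, 5)],
    ]
    print(eval_cnf(records, cnf))
```

Because a pixel's stencil value changes as soon as one predicate of a clause passes, later predicates of the same clause fail the stencil test for that pixel, so no pixel is incremented or decremented twice.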

A1 AND A2
[Figure sequence: stencil values while evaluating A_1 AND A_2, where A_2 = (B^2_1 OR B^2_2 OR B^2_3)]
• Initially: stencil value = 1 everywhere
• After A_1: stencil value = 2 where A_1 holds, 0 elsewhere
• After B^2_1, B^2_2, B^2_3: St = 1 where A_1 AND B^2_j holds; pixels still at St = 2 are set to St = 0
• Result: St = 1 for A_1 AND B^2_1, A_1 AND B^2_2, A_1 AND B^2_3; St = 0 elsewhere

Range Query
• Compute a_i within [low, high]
– Evaluated as (a_i >= low) AND (a_i <= high)

Aggregations
• COUNT, MAX, MIN, SUM, AVG
• No data rearrangements

COUNT
• Use occlusion queries to get the pixel pass count
• Syntax:
– Begin occlusion query
– Perform database operation
– End occlusion query
– Get count of number of attributes that passed the database operation
• Involves no additional overhead!

MAX, MIN, MEDIAN
• We compute the k-th largest number
• Traditional algorithms require data rearrangements
• We perform no data rearrangements and no frame buffer readbacks

K-th Largest Number
• Say v_k is the k-th largest number
• How do we generate a number m equal to v_k?
– Without knowing v_k's bit representation, and using comparisons

Our Algorithm
• b_max = maximum number of bits in the values in tex
• x = 0
• For i = b_max-1 down to 0
– count = Compare(tex >= x + 2^i)
– If count > k-1
• x = x + 2^i
• Return x
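The algorithm translates almost line for line into Python. In this sketch the `Compare` step, which on the GPU would be a depth test counted by an occlusion query, is modelled as a plain count over a list; the function name `kth_largest` and the sample data are assumptions for illustration.

```python
def kth_largest(values, k, b_max):
    """Return the k-th largest value using only >= comparisons and counts.

    values : non-negative integers that fit in b_max bits
    k      : 1 for the maximum, len(values) for the minimum
    """
    x = 0
    for i in range(b_max - 1, -1, -1):
        candidate = x + (1 << i)
        # Stand-in for Compare(tex >= x + 2^i): count the passing values.
        count = sum(1 for v in values if v >= candidate)
        if count > k - 1:
            x = candidate      # bit i of the answer is 1
    return x


if __name__ == "__main__":
    data = [10, 24, 37, 99, 192, 200, 200, 232]
    print(kth_largest(data, 1, 8))          # maximum
    print(kth_largest(data, len(data), 8))  # minimum
    print(kth_largest(data, 4, 8))          # 4th largest
```

The loop builds the answer one bit at a time from the most significant bit, so it needs exactly b_max comparison passes regardless of the number of values.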

K-th Largest Number
• Lemma: Let v_k be the k-th largest number. Let count be the number of values >= m
– If count > (k-1): m <= v_k
– If count <= (k-1): m > v_k
• Apply the earlier algorithm, ensuring that count > (k-1)

Example
• v_k = 11101001, m = 00000000
• m = 10000000: m <= v_k
• m = 11000000: m <= v_k
• m = 11100000: m <= v_k
• m = 11110000: m > v_k; make the bit 0, m = 11100000
• m = 11101000: m <= v_k
• m = 11101100: m > v_k; make this bit 0, m = 11101000
• m = 11101010: m > v_k; m = 11101000
• m = 11101001: m <= v_k

Example
• Integers ranging from 0 to 255
• Represent them in the depth buffer
– Idea: use depth functions to perform comparisons
– Use NV_occlusion_query to determine the maximum

Example: Parallel Max
• S = {10, 24, 37, 99, 192, 200, 200, 232}
• Step 1: Draw quad at 128; values that pass: 192, 200, 200, 232
• Step 2: Draw quad at 192; values that pass: 192, 200, 200, 232
• Step 3: Draw quad at 224; value that passes: 232
• Step 4: Draw quad at 240; no values pass
• Step 5: Draw quad at 232; value that passes: 232
• Steps 6, 7, 8: Draw quads at 236, 234, 233; no values pass
• Max is 232
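The sequence of quad depths in this example can be reproduced with a short Python trace. The sketch assumes that a value "passes" when it is greater than or equal to the quad depth, and that the occlusion query is modelled by collecting the passing values; `parallel_max_trace` is a hypothetical name.

```python
def parallel_max_trace(values, bits=8):
    """Find the maximum bit by bit and record each quad that is drawn.

    Returns (maximum, trace), where trace is a list of
    (quad_depth, values_that_pass) pairs, one per step.
    """
    x = 0
    trace = []
    for i in range(bits - 1, -1, -1):
        depth = x + (1 << i)
        passed = [v for v in values if v >= depth]
        trace.append((depth, passed))
        if passed:             # at least one value passes: keep the bit
            x = depth
    return x, trace


if __name__ == "__main__":
    s = [10, 24, 37, 99, 192, 200, 200, 232]
    maximum, trace = parallel_max_trace(s)
    for step, (depth, passed) in enumerate(trace, start=1):
        print(f"Step {step}: quad at {depth}, passing values: {passed}")
    print("Max is", maximum)
```

Keeping a bit whenever any value passes is the k = 1 case of the k-th largest algorithm (count > k-1 becomes count > 0).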

SUM and AVG
• Mipmaps: multi-resolution textures consisting of multiple levels
• The highest level contains the average of all values at the lowest level
• SUM = AVG * COUNT
• Problems with mipmaps
– If we want the sum of a subset of values, we have to introduce conditions in the fragment programs
– Floating-point representations may cause precision problems
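The mipmap idea can be illustrated with a small Python sketch that repeatedly averages 2x2 blocks until a single value remains. It assumes a square texture whose side is a power of two and uses nested lists in place of a GPU texture; `mipmap_average` is a hypothetical name.

```python
def mipmap_average(texture):
    """Average all texels by building successive 2x2-reduced levels.

    texture : square list of lists whose side length is a power of two
    Returns the single value at the highest (1x1) level.
    """
    level = [list(row) for row in texture]
    while len(level) > 1:
        half = len(level) // 2
        level = [
            [
                (level[2 * r][2 * c] + level[2 * r][2 * c + 1]
                 + level[2 * r + 1][2 * c] + level[2 * r + 1][2 * c + 1]) / 4.0
                for c in range(half)
            ]
            for r in range(half)
        ]
    return level[0][0]


if __name__ == "__main__":
    tex = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
    avg = mipmap_average(tex)
    count = 16
    print(avg)           # AVG
    print(avg * count)   # SUM = AVG * COUNT
```

Every level stores floating-point averages, which is where the precision concern on the slide comes from: rounding at each level can accumulate for large textures.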

Accumulator
• Data representation is of the form a_k 2^k + a_{k-1} 2^{k-1} + … + a_0
• Sum = sum(a_k) 2^k + sum(a_{k-1}) 2^{k-1} + … + sum(a_0)
• Current GPUs support no bit-masking operations
• AVG = SUM / COUNT
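The bit-plane decomposition of the sum can be checked with a few lines of Python. This sketch uses CPU integer shifts to test each bit purely for illustration (the slide notes that GPUs of the time had no bit-masking operations); the per-bit count stands in for an occlusion query, and `accumulate_sum` is a hypothetical name.

```python
def accumulate_sum(values, num_bits):
    """Sum non-negative integers one bit position at a time.

    For each bit k, count how many values have that bit set and add
    count * 2^k to the total.
    """
    total = 0
    for k in range(num_bits):
        count = sum(1 for a in values if (a >> k) & 1)  # sum(a_k)
        total += count << k                             # count * 2^k
    return total


if __name__ == "__main__":
    data = [10, 24, 37, 99]
    total = accumulate_sum(data, 8)
    print(total)               # SUM
    print(total / len(data))   # AVG = SUM / COUNT
```

The number of passes equals the number of bits in the representation, not the number of values.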

TestBit
• Read the data value from the texture, say a_i
• F = frac(a_i / 2^(k+1))
• If F >= 0.5, then the k-th bit of a_i is 1
• Set F as the alpha value; the alpha test passes a fragment if its alpha value >= 0.5

Implementation
• Dell Precision Workstation with dual 2.8 GHz Xeon processors
• NVIDIA GeForce FX 5900 Ultra GPU
• 2 GB RAM
• CPU: Intel compiler 7.1 with hyperthreading, multi-threading, SIMD optimizations
• GPU: NVIDIA Cg compiler

Benchmarks
• TCP/IP database with 1 million records and four attributes
• Census database with 360K records
![Page 66: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/66.jpg)
Copy TimeCopy Time
![Page 67: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/67.jpg)
Predicate Evaluation (3 times faster)Predicate Evaluation (3 times faster)
![Page 68: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/68.jpg)
Range Query(5.5 times faster)Range Query(5.5 times faster)
![Page 69: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/69.jpg)
Multi-Attribute Query (2 times)Multi-Attribute Query (2 times)
![Page 70: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/70.jpg)
Semi-linear Query (9 times faster)Semi-linear Query (9 times faster)
![Page 71: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/71.jpg)
COUNTCOUNT
• Same timings for GPU implementation
![Page 72: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/72.jpg)
Kth-Largest for median(2.5 times)Kth-Largest for median(2.5 times)
![Page 73: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/73.jpg)
Kth-Largest
![Page 74: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/74.jpg)
Kth-Largest conditional
![Page 75: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/75.jpg)
Accumulator (20 times slower!)
![Page 76: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/76.jpg)
Outline
• Database Operations on GPUs
• Implementation & Results
• Analysis
• Conclusions
![Page 77: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/77.jpg)
Analysis: Issues
• Precision
– Currently the depth buffer has only 24-bit precision, which is inadequate
• Copy time
– Copy from texture to depth buffer: no mechanism in the GPU
• Integer arithmetic
– Not enough arithmetic instructions in the pixel processing engines
• Depth compare masking
– Useful to have a comparison mask for the depth function
![Page 78: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/78.jpg)
Analysis: Issues
• Memory management
– Current GPUs have 512 MB of video memory; we may use out-of-core techniques and swap
• No random writes
– No data rearrangements possible
![Page 79: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/79.jpg)
Analysis: Performance
• Relative Performance Gain
– High Performance: predicate evaluation, multi-attribute queries, semi-linear queries, count
– Medium Performance: Kth-largest number
– Low Performance: accumulator
![Page 80: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/80.jpg)
High Performance
• Parallel pixel processing engines
• Pipelining
– Multi-attribute queries benefit
• Early depth culling
– Fragments are culled before passing through the pixel processing engine
• Eliminate branch mispredictions
![Page 81: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/81.jpg)
Medium Performance
• Parallelism
• FX 5900 has a clock speed of 450 MHz and 8 pixel processing engines
• Rendering a single 1000x1000 quad takes 0.278 ms
• Rendering 19 such quads takes 5.28 ms; the observed time is 6.6 ms
• 80% efficiency in parallelism!
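The figures on this slide can be reproduced with a short calculation. The sketch below assumes each pixel processing engine retires one pixel per clock cycle (an assumption that matches the quoted 0.278 ms, not a statement from the slides); the variable names are illustrative.

```python
# Figures quoted on the slide for the FX 5900.
CLOCK_HZ = 450e6            # 450 MHz clock
PIXEL_ENGINES = 8           # parallel pixel processing engines
QUAD_PIXELS = 1000 * 1000   # one 1000x1000 quad
NUM_QUADS = 19
OBSERVED_MS = 6.6           # measured time reported on the slide

# Assumption: every engine processes one pixel per clock cycle.
ideal_quad_ms = QUAD_PIXELS / (CLOCK_HZ * PIXEL_ENGINES) * 1e3
ideal_total_ms = NUM_QUADS * ideal_quad_ms
efficiency = ideal_total_ms / OBSERVED_MS

print(f"one quad: {ideal_quad_ms:.3f} ms")
print(f"{NUM_QUADS} quads: {ideal_total_ms:.2f} ms")
print(f"efficiency: {efficiency:.0%}")
```

The efficiency is simply the ideal time divided by the observed time, 5.28 ms / 6.6 ms.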
![Page 82: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/82.jpg)
Low Performance
• No gain over SIMD-based CPU implementation
• Two main reasons:
– Lack of integer arithmetic
– Clock rate
![Page 83: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/83.jpg)
Outline
• Database Operations on GPUs
• Implementation & Results
• Analysis
• Conclusions
![Page 84: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/84.jpg)
Conclusions
• Novel algorithms to perform database operations on GPUs
– Evaluation of predicates, boolean combinations of predicates, aggregations
• Algorithms take into account GPU limitations
– No data rearrangements
– No frame buffer readbacks
![Page 85: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/85.jpg)
Conclusions
• Preliminary comparisons with optimized CPU implementations are promising
• Discussed possible improvements on GPUs
• GPU as a useful co-processor
![Page 86: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/86.jpg)
Relational Joins
• Modern GPUs have thread groups
• Each thread group has several threads
• Data-parallel primitives
– Map
– Scatter: scatters the data of a relation with respect to an array L
– Gather: the reverse of scatter
– Split: divides the relation into a number of disjoint partitions with a given partitioning function
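The four primitives can be illustrated with a minimal sequential sketch. Each loop iteration stands in for what one GPU thread would do; the function names, signatures, and the convention that L holds the output position of each input tuple are assumptions made for illustration, not the API of any GPU library.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def map_primitive(fn: Callable[[T], U], rel: Sequence[T]) -> List[U]:
    """Apply fn to every tuple independently."""
    return [fn(t) for t in rel]


def scatter(rel: Sequence[T], L: Sequence[int]) -> List[T]:
    """Write rel[i] to output position L[i] (L is assumed to be a permutation)."""
    out: List[T] = [None] * len(rel)  # type: ignore[list-item]
    for i, t in enumerate(rel):
        out[L[i]] = t
    return out


def gather(rel: Sequence[T], L: Sequence[int]) -> List[T]:
    """Read output position i from rel[L[i]]; undoes scatter for the same L."""
    return [rel[L[i]] for i in range(len(L))]


def split(rel: Sequence[T], part_fn: Callable[[T], int], num_parts: int) -> List[List[T]]:
    """Divide rel into num_parts disjoint partitions chosen by part_fn."""
    parts: List[List[T]] = [[] for _ in range(num_parts)]
    for t in rel:
        parts[part_fn(t)].append(t)
    return parts
```

With these definitions, `gather(scatter(rel, L), L)` returns the original relation whenever L is a permutation, which is the sense in which gather reverses scatter.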
![Page 87: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/87.jpg)
NINLJ
[Figure: relations R and S, with thread groups 1, ..., i, ..., j, ..., Bp]
![Page 88: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/88.jpg)
INLJ
• Uses Cache-Sensitive Search Trees (CSS-trees) as the index structure
• The inner relation is stored as the CSS-tree
• Multiple keys are searched in parallel on the tree
![Page 89: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/89.jpg)
Sort Merge join
• Merge step is done in parallel
• 3 steps
– Divide relation S into Q chunks, where Q = ||S|| / M
– Find the corresponding matching chunks from R by using the start and end of each chunk of S
– Merge each pair of S and R chunks in parallel, 1 thread group per pair
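The three steps can be sketched sequentially as follows. This is a minimal illustration, assuming relations are lists of (key, payload) tuples and that M is the chunk size in tuples; the loop over chunks stands in for the thread groups that would handle the pairs in parallel, and all function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def merge_pair(s_chunk, r_chunk):
    """Merge-join two key-sorted lists of (key, payload) tuples."""
    out = []
    j = 0
    for s_key, s_val in s_chunk:
        # Skip R tuples whose key is smaller than the current S key.
        while j < len(r_chunk) and r_chunk[j][0] < s_key:
            j += 1
        # Emit every R tuple sharing the key; j stays put so that a
        # repeated S key rescans the same run of R tuples.
        k = j
        while k < len(r_chunk) and r_chunk[k][0] == s_key:
            out.append((s_key, r_chunk[k][1], s_val))
            k += 1
    return out


def sort_merge_join(r, s, m):
    """Join r and s on key, processing s in chunks of m tuples."""
    r = sorted(r, key=lambda t: t[0])
    s = sorted(s, key=lambda t: t[0])
    r_keys = [key for key, _ in r]
    result = []
    # Step 1: divide S into chunks of m tuples.
    for start in range(0, len(s), m):
        s_chunk = s[start:start + m]
        # Step 2: locate the matching R chunk from the first and last S key.
        lo = bisect_left(r_keys, s_chunk[0][0])
        hi = bisect_right(r_keys, s_chunk[-1][0])
        # Step 3: merge the pair (one thread group per pair on a GPU).
        result.extend(merge_pair(s_chunk, r[lo:hi]))
    return result
```

Because every S tuple belongs to exactly one chunk and each R chunk covers the full key range of its S chunk, each matching pair is produced exactly once even when a key repeats across a chunk boundary.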
![Page 90: Fast Computation of Database Operations using Graphics Processors Naga K. Govindaraju Univ. of North Carolina Modified By, Mahendra Chavan forCS632.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649eb45503460f94bbcb95/html5/thumbnails/90.jpg)
Hash join
• Partitioning
– Use the Split primitive to partition both relations
• Matching
– Read the inner relation into memory
– Each tuple from the outer relation uses sequential/binary search on the inner relation
– For binary search, the inner relation will be sorted using bitonic sort
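The partitioning and matching phases can be sketched as below. This is a sequential illustration under stated assumptions: relations are lists of (key, payload) tuples, the partitioning function is `hash(key) % num_parts`, and Python's built-in `sorted` stands in for the bitonic sort named on the slide. The function names are hypothetical.

```python
from bisect import bisect_left


def split_by_hash(rel, num_parts):
    """Partition a relation of (key, payload) tuples by hash of the key."""
    parts = [[] for _ in range(num_parts)]
    for key, val in rel:
        parts[hash(key) % num_parts].append((key, val))
    return parts


def hash_join(inner, outer, num_parts=4):
    """Join inner and outer on key, one partition pair at a time."""
    result = []
    inner_parts = split_by_hash(inner, num_parts)
    outer_parts = split_by_hash(outer, num_parts)
    for inner_part, outer_part in zip(inner_parts, outer_parts):
        # Stand-in for bitonic sort: sort the inner partition by key.
        inner_sorted = sorted(inner_part, key=lambda t: t[0])
        keys = [key for key, _ in inner_sorted]
        # Each outer tuple binary-searches the sorted inner partition.
        for key, o_val in outer_part:
            i = bisect_left(keys, key)
            while i < len(keys) and keys[i] == key:
                result.append((key, inner_sorted[i][1], o_val))
                i += 1
    return result
```

Tuples with equal keys always land in the same partition pair, so matching only needs to look inside one inner partition per outer tuple.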