Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01...

29
Herodotos Herodotou Nedyalko Borisov Shivnath Babu Duke University

Transcript of Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01...

Page 1: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Herodotos Herodotou

Nedyalko Borisov

Shivnath Babu

Duke University

Page 2: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Table Partitioning

Split parent table into smaller child tables (partitions)

Partitioning methods: hash, range, list

Benefits

Improve query performance

Faster data loading, archival, backup

Efficient statistics maintenance

Better cardinality estimation

Fine-grained control for tuning

. . .

06/14/2011 Duke University 2

id date . . .

Sales

id date . . .

Sales_Jan_11

id date . . .

Sales_Feb_11

id date . . .

Sales_Mar_11

Page 3: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Recent Trends Growing usage due to increased data sizes

Growing user control due to new SQL extensions

Partitioning conditions for derived tables

DBA must satisfy multiple objectives and constraints regarding partitioning

Implications

DBA may have limited control over partitioning scheme

Diverse mix of partitioning schemes

06/14/2011 Duke University 3

Page 4: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Partitioning SchemesMultidimensional (Hierarchical) partitioning Tables partitioned on same key but different ranges Range partitioning with non-equi ranges

06/14/2011 Duke University 4

S11 S12 S13

S21 S22 S23

S31 S32 S33

S1 S2 S3

S.d

Sales S

S11 S12 S13

S21 S22 S23

S31 S32 S33

S41 S42 S43

S.id

S.d

20

0

40

60

80

L1

L2

L3

L4

20

0

40

60

80

L.idLoans L

P1

P3

P5

P7

P2

P4

P6

P8

20

0

40

60

80

30

10

50

70

P.idPayments P

A1 A2 A3 A4 A5

A.dAds A

Page 5: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Partition-aware Query Optimization

06/14/2011 Duke University 5

S11 S12 S13

S21 S22 S23

S31 S32 S33

S1 S2 S3

S.dSales S

S11 S12 S13

S21 S22 S23

S31 S32 S33

S41 S42 S43

S.id

S.d

20

0

40

60

80

P1

P3

P5

P7

P2

P4

P6

P8

20

0

40

60

80

30

10

50

70

P.idPayments P

P5

P7

P4

P6

P8

P1

P3

P2

S11

S21

S31

S41

S12

S22

S32

S42

S13

S23

S33

S43

S32 S33S32 S33

S42 S43

S12 S13

S22 S23

SELECT *

FROM Sales S, Payments P

WHERE S.id = P.id

AND S.d > Feb-15 AND P.id < 25

Page 6: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Partition-aware Query Optimization

06/14/2011 Duke University 6

S12

S22

S.idS.d

20

0

40

P1

P3

P2 20

0

30

10

P.idPayments PSales S

S13

S23

SELECT *

FROM Sales S, Payments P

WHERE S.id = P.id

AND S.d > Feb-15

AND P.id < 25

HJ

Union

TS(P1)TS(P2)TS(P3)

Union

TS(S22)TS(S23)

TS(S12)TS(S13)

P1: “join of unions”

MJ

IS(P3)Union

TS(S22)

TS(S23)

Union

HJ

Union

Union

TS(P1)

TS(P2)

TS(S12)

TS(S13)

P2: “union of (partition-wise) joins”

Page 7: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Partition-aware Query Optimization

More efficient partition-wise joins

More appropriate join orders

More appropriate join operators

06/14/2011 Duke University 7

MJ

IS(P3)Union

TS(S22)

TS(S23)

Union

HJ

Union

Union

TS(P1)

TS(P2)

TS(S12)

TS(S13)

TS(L1)

TS(L2)

Union

HJ

LegendS = SalesP = PaymentsL = Loans

S ⋈ P ⋈ L

IS(P3)MJ TS(L1)

TS(L2)Union

MJ

TS(S22)

TS(S23)

Union

HJ

INLJ

Union

Union

TS(P1)

TS(P2)

TS(S12)

TS(S13)

S ⋈ P ⋈ L

Page 8: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Problem & Challenges Problem Definition

Given a partitioning scheme and a query

find the optimal query execution plan

Challenges

Dealing with plan space explosion

Incorporating into state-of-the-art optimizers

Partitions as physical or logical properties?

Supporting a wide range of partitioning conditions

06/14/2011 Duke University 8

Page 9: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Overview

06/14/2011 Duke University 9

P1

P2

P3

P

S12S13

S22

S

S23

S.id = P.id

P1

P2

P3

S12

S13

S22

S23

1. Matching

P1

P2

P3

S12

S13

S22

S23

2. Clustering

3. Path Selection

TS(S22)TS(S23)

Union TS(P3)

HJ

TS(S22)TS(S23)

UnionTS(P1)TS(P2)

Union

MJ

Page 10: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Matching Phase Goal

Identify partition-wise join pairs that can generate output records

New data structure: Partition Index Tree (PIT)

Core idea: associate each partition with intervals

Functionalities: index intervals, efficient lookups

Implementation: Augmented red-black tree

06/14/2011 Duke University 10

Page 11: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Matching AlgorithmInputs: Table S, Table P, Join condition J

Step 1: Convert partitioning conditions to intervals

Step 2: Build PIT with intervals of S

Step 3: Probe PIT with intervals of P

Output: Partition-wise join pairs (Si, Pj)

06/14/2011 Duke University 11

Partition Interval MaxNotation:

S22 [20, 40) 40

S13 [0, 20) 40 S23 [20, 40) 40

S12 [0, 20) 40

Partition Interval MaxNotation:

S22 [20, 40) 40

S13 [0, 20) 40 S23 [20, 40) 40

S12 [0, 20) 40

P2 [10, 20)Probe

Output: Partition Join Pairs(S12, P1)(S12, P2)(S13, P1)(S13, P2)(S22, P3)(S23, P3)

Page 12: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Matching Algorithm

06/14/2011 Duke University 12

Benefits

O(n log(n)), n = number of partitions of S

Θ(n) memory needs

Build & reuse PITs multiple times

Additional support

Numeric, dates, and string ranges or lists

Complex partitioning conditions (AND, OR)

Complex join conditions (AND, OR)

Non-equi joins

Details in the paper

Page 13: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Goal

Minimize number of partition-wise join pairs

Clustering Phase

06/14/2011 Duke University 13

Input: Partition Join Pairs

(S12, P1)(S12, P2)(S13, P1)(S13, P2)(S22, P3)(S23, P3)

Output: Clustered Join Pairs

({S12,S13},{P1,P2})({S22,S23},{P3})

Intermediate:Join Partition Graph

S12 P1

P2S13

S12P3

S13

Input: Partition join pairs (output from Matching Phase)Step 1: Build bipartite join partition graphStep 2: Find connected components using Breadth-First-SearchOutput: Clustered join pairs

Page 14: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Path Creation and Selection Goal

Create and cost partition-wise join paths for child tables

06/14/2011 Duke University 14

IS(P3)MJ TS(L1)

TS(L2)Union

MJ

TS(S22)

TS(S23)

Union

HJ

INLJ

Union

Union

TS(P1)

TS(P2)

TS(S12)

TS(S13)

S ⋈ P ⋈ L

Page 15: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Bottom-up Query Optimization

06/14/2011 Duke University 15

LSP

Find & retain best 3-way join paths per interesting order

LS LP SP

Find & retain best 2-way join paths per interesting order

PSL

Find & retain best access paths

Page 16: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Bottom-up Query Optimization

06/14/2011 Duke University 16

SP

HJ

UnionTS(P1)TS(P2)TS(P3)

Union

TS(S22)TS(S23)

TS(S12)TS(S13)

MJ

UnionIS(P1)IS(P2)IS(P3)

Union

TS(S22)TS(S23)

TS(S12)TS(S13)

Logical Relation (Join)

Best (physical) join path

Best (physical) join path with interesting order

Page 17: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Original enumeration

Create and cost join of unions

Extended enumeration

Create and cost union of joins

Retain best path per interesting order

Not enough!

Not considering entire

plan space (e.g.,

if Pj is best, no 3-way

partition-wise joins)

Extended Enumeration

06/14/2011 Duke University 17

HJ

UnionTS(P1)TS(P2)TS(P3)

Union

TS(S22)TS(S23)

TS(S12)TS(S13)

Pj

MJ

TS(P3)Union

TS(S22)

TS(S23)

Union

HJ

Union

Union

TS(P1)

TS(P2)

TS(S12)

TS(S13)

Pu

Page 18: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Treating Partitions as Physical Properties

Interesting partitions in joins

Can make later joins less expensive

Approach

Retain best path for each interesting order

Retain best path for each interesting partition

Limitation

Not considering entire plan space (e.g., cannot create union of joins with different join orders)

06/14/2011 Duke University 18

Page 19: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Treating Partitions as Logical Properties

06/14/2011 Duke University 19

Logical child joins

Treated like logical parent joins

Retain best join paths per interesting order

LSPL1S12S13P1P2

L2S22S23P3

PP1

P2

P3

SS12S13S22S23

LL1

L2

LSL1S12S13

L2S22S23

LPL1P1P2

L2P3

SPS12S13P1P2

S22S23P3

Logical parent join

Logical child joins

Page 20: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Treating Partitions as Logical Properties

Property 1

Interesting orders independent across child joins

Property 2

Child joins can have different join orders/operators

Property 3

Entire extended plan space is enumerated

Optimality guarantee

Our bottom-up optimizer will find the optimal plan in the extended plan space

06/14/2011 Duke University 20

Page 21: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Experimental Evaluation Prototype using PostgreSQL 8.3.7

TPC-H benchmark (scale 30)

Evaluation Methodology

DBA has full/limited control over partitioning scheme

State-of-the-art Vs. Our partition-aware optimizer

Optimizer evaluation metrics

Execution time

Optimization time

Memory utilization

06/14/2011 Duke University 21

Page 22: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Evaluation: Execution Time

06/14/2011 Duke University 22

0

20

40

60

80

100

120

140

2 3 4 5 7 8 9 10 12 14

Exe

cuti

on

Tim

e (

min

)

Query

Basic AdvancedState-of-the-art Partition aware

Page 23: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Evaluation: Optimization Time

06/14/2011 Duke University 23

0

50

100

150

200

250

300

350

2 3 4 5 7 8 9 10 12 14

Op

tim

izat

ion

Tim

e (

ms)

Query

Basic/Intermediate Advanced

Page 24: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Evaluation: Memory Utilization

06/14/2011 Duke University 24

0

10

20

30

40

2 3 4 5 7 8 9 10 12 14

Me

mo

ry U

sage

(M

B)

Query

Basic AdvancedPostgreSQL Partition-aware

Page 25: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Summary Extended plan space to include plans with multiway

partition-wise joins

Developed new partition-aware optimization techniques

Easy incorporation into bottom-up query optimizers

06/14/2011 Duke University 25

Page 26: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Extensions to Parallel Databases Data placement strategy

Hash partitioning to nodes

Range/list partitioning within each node

Our extensions

Create partition-wise joins Si ⋈ Pi for each node Ni

Produce child joins for Si ⋈ Pi

06/14/2011 Duke University 26

Data placement strategy

Replicate dimension tables

Partition fact tablesaaaa

Our extensions

Further partition all tables

Produce child joins on each node

Page 27: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Evaluation: Vary Partition Size

06/14/2011 Duke University 27

0

50

100

150

200

250

300

350

64 96 128 192 256 64 96 128 192 256

Op

tim

izat

ion

Tim

e (

ms)

Partition Size (MB)

Page 28: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Evaluation: Vary Data Size

06/14/2011 Duke University 28

Page 29: Query Optimization Techniques for Partitioned Tables06/14/2011 Duke University 29 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+12 1.E+13

Cardinality Estimation

06/14/2011 Duke University 29

1.E+001.E+011.E+021.E+031.E+041.E+051.E+061.E+071.E+081.E+091.E+101.E+111.E+121.E+131.E+141.E+15

Q02 Q03 Q04 Q05 Q07 Q08 Q09 Q10 Q12 Q14

Esti

mat

ed

Nu

mb

er

of

Re

cord

s

Query

Basic/Intermediate Advanced Actual