CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.
-
Upload
brenda-byrd -
Category
Documents
-
view
220 -
download
0
Transcript of CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.
![Page 1: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/1.jpg)
CPS216: Advanced Database Systems
Notes 03:Query Processing (Overview, contd.)
Shivnath Babu
![Page 2: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/2.jpg)
parse
Query rewriting
Physical plan generation
execute
result
SQL query
parse tree
logical query planstatistics
physical query plan
QueryOptimization
Query Execution
Overview of Query
Processing
![Page 3: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/3.jpg)
parse
Query rewriting
Physical plan generation
execute
result
SQL query
parse tree
logical query planstatistics
physical query plan
Initial logical plan
“Best” logical plan
Logical plan
Rewrite rules
![Page 4: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/4.jpg)
Query Rewriting
B,D
R.A = “c” Λ R.C = S.C
X
R S
B,D
R.A = “c”
X
R S
R.C = S.C
We will revisit it towards the end of this lecture
![Page 5: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/5.jpg)
parse
Query rewriting
Physical plan generation
execute
result
SQL query
parse tree
Best logical query planstatistics
Best physical query plan
![Page 6: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/6.jpg)
Physical Plan Generation
B,D
R.A = “c”
R
S
Natural join
Best logical planR S
Index scan Table scan
Hash join
Project
![Page 7: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/7.jpg)
parse
Query rewriting
Physical plan generation
execute
result
SQL query
parse tree
Best logical query planstatistics
Best physical query plan
Enumerate possible physical plans
Find the cost of each plan
Pick plan with minimum cost
![Page 8: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/8.jpg)
Physical Plan Generation
Logical Query Plan
P1 P2 …. Pn
C1 C2 …. Cn
Pick minimum cost one
Physical plans
Costs
![Page 9: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/9.jpg)
Plans for Query Execution• Roadmap
– Path of a SQL query– Operator trees– Physical Vs Logical plans– Plumbing: Materialization Vs pipelining
![Page 10: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/10.jpg)
Modern DBMS Architecture
Disk(s)
Applications
OS
Parser
Query Optimizer
Query Executor
Storage Manager
Logical query plan
Physical query plan
Access method API calls
SQL
File system API callsStorage system API calls
DBMS
![Page 11: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/11.jpg)
Logical Plans Vs. Physical Plans
B,D
R.A = “c”
R
S
Natural join
Best logical planR S
Index scan Table scan
Hash join
Project
![Page 12: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/12.jpg)
B,D
R.A = “c”
R
S
Operator Plumbing
• Materialization: output of one operator written to disk, next operator reads from the disk
• Pipelining: output of one operator directly fed to next operator
![Page 13: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/13.jpg)
B,D
R.A = “c”
R
S
Materialization
Materialized here
![Page 14: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/14.jpg)
B,D
R.A = “c”
R
S
Iterators: Pipelining
Each operator supports:• Open()• GetNext()• Close()
![Page 15: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/15.jpg)
Iterator for Table Scan (R)
Open() { /** initialize variables */ b = first block of R; t = first tuple in block b;}
GetNext() { IF (t is past last tuple in block b) { set b to next block; IF (there is no next block) /** no more tuples */ RETURN EOT; ELSE t = first tuple in b; } /** return current tuple */ oldt = t; set t to next tuple in block b; RETURN oldt;}
Close() { /** nothing to be done */}
![Page 16: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/16.jpg)
Iterator for Select
Open() { /** initialize child */ Child.Open();}
GetNext() { LOOP: t = Child.GetNext(); IF (t == EOT) { /** no more tuples */ RETURN EOT; } ELSE IF (t.A == “c”) RETURN t; ENDLOOP:}
Close() { /** inform child */ Child.Close();}
R.A = “c”
![Page 17: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/17.jpg)
Iterator for Sort
Open() { /** Bulk of the work is here */ Child.Open(); Read all tuples from Child and sort them}
GetNext() { IF (more tuples) RETURN next tuple in order; ELSE RETURN EOT;}
Close() { /** inform child */ Child.Close();}
R.A
![Page 18: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/18.jpg)
• TNLJ (conceptually)
for each r Lexp do
for each s Rexp do
if Lexp.C = Rexp.C, output r,s
Iterator for Tuple Nested Loop Join
Lexp Rexp
![Page 19: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/19.jpg)
Example 1: Left-Deep Plan
R1(A,B)
TableScan
R2(B,C)
TableScan
R3(C,D)
TableScan
TNLJ
TNLJ
Question: What is the sequence of getNext() calls?
![Page 20: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/20.jpg)
Example 2: Right-Deep Plan
R3(C,D)
TableScan
TNLJ
R1(A,B)
TableScan
R2(B,C)
TableScan
TNLJ
Question: What is the sequence of getNext() calls?
![Page 21: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/21.jpg)
Example
Worked on blackboard
![Page 22: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/22.jpg)
Cost Measure for a Physical Plan
• There are many cost measures– Time to completion– Number of I/Os (we will see a lot of this)– Number of getNext() calls
• Tradeoff: Simplicity of estimation Vs. Accurate estimation of performance as seen by user
![Page 23: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/23.jpg)
Textbook outline
Chapter 15
15.1 Physical operators- Scan, Sort (Ch. 11.4), Indexes (Ch. 13)
15.2-15.6 Implementing operators +
estimating their cost
15.8 Buffer Management
15.9 Parallel Processing
![Page 24: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/24.jpg)
Chapter 16
16.1 Parsing
16.2 Algebraic laws
16.3 Parse tree logical query plan
16.4 Estimating result sizes
16.5-16.7 Cost based optimization
Textbook outline (contd.)
![Page 25: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/25.jpg)
Chapter 5 Relational Algebra
Chapter 6 SQL
Background Material
![Page 26: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/26.jpg)
parse
Query rewriting
Physical plan generation
execute
result
SQL query
parse tree
logical query planstatistics
physical query plan
Query Processing - In class order
2; 16.1
3; 16.2,16.3
1; 13, 15
4; 16.4—16.7
First: A quick look at this
![Page 27: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/27.jpg)
Why do we need Query Rewriting?
• Pruning the HUGE space of physical plans– Eliminating redundant conditions/operators– Rules that will improve performance with very
high probability
• Preprocessing– Getting queries into a form that we know how
to handle best
Reduces optimization time drastically without noticeably affecting quality
![Page 28: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/28.jpg)
Some Query Rewrite Rules
• Transform one logical plan into another– Do not use statistics
• Equivalences in relational algebra
• Push-down predicates
• Do projects early
• Avoid cross-products if possible
![Page 29: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/29.jpg)
Equivalences in Relational Algebra
R S = S R Commutativity
(R S) T = R (S T) Associativity
Also holds for: Cross Products, Union, Intersection
R x S = S x R
(R x S) x T = R x (S x T)
R U S = S U R
R U (S U T) = (R U S) U T
![Page 30: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/30.jpg)
Apply Rewrite Rule (1)
B,D [ R.C=S.C [R.A=“c”(R X S)]]
B,D
R.A = “c” Λ R.C = S.C
X
R S
B,D
R.A = “c”
X
R S
R.C = S.C
![Page 31: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/31.jpg)
Rules: Project
Let: X = set of attributes
Y = set of attributes
XY = X U Y
xy (R) = x [y (R)]
![Page 32: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/32.jpg)
Let p = predicate with only R attribs
q = predicate with only S attribs
m = predicate with only R,S attribs
p (R S) =
q (R S) =
Rules: combined
[p (R)] S
R [q (S)]
![Page 33: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/33.jpg)
Apply Rewrite Rule (2)
B,D [ R.C=S.C [R.A=“c”(R)] X S]
B,D
R.A = “c”
X
R
S
R.C = S.C
B,D
R.A = “c”
X
R S
R.C = S.C
![Page 34: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/34.jpg)
Apply Rewrite Rule (3)
B,D [[R.A=“c”(R)] S]
B,D
R.A = “c”
R
S
B,D
R.A = “c”
X
R
S
R.C = S.CNatural join
![Page 35: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/35.jpg)
Rules: combined (continued)
pq (R S) = [p (R)] [q (S)]
pqm (R S) =
m [(p R) (q S)]pvq (R S) =
[(p R) S] U [R (q S)]
![Page 36: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/36.jpg)
p1p2 (R) p1 [p2 (R)]
p (R S) [p (R)] S
R S S R
x [p (R)] x {p [xz (R)]}
Which are “good” transformations?
![Page 37: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/37.jpg)
Conventional wisdom: do projects early
Example: R(A,B,C,D,E) P: (A=3) (B=“cat”)
E {p (R)} vs. E {p{ABE(R)}}
![Page 38: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/38.jpg)
But: What if we have A, B indexes?
B = “cat” A=3
Intersect pointers to get
pointers to matching tuples
![Page 39: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/39.jpg)
Bottom line:
• No transformation is always good
• Some are usually good: – Push selections down– Avoid cross-products if possible– Subqueries Joins
![Page 40: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/40.jpg)
Avoid Cross Products (if possible)
• Which join trees avoid cross-products?• If you can't avoid cross products, perform
them as late as possible
Select B,DFrom R,S,T,UWhere R.A = S.B R.C=T.C R.D = U.D
![Page 41: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/41.jpg)
More Query Rewrite Rules
• Transform one logical plan into another– Do not use statistics
• Equivalences in relational algebra• Push-down predicates• Do projects early• Avoid cross-products if possible• Use left-deep trees• Subqueries Joins • Use of constraints, e.g., uniqueness
![Page 42: CPS216: Advanced Database Systems Notes 03:Query Processing (Overview, contd.) Shivnath Babu.](https://reader038.fdocuments.net/reader038/viewer/2022110400/56649daf5503460f94a9d5e5/html5/thumbnails/42.jpg)
Announcements• First homework will be posted next week
– I will send an email announcement when it is posted
• We will start discussing past projects and project ideas from next week– Your first goal is to have a concretely defined project by
the last week of September (the earlier the better!)– Small class single student projects– October and November should be used to work on the
projects– Final “demo’s” will be in the first week of December