8/13/2019 Query Optimization From KCD
Query Optimization
Dr. Karen C. Davis
Professor
School of Electronic and Computing Systems
School of Computing Sciences and Informatics
Outline
overview of relational query optimization
logical optimization
  algebraic equivalences
  transformation of trees
physical optimization
  selection algorithms
  join algorithms
  cost-based optimization
research example using relational algebra
Relational Query Optimization
query optimizer: logical optimization, then physical optimization
SQL query -> relational algebra query tree -> access plan (executable)
Learning Outcomes
translate basic SQL to RA query tree
perform heuristic optimizations to tree
use cost-based optimization to select algorithms for tree operators to generate an execution plan
SQL is declarative
describes what data, not how to retrieve it
select distinct ...
from ...
where ...
helpful for users, not necessarily good for efficient execution
Relational Algebra is procedural
specifies operators and the order of evaluation
steps for query evaluation:
1. translate SQL to RA operators (query tree)
2. perform heuristic optimizations:
a. push RA select operators down the tree
b. convert select and cross product to join
c. others based on algebraic transformations
Relational Algebra Operators
name            symbolically    evaluation
select          σ_c(R)          applies condition c to R
project         π_l(R)          keeps a list (l) of attributes of R
cross product   R X S           all possible combinations of tuples of R are appended with tuples from S
join            R ⋈_c S         π_l(σ_c(R X S)), where l is a list of attributes of R and S with duplicate columns removed and c is a join condition
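The four operators in the table can be sketched in a few lines of Python. This is only an illustration, not how a DBMS implements them: relations are assumed to be lists of dicts, attribute names are assumed to be distinct across relations (so the join does not need to remove duplicate columns), and the sample relations R and S are made up.

```python
from itertools import product

def select(cond, rel):
    """sigma_c(R): keep the tuples of rel that satisfy cond."""
    return [t for t in rel if cond(t)]

def project(attrs, rel):
    """pi_l(R): keep only the listed attributes, dropping duplicate tuples."""
    out = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out

def cross(r, s):
    """R X S: append every tuple of s to every tuple of r."""
    return [{**a, **b} for a, b in product(r, s)]

def join(cond, r, s):
    """Join written as its definition: a select over the cross product."""
    return select(cond, cross(r, s))

# Hypothetical sample relations.
R = [{"a": 1, "x": 5}, {"a": 2, "x": 20}]
S = [{"z": 2, "w": "p"}, {"z": 3, "w": "q"}]

joined = join(lambda t: t["a"] == t["z"], R, S)
print(joined)
```

Writing the join as a select over a cross product mirrors the definition in the table; real join algorithms avoid building the full cross product.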
SQL to RA
select distinct l
from x
where c

two relations:
    π_l
     |
    σ_c
     |
     X
    / \
   R   S

three relations:
      π_l
       |
      σ_c
       |
       X
      / \
     X   S
    / \
   R   T

four relations:
        π_l
         |
        σ_c
         |
         X
        / \
       X   S
      / \
     X   T
    / \
   R   U
SQL to RA Tree Example
select A.x, A.y, B.z
from A, B
where A.a = B.z and A.x > 10
    π_{A.x, A.y, B.z}
           |
    σ_{A.a = B.z and A.x > 10}
           |
           X
          / \
         A   B

evaluated bottom-up, left to right; intermediate values are passed up the tree to the next operator
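Bottom-up evaluation of this tree can be sketched as a recursive function. The nested-tuple tree encoding, the `evaluate` helper, and the sample contents of A and B are assumptions made for illustration only.

```python
from itertools import product

def evaluate(node, db):
    """Evaluate a query tree bottom-up; each result is passed up to the parent operator."""
    op = node[0]
    if op == "rel":                       # leaf: a stored relation
        return db[node[1]]
    if op == "cross":
        left = evaluate(node[1], db)      # left subtree is evaluated first
        right = evaluate(node[2], db)
        return [{**a, **b} for a, b in product(left, right)]
    if op == "select":
        return [t for t in evaluate(node[2], db) if node[1](t)]
    if op == "project":
        rows = []
        for t in evaluate(node[2], db):
            row = {k: t[k] for k in node[1]}
            if row not in rows:
                rows.append(row)
        return rows
    raise ValueError(f"unknown operator: {op}")

# Hypothetical contents for relations A and B.
db = {
    "A": [{"A.a": 1, "A.x": 11, "A.y": 7}, {"A.a": 2, "A.x": 5, "A.y": 8}],
    "B": [{"B.z": 1}, {"B.z": 2}],
}

# project over select over cross product, as in the tree above.
tree = ("project", ["A.x", "A.y", "B.z"],
        ("select", lambda t: t["A.a"] == t["B.z"] and t["A.x"] > 10,
         ("cross", ("rel", "A"), ("rel", "B"))))

result = evaluate(tree, db)
print(result)
```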
SQL to RA Tree Example
select lname
from employee, works_on, projects
where pname = 'Aquarius' and
      pnumber = pno and
      essn = ssn and
      bdate = '1985-12-03'

    π_{lname}
        |
    σ_{pname = 'Aquarius' and pnumber = pno and essn = ssn and bdate = '1985-12-03'}
        |
        X
       / \
      X   projects
     / \
employee works_on
Simple Heuristic Optimization
1. cascade selects (split them up)
before:
    π_l
     |
    σ_{c1 and c2 and c3}
     |
     X
    / \
   R   S

after:
    π_l
     |
    σ_c1
     |
    σ_c2
     |
    σ_c3
     |
     X
    / \
   R   S
2. Push any single attribute selects down the
tree to be just above their relation
before:
    π_l
     |
    σ_c1
     |
    σ_c2
     |
    σ_c3
     |
     X
    / \
   R   S

after:
      π_l
       |
      σ_c2
       |
       X
      / \
  σ_c1   σ_c3
    |     |
    R     S
3. Convert 2-attribute select and cross product
to join
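before:
      π_l
       |
      σ_c2
       |
       X
      / \
  σ_c1   σ_c3
    |     |
    R     S

after:
      π_l
       |
      ⋈_c2
      / \
  σ_c1   σ_c3
    |     |
    R     S

selects pushed down: smaller intermediate results
join in place of select and cross product: efficient join algorithms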
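The effect of the three heuristic steps on intermediate result sizes can be illustrated with a small experiment. The relations R and S, their attributes, and the conditions c1, c2, c3 below are hypothetical; the point is only that both plans return the same answer while the optimized plan handles far fewer tuples.

```python
from itertools import product

# Hypothetical data: R(a, x) and S(z, w), 100 tuples each.
R = [{"a": i, "x": i * 3} for i in range(100)]
S = [{"z": i, "w": i % 5} for i in range(100)]

c1 = lambda t: t["x"] > 270        # single-attribute condition on R
c3 = lambda t: t["w"] == 0         # single-attribute condition on S
c2 = lambda t: t["a"] == t["z"]    # two-attribute condition linking R and S

def cross(r, s):
    return [{**a, **b} for a, b in product(r, s)]

# Unoptimized plan: one select over the full cross product.
full = cross(R, S)
naive = [t for t in full if c1(t) and c2(t) and c3(t)]

# Optimized plan: push c1 and c3 down to their relations,
# then apply c2 as the join condition on the reduced inputs.
r_small = [t for t in R if c1(t)]
s_small = [t for t in S if c3(t)]
pairs = cross(r_small, s_small)
optimized = [t for t in pairs if c2(t)]

print(len(full), len(pairs))   # intermediate sizes of the two plans
print(naive == optimized)      # same final answer
```

The optimized plan here still pairs up the reduced inputs exhaustively; a dedicated join algorithm would avoid even that.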
Practice problem: optimize RA tree
select P.pnumber, P.dnum, E.lname, E.bdate
from projects P, department D, employee E
where D.dnumber = P.dnum and      -- c1
      D.mgrssn = E.ssn and        -- c2
      P.plocation = 'Stafford';   -- c3
RA tree to RA expression
      π_l
       |
      ⋈_c2
      / \
  σ_c1   σ_c3
    |     |
    R     S

π_l(σ_c1(R) ⋈_c2 σ_c3(S))
Other Operators in Relational Algebra
SQL:
(select pnumber from projects, department, employee
 where dnum = dnumber and mgrssn = ssn
 and lname = 'Smith')
union
(select pnumber from projects, works_on, employee
 where pnumber = pno and essn = ssn and lname = 'Smith');
RA:
π_{pnumber}(σ_{lname = 'Smith'}(employee) ⋈_{ssn = mgrssn} department ⋈_{dnumber = dnum} projects)
∪
π_{pnumber}(σ_{lname = 'Smith'}(employee) ⋈_{ssn = essn} works_on ⋈_{pnumber = pno} projects)
Selection Algorithms
linear search
binary search
primary index or hash for point query
primary index for range query
clustering index
secondary index
conjunctives
  individual index
  composite index or hash
  intersection of record pointers for multiple indexes
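Several of these selection strategies can be sketched over an in-memory list standing in for a file. The record layout, the `dept_index` dictionary playing the role of a secondary index, and all function names are assumptions for illustration; a real system works on disk blocks, not Python lists.

```python
from bisect import bisect_left, bisect_right

# Hypothetical file of records, kept sorted on the key attribute "id".
records = [{"id": i, "dept": i % 4} for i in range(0, 50, 2)]
keys = [r["id"] for r in records]

def linear_search(recs, pred):
    """Scan every record; works for any condition."""
    return [r for r in recs if pred(r)]

def binary_search(key):
    """Point query on the ordering attribute of a sorted file."""
    i = bisect_left(keys, key)
    return records[i] if i < len(keys) and keys[i] == key else None

def range_query(lo, hi):
    """Range query lo <= id <= hi using the sort order."""
    return records[bisect_left(keys, lo):bisect_right(keys, hi)]

# Stand-in for a secondary index on "dept": value -> set of record positions.
dept_index = {}
for pos, r in enumerate(records):
    dept_index.setdefault(r["dept"], set()).add(pos)

def conjunctive(dept, lo, hi):
    """dept = ... and lo <= id <= hi, via intersection of record pointers."""
    by_range = set(range(bisect_left(keys, lo), bisect_right(keys, hi)))
    ptrs = dept_index.get(dept, set()) & by_range
    return [records[p] for p in sorted(ptrs)]

print(binary_search(10))
print([r["id"] for r in range_query(10, 16)])
print([r["id"] for r in conjunctive(0, 10, 20)])
```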
Join Algorithms
nested loop join
single-scan join
sort-merge join
hash join
http://docs.oracle.com/cd/E13085_01/doc/timesten.1121/e14261/query.htm
[figure: example execution plan, sort-merge using indexes]
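Three of the join algorithms listed above can be sketched for an equijoin on one attribute from each input. The function signatures and the small employee and works_on samples are assumptions for illustration; production implementations work block by block and handle inputs that do not fit in memory.

```python
def nested_loop_join(r, s, ra, sa):
    """Compare every tuple of r with every tuple of s."""
    return [{**a, **b} for a in r for b in s if a[ra] == b[sa]]

def hash_join(r, s, ra, sa):
    """Build a hash table on r, then probe it with each tuple of s."""
    table = {}
    for a in r:
        table.setdefault(a[ra], []).append(a)
    return [{**a, **b} for b in s for a in table.get(b[sa], [])]

def sort_merge_join(r, s, ra, sa):
    """Sort both inputs on the join attribute, then merge matching groups."""
    r = sorted(r, key=lambda t: t[ra])
    s = sorted(s, key=lambda t: t[sa])
    out, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        if r[i][ra] < s[j][sa]:
            i += 1
        elif r[i][ra] > s[j][sa]:
            j += 1
        else:
            # Find the run of equal keys on each side and pair them up.
            key = r[i][ra]
            i_end = i
            while i_end < len(r) and r[i_end][ra] == key:
                i_end += 1
            j_end = j
            while j_end < len(s) and s[j_end][sa] == key:
                j_end += 1
            out += [{**a, **b} for a in r[i:i_end] for b in s[j:j_end]]
            i, j = i_end, j_end
    return out

# Hypothetical sample data.
employee = [{"ssn": 1, "lname": "Smith"}, {"ssn": 2, "lname": "Wong"}]
works_on = [{"essn": 1, "pno": 10}, {"essn": 1, "pno": 20}, {"essn": 3, "pno": 30}]

by_pno = lambda t: t["pno"]
nl = sorted(nested_loop_join(employee, works_on, "ssn", "essn"), key=by_pno)
hj = sorted(hash_join(employee, works_on, "ssn", "essn"), key=by_pno)
sm = sorted(sort_merge_join(employee, works_on, "ssn", "essn"), key=by_pno)
print(nl == hj == sm)
```

All three produce the same tuples; they differ in how many comparisons and passes over the inputs they need, which is what a cost-based optimizer weighs when choosing among them.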
Multiple View Processing Plan (MVPP)
view chromosome: 101100010100001
index chromosome: 1100110
Fitness: sum of query processing costs of individual queries using the views and indexes selected

[figure: MVPP graph for queries Q1, Q2, Q3 over base relations Customer (C), Orders (O), Lineitem (L), Nation (N), Part (P); node labels as shown on the slide:]
v1: name, address, phone, acctbal, nationkey, custkey, mktsegment
v2: orderkey, orderdate, custkey, shippriority
v3: partkey, orderkey, shipdate, extendedprice
v4: nationkey, name
v5: partkey, type
v6: custkey
v7: orderkey
v8: C.mktsegment = building and L.shipdate = 1995-03-15
v9: O.orderkey, O.shippriority
v10: nationkey
v11: O.orderdate = 1994-10-01
v12: C.custkey, C.name, C.acctbal, N.name, C.address, C.phone
v13: partkey
v14: L.shipdate = 1995-09-01
v15: P.type, L.extendedprice

thesis defense of Sirisha Machiraju: Space Allocation for Materialized Views and Indexes Using Genetic Algorithms, June 2002