Date posted: 19-Dec-2015
RELATIONAL ALGEBRA
MORE POINTERS FROM TUESDAY
RELATIONAL ALGEBRA - DIVISION
Queries that give the same result as division are not replacements for division
DIVISION EXAMPLE
Completed
Student   Task
Fred      Database1
Fred      Database2
Fred      Compiler1
Eugene    Database1
Eugene    Compiler1
Sara      Database1
Sara      Database2

DBProject
Task
Database1
Database2
We want to compute Completed / DBProject to find all students who completed every task listed in DBProject.
DIVISION EXAMPLE
Step 1) Project Completed onto its unique attributes (the attributes that do not appear in DBProject)
π_Student(Completed)

A
Student
Fred
Eugene
Sara
DIVISION EXAMPLE
Step 2) Perform Cartesian product with DBProject
π_Student(Completed) × DBProject

Student   Task
Fred      Database1
Fred      Database2
Eugene    Database1
Eugene    Database2
Sara      Database1
Sara      Database2
Every student is combined with every task.
DIVISION EXAMPLE
Step 3) Subtract Completed
(π_Student(Completed) × DBProject) - Completed

Student   Task
Eugene    Database2

This leaves the combinations that "could have" been, but weren't.
DIVISION EXAMPLE
Step 4) Project onto the unique attributes of Completed
π_Student((π_Student(Completed) × DBProject) - Completed)

Student
Eugene

These are all students who have not completed all assignments.
DIVISION EXAMPLE
Step 5) Subtract from π_Student(Completed)
π_Student(Completed) - π_Student((π_Student(Completed) × DBProject) - Completed)

Completed / DBProject =
Student
Fred
Sara
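The five steps above can be sketched in Python, treating each relation as a set of tuples. This is only an illustration under simple assumptions: the function name `divide` and the tuple layout (Student, Task) are choices made here, not part of the slides, and the sketch handles only a binary dividend and a unary divisor.

```python
# Relations as sets of tuples: Completed is (Student, Task), DBProject is (Task,).
completed = {
    ("Fred", "Database1"), ("Fred", "Database2"), ("Fred", "Compiler1"),
    ("Eugene", "Database1"), ("Eugene", "Compiler1"),
    ("Sara", "Database1"), ("Sara", "Database2"),
}
db_project = {("Database1",), ("Database2",)}


def divide(dividend, divisor):
    """Return dividend / divisor for a binary dividend and a unary divisor."""
    # Step 1: project the dividend onto its unique attribute (Student).
    students = {(s,) for (s, _task) in dividend}
    # Step 2: Cartesian product with the divisor.
    all_pairs = {(s, t) for (s,) in students for (t,) in divisor}
    # Step 3: subtract the dividend, leaving the missing combinations.
    missing = all_pairs - dividend
    # Step 4: project the missing combinations onto Student.
    disqualified = {(s,) for (s, _task) in missing}
    # Step 5: subtract the disqualified students from all students.
    return students - disqualified


print(sorted(divide(completed, db_project)))  # [('Fred',), ('Sara',)]
```

Each intermediate set corresponds to one of the intermediate tables shown in the steps, so the values can be printed and compared against them.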
EXAMPLE QUERIES
Find sailors who’ve reserved a red and a green boat.
Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (note that sid is a key for Sailors):
ρ(Tempred, π_sid(σ_{color='red'}(Boats) ⋈ Reserves))
ρ(Tempgreen, π_sid(σ_{color='green'}(Boats) ⋈ Reserves))
π_sname((Tempred ∩ Tempgreen) ⋈ Sailors)
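As a rough illustration of the red-and-green query, the same plan can be written with Python sets. The sample rows below are hypothetical and are not taken from the slides; the helper name `sids_reserving` is also an assumption of this sketch.

```python
# Hypothetical sample instances (not from the slides).
sailors = {(22, "Dustin"), (31, "Lubber"), (64, "Horatio")}  # (sid, sname)
boats = {(101, "red"), (102, "green"), (103, "red")}         # (bid, color)
reserves = {(22, 101), (22, 102), (31, 103), (64, 102)}      # (sid, bid)


def sids_reserving(color):
    """Select boats of one color, join with Reserves, project onto sid."""
    bids = {bid for (bid, c) in boats if c == color}
    return {sid for (sid, bid) in reserves if bid in bids}


temp_red = sids_reserving("red")      # sailors with a red reservation
temp_green = sids_reserving("green")  # sailors with a green reservation

# Intersect on sid (a key for Sailors), then join with Sailors for the names.
names = {sname for (sid, sname) in sailors if sid in temp_red & temp_green}
print(names)  # {'Dustin'}
```

Intersecting on sid rather than on sname matters because two different sailors could share a name.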
EXAMPLE QUERIES
Find the names of sailors who’ve reserved all boats.
Uses division; schemas of the input relations must be carefully chosen:
ρ(Tempsids, π_{sid,bid}(Reserves) / π_bid(Boats))
π_sname(Tempsids ⋈ Sailors)
To find sailors who’ve reserved all ‘Interlake’ boats:
... / π_bid(σ_{bname='Interlake'}(Boats))
The book has lots of examples.
DATABASE SYSTEMS I
QUERY OPTIMIZATION
PRINCIPLES OF QUERY OPTIMIZATION
1. Display the minimum number of fields in a query.
2. Use primary key or indexes wherever possible.
3. Use numeric rather than text primary keys.
4. Use non-blank unique fields.
5. Avoid domain aggregate functions such as DLookup().
6. Use BETWEEN and = rather than > or <. It will speed up the queries.
7. Use count(*) rather than count(column).
8. Short table and field names run faster than long names.
9. Normalize the tables.
10. Avoid the use of distinct row queries.
QUERY OPTIMIZATION
A user of a commercial DBMS formulates an SQL query.
The query optimizer translates this query into an equivalent RA query, i.e. an RA query with the same result.
To make query processing more efficient, the query optimizer can re-order the individual operations within the RA query.
Re-ordering has to preserve the query semantics and is based on RA equivalences.
Just as math operations can be re-ordered, so can relational algebra operations.
QUERY OPTIMIZATION
Why can re-ordering improve the efficiency?
Different orders can imply different sizes of the intermediate results.
The smaller the intermediate results, the more efficient the evaluation.
Example:
(σ_{color='red'}(Boats) ⋈ Reserves) ⋈ Sailors
is much (!) more efficient than
σ_{color='red'}(Sailors ⋈ (Reserves ⋈ Boats))
Why?
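One way to see the difference is to count the tuples each plan materializes. The sketch below uses made-up relation sizes (1000 sailors, 100 boats of which 10 are red, 5000 reservations) and a simple hash join; all names and numbers are assumptions chosen for illustration, not figures from the slides.

```python
# Hypothetical instances: 1000 sailors, 100 boats (10 red), 5000 reservations.
sailors = [{"sid": s} for s in range(1000)]
boats = [{"bid": b, "color": "red" if b < 10 else "blue"} for b in range(100)]
reserves = [{"sid": i % 1000, "bid": i // 50} for i in range(5000)]


def join(r, s, attr):
    """Equi-join of two lists of dicts on one shared attribute."""
    index = {}
    for t in s:
        index.setdefault(t[attr], []).append(t)
    return [{**a, **b} for a in r for b in index.get(a[attr], [])]


def select_red(r):
    """Keep only the tuples whose color is 'red'."""
    return [t for t in r if t["color"] == "red"]


# Plan A: select first, then join.
a1 = select_red(boats)
a2 = join(a1, reserves, "bid")
plan_a = join(a2, sailors, "sid")

# Plan B: join everything, select last.
b1 = join(reserves, sailors, "sid")
b2 = join(b1, boats, "bid")
plan_b = select_red(b2)

print(len(a1), len(a2), len(plan_a))  # 10 500 500
print(len(b1), len(b2), len(plan_b))  # 5000 5000 500
```

Both plans return the same 500 tuples, but plan B carries 5000-tuple intermediate results through two joins before discarding nine tenths of them.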
RELATIONAL ALGEBRA EQUIVALENCES
The most important RA equivalences are commutative and associative laws.
A commutative law about some operation states that the order of (two) arguments does not matter.
An associative law about some (binary) operation states that (more than two) arguments can be grouped either from the left or from the right.
If an operation is both commutative and associative, then any number of arguments can be (re-)ordered in an arbitrary manner.
RELATIONAL ALGEBRA EQUIVALENCES
The following (binary) RA operations are commutative and associative: ×, ⋈, ∪, ∩
For example, we have:
(R ⋈ S) ≡ (S ⋈ R) (Commutative)
R ⋈ (S ⋈ T) ≡ (R ⋈ S) ⋈ T (Associative)
Proof method: show that each tuple produced by the expression on the left is also produced by the expression on the right and vice versa.
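The two join laws can be spot-checked on a small instance. The sketch below assumes a representation chosen here for convenience (each tuple is a frozenset of attribute/value pairs, so attribute order is irrelevant); the relations R, S, T are hypothetical. Passing on one instance is a sanity check, not a proof.

```python
def natural_join(r, s):
    """Natural join of relations given as sets of frozensets of (attr, value)."""
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            # Two tuples combine when they agree on every shared attribute.
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out


def rel(*rows):
    """Build a relation from dict rows."""
    return {frozenset(row.items()) for row in rows}


# Hypothetical relations R(a, b), S(b, c), T(c, d).
R = rel({"a": 1, "b": 2}, {"a": 3, "b": 4})
S = rel({"b": 2, "c": 5}, {"b": 4, "c": 6}, {"b": 7, "c": 8})
T = rel({"c": 5, "d": 9}, {"c": 6, "d": 0})

# Commutative: R join S equals S join R.
assert natural_join(R, S) == natural_join(S, R)
# Associative: R join (S join T) equals (R join S) join T.
assert natural_join(R, natural_join(S, T)) == natural_join(natural_join(R, S), T)
```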
RELATIONAL ALGEBRA EQUIVALENCES
Selections are crucial from the point of view of query optimization, because they typically reduce the size of intermediate results by a significant factor.
Laws for selections only:
σ_{c1 AND ... AND cn}(R) ≡ σ_c1(...(σ_cn(R))...) (Splitting)
σ_c1(σ_c2(R)) ≡ σ_c2(σ_c1(R)) (Commutative)
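A quick check of the splitting and commutative laws on a small, made-up relation; the attribute names and the conditions c1 and c2 are assumptions of this sketch.

```python
# Hypothetical relation R as a list of dicts.
R = [{"sid": s, "rating": s % 10, "age": 20 + s} for s in range(30)]


def select(pred, rel):
    """Selection: keep the tuples of rel that satisfy pred."""
    return [t for t in rel if pred(t)]


c1 = lambda t: t["rating"] > 7
c2 = lambda t: t["age"] < 40

combined = select(lambda t: c1(t) and c2(t), R)

# Splitting: one conjunctive selection equals a cascade of selections.
assert combined == select(c1, select(c2, R))
# Commutative: the cascade can be applied in either order.
assert select(c1, select(c2, R)) == select(c2, select(c1, R))
```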
RELATIONAL ALGEBRA EQUIVALENCES
Laws for the combination of selections and other operations:
σ_c(R ⋈ S) ≡ σ_c(R) ⋈ S, if R has all attributes mentioned in c
σ_c(R ⋈ S) ≡ R ⋈ σ_c(S), if S has all attributes mentioned in c
The above laws can be applied to “push selections down” as far as possible in an expression, i.e. to perform selections as early as possible.
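A minimal sketch of pushing a selection below a join, assuming two hypothetical relations R(a, x) and S(a, y) and a condition c that mentions only x, an attribute of R.

```python
# Hypothetical relations as lists of dicts.
R = [{"a": 1, "x": 10}, {"a": 2, "x": 20}, {"a": 3, "x": 30}]
S = [{"a": 1, "y": "p"}, {"a": 2, "y": "q"}, {"a": 2, "y": "r"}]


def select(pred, rel):
    """Selection: keep the tuples of rel that satisfy pred."""
    return [t for t in rel if pred(t)]


def join_on_a(r, s):
    """Join of r and s on their shared attribute a."""
    return [{**t, **u} for t in r for u in s if t["a"] == u["a"]]


c = lambda t: t["x"] >= 20  # mentions only x, which belongs to R

late = select(c, join_on_a(R, S))   # select after the join
early = join_on_a(select(c, R), S)  # select pushed down onto R

assert late == early
```

The pushed-down form joins two R tuples with S instead of three; on larger inputs this is the saving the optimizer is after.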
RELATIONAL ALGEBRA EQUIVALENCES
A projection commutes with a selection that only uses attributes retained by the projection.
Selection between attributes of the two arguments of a Cartesian product converts Cartesian product to a join.
Similarly, if a projection follows a join R ⋈ S, we can ‘push’ it by retaining only attributes of R (and S) that are needed for the join or are kept by the projection.
SUMMARY
There are several ways of expressing a given query; a query optimizer chooses the most efficient version.
Query optimization exploits RA equivalences to re-order the operations within an RA expression.
The optimization criterion is to minimize the size of intermediate relations.
DATABASE SYSTEMS I
RELATIONAL CALCULUS
RELATIONAL CALCULUS
Nonprocedural: describes the set of answers without saying how they should be computed.
Comes in two flavors: Tuple relational calculus (TRC) and Domain relational calculus (DRC).
Calculus has variables, constants, comparison ops, logical connectives and quantifiers.
TUPLE RELATIONAL CALCULUS
Query has the form: {T | p(T)}
p(T) denotes a formula in which tuple variable T appears.
Answer is the set of all tuples T for which the formula p(T) evaluates to true.
Formula is recursively defined:
  start with simple atomic formulas (get tuples from relations or make comparisons of values)
  build bigger and better formulas using the logical connectives.
DOMAIN RELATIONAL CALCULUS
Query has the form: {⟨x1, x2, ..., xn⟩ | p(⟨x1, x2, ..., xn⟩)}
Answer includes all tuples ⟨x1, x2, ..., xn⟩ that make the formula p(⟨x1, x2, ..., xn⟩) true.
Formula is recursively defined, starting with simple atomic formulas (getting tuples from relations or making comparisons of values), and building bigger and better formulas using the logical connectives.
TRC FORMULAS
An atomic formula is one of the following:
  R ∈ Rel (R is a tuple in relation Rel)
  R.a op S.b (comparing two fields)
  R.a op constant (comparing field to constant)
  op is one of <, >, =, ≤, ≥, ≠
A formula can be:
  an atomic formula
  ¬p, p ∧ q, p ∨ q, where p and q are formulas
  ∃R(p(R)), where variable R is a tuple variable
  ∀R(p(R)), where variable R is a tuple variable
SELECTION AND PROJECTION
Find all sailors with rating above 7
In TRC: {S | S ∈ Sailors ∧ S.rating > 7}
In DRC: {⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ T > 7}
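Python set comprehensions read much like calculus queries, which makes them a convenient way to experiment with the two flavors. The sketch below assumes a Sailors schema of (sid, sname, rating, age) and uses hypothetical rows; the TRC version binds a whole tuple to one variable, while the DRC version binds one variable per field.

```python
from collections import namedtuple

# Assumed schema for illustration: Sailors(sid, sname, rating, age).
Sailor = namedtuple("Sailor", "sid sname rating age")

# Hypothetical instance.
sailors = {
    Sailor(22, "Dustin", 7, 45.0),
    Sailor(31, "Lubber", 8, 55.5),
    Sailor(58, "Rusty", 10, 35.0),
}

# TRC style: S ranges over tuples of Sailors.
trc = {S for S in sailors if S.rating > 7}

# DRC style: I, N, T, A range over the individual field values.
drc = {(I, N, T, A) for (I, N, T, A) in sailors if T > 7}

# Both describe the same set of answers.
assert trc == drc
```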
JOINS
Find sailors rated > 7 who’ve reserved boat #103
In TRC: {S | S ∈ Sailors ∧ S.rating > 7 ∧ ∃R(R ∈ Reserves ∧ R.sid = S.sid ∧ R.bid = 103)}
Note the use of ∃ to find a tuple in Reserves that ‘joins with’ the Sailors tuple under consideration.
In DRC: {⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ T > 7 ∧ ∃Ir, Br, D(⟨Ir, Br, D⟩ ∈ Reserves ∧ Ir = I ∧ Br = 103)}
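The existential quantifier maps naturally onto Python's built-in `any`. The following sketch assumes schemas Sailors(sid, sname, rating, age) and Reserves(sid, bid, day) with hypothetical rows; it mirrors the TRC form of the query.

```python
from collections import namedtuple

# Assumed schemas for illustration.
Sailor = namedtuple("Sailor", "sid sname rating age")
Reserve = namedtuple("Reserve", "sid bid day")

# Hypothetical instances.
sailors = {
    Sailor(22, "Dustin", 7, 45.0),
    Sailor(31, "Lubber", 8, 55.5),
    Sailor(58, "Rusty", 10, 35.0),
}
reserves = {
    Reserve(22, 103, "day1"),
    Reserve(31, 101, "day2"),
    Reserve(58, 103, "day3"),
}

# S qualifies if its rating exceeds 7 and there exists a Reserves tuple R
# with the same sid and bid 103.
answer = {
    S for S in sailors
    if S.rating > 7
    and any(R.sid == S.sid and R.bid == 103 for R in reserves)
}

print({S.sname for S in answer})  # {'Rusty'}
```

Dustin reserved boat 103 but is rated 7, and Lubber is rated 8 but reserved a different boat, so only Rusty satisfies both conditions.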
UNSAFE QUERIES, EXPRESSIVE POWER
It is possible to write syntactically correct calculus queries that have an infinite number of answers! Such queries are called unsafe.
e.g., {S | ¬(S ∈ Sailors)}
SUMMARY
Relational calculus is non-operational, and users define queries in terms of what they want, not in terms of how to compute it. (Declarativeness.)
Algebra and safe calculus have the same expressive power, leading to the notion of relational completeness.