ITCS 3160 DATA BASE DESIGN AND IMPLEMENTATION
9/28/2010
JING YANG, 2010 FALL
Class 11: The Relational Algebra and Relational Calculus (2)
Operations of Relational Algebra
Operations of Relational Algebra (cont’d.)
Notation for Query Trees
Query tree
Represents the input relations of the query as leaf nodes of the tree
Represents the relational algebra operations as internal nodes
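A minimal sketch of a query tree in Python, assuming hypothetical Leaf and Op classes and toy EMPLOYEE data (none of these names come from the slides). Leaves hold input relations, internal nodes hold algebra operations, and evaluation proceeds bottom-up.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Relation = List[dict]  # assumption: a relation is a list of attribute->value dicts

@dataclass
class Leaf:
    rows: Relation  # input relation stored at a leaf node

    def evaluate(self) -> Relation:
        return self.rows

@dataclass
class Op:
    name: str                       # label of the algebra operation
    apply: Callable[..., Relation]  # combines the children's results
    children: list = field(default_factory=list)

    def evaluate(self) -> Relation:
        # Evaluate the children first, then apply this node's operation.
        return self.apply(*(c.evaluate() for c in self.children))

# Hypothetical data, for illustration only.
employee = [
    {"name": "Ann", "dno": 5, "salary": 40000},
    {"name": "Bob", "dno": 4, "salary": 25000},
]

select_dno5 = Op("SELECT dno=5",
                 lambda r: [t for t in r if t["dno"] == 5],
                 [Leaf(employee)])
project_name = Op("PROJECT name",
                  lambda r: [{"name": t["name"]} for t in r],
                  [select_dno5])

result = project_name.evaluate()
```

Here the root is the projection, its child is the selection, and the only leaf is the EMPLOYEE relation.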
Additional Relational Operations
Generalized projection
Allows functions of attributes to be included in the projection list
Aggregate functions and grouping
Common functions applied to collections of numeric values
Include SUM, AVERAGE, MAXIMUM, and MINIMUM
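A small sketch of generalized projection in Python, assuming a hypothetical EMPLOYEE relation with salary and deduction attributes. The projection list contains computed expressions rather than only plain attribute names.

```python
# Hypothetical EMPLOYEE tuples (assumption, not taken from the slides).
employee = [
    {"ssn": "111", "salary": 40000, "deduction": 5000},
    {"ssn": "222", "salary": 25000, "deduction": 2000},
]

# Generalized projection: each output attribute may be a function of input attributes.
report = [
    {
        "ssn": t["ssn"],
        "net_salary": t["salary"] - t["deduction"],  # computed attribute
        "bonus": t["salary"] // 10,                  # computed attribute
    }
    for t in employee
]
```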
Additional Relational Operations (cont’d.)
Group tuples by the value of some of their attributes
Apply aggregate function independently to each group
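Grouping followed by aggregation can be sketched in Python as below, assuming a hypothetical EMPLOYEE relation grouped by department number. Each aggregate is computed independently per group.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE tuples (assumption, for illustration only).
employee = [
    {"name": "Ann", "dno": 5, "salary": 40000},
    {"name": "Bob", "dno": 5, "salary": 30000},
    {"name": "Cid", "dno": 4, "salary": 25000},
]

# Step 1: group tuples by the value of the grouping attribute dno.
groups = defaultdict(list)
for t in employee:
    groups[t["dno"]].append(t["salary"])

# Step 2: apply each aggregate function independently to each group.
summary = {
    dno: {
        "SUM": sum(salaries),
        "AVERAGE": sum(salaries) / len(salaries),
        "MAXIMUM": max(salaries),
        "MINIMUM": min(salaries),
    }
    for dno, salaries in groups.items()
}
```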
Recursive Closure Operations
Operation applied to a recursive relationship between tuples of same type
Question: what is the purpose of the following expression?
What if we want to find all employees supervised by James Borg?
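One way to picture the recursive closure is a loop that repeats the join step until no new tuples appear. The sketch below assumes a hypothetical SUPERVISION relation of (employee, supervisor) pairs; apart from James Borg, the names and pairs are made up for illustration.

```python
# Hypothetical SUPERVISION relation as (employee, supervisor) pairs.
supervision = {
    ("Franklin Wong", "James Borg"),
    ("Jennifer Wallace", "James Borg"),
    ("John Smith", "Franklin Wong"),
    ("Alicia Zelaya", "Jennifer Wallace"),
}

def supervised_by(boss, pairs):
    """Return everyone supervised by boss at any level."""
    found = set()
    frontier = {boss}
    while frontier:
        # One join step: employees whose supervisor is in the current frontier.
        next_level = {e for (e, s) in pairs if s in frontier} - found
        found |= next_level
        frontier = next_level
    return found

all_levels = supervised_by("James Borg", supervision)
direct_only = {e for (e, s) in supervision if s == "James Borg"}
```

A single join yields only the direct reports (direct_only). The loop is needed because the number of levels is not known in advance.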
OUTER JOIN Operations
Outer joins
Keep all tuples in R, or all those in S, or all those in both relations regardless of whether or not they have matching tuples in the other relation
Types
• LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN
OUTER JOIN Operations
Example:
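As a stand-in illustration (hypothetical EMPLOYEE and DEPARTMENT relations and a hypothetical left_outer_join helper, not the slide's own figure), a LEFT OUTER JOIN can be sketched in Python as follows. Every tuple of the left relation is kept, and unmatched tuples are padded with None in place of NULL.

```python
# Hypothetical relations (assumption, for illustration only).
employee = [
    {"name": "Ann", "ssn": "111"},
    {"name": "Bob", "ssn": "222"},
]
department = [
    {"dname": "Research", "mgr_ssn": "111"},
]

def left_outer_join(r, s, r_key, s_key):
    """Keep every tuple of r; pad with None where s has no match."""
    s_attrs = sorted({a for t in s for a in t})
    out = []
    for t in r:
        matches = [u for u in s if u[s_key] == t[r_key]]
        if matches:
            out.extend({**t, **u} for u in matches)
        else:
            out.append({**t, **{a: None for a in s_attrs}})
    return out

joined = left_outer_join(employee, department, "ssn", "mgr_ssn")
```

Swapping the argument roles gives a RIGHT OUTER JOIN, and combining both paddings gives a FULL OUTER JOIN.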
The OUTER UNION Operation
Take union of tuples from two relations that have some common attributes
Not union (type) compatible
Partially compatible
All tuples from both relations included in the result
Tuples with the same value combination will appear only once
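A sketch of OUTER UNION in Python, assuming hypothetical STUDENT and INSTRUCTOR relations that share the attributes name, ssn and dept. Tuples that agree on the shared attributes are merged into one result tuple, and missing attributes are padded with None.

```python
# Hypothetical, partially compatible relations (assumption).
student = [
    {"name": "Ann", "ssn": "111", "dept": "CS", "advisor": "Lee"},
    {"name": "Bob", "ssn": "222", "dept": "CS", "advisor": "Kim"},
]
instructor = [
    {"name": "Ann", "ssn": "111", "dept": "CS", "rank": "Lecturer"},
    {"name": "Lee", "ssn": "333", "dept": "CS", "rank": "Professor"},
]

def outer_union(r, s, common):
    """Union on the common attributes; pad the rest with None."""
    attrs = sorted({a for t in r + s for a in t})
    merged = {}
    for t in r + s:
        key = tuple(t[a] for a in common)
        # Tuples with the same value combination share one result row.
        row = merged.setdefault(key, {a: None for a in attrs})
        row.update(t)
    return list(merged.values())

result = outer_union(student, instructor, ["name", "ssn", "dept"])
```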
Summary
The relational model has rigorously defined query languages that are simple and powerful.
Relational algebra is more operational; useful as internal representation for query evaluation plans.
Several ways of expressing a given query; a query optimizer should choose the most efficient version.
RELATIONAL CALCULUS
Relational Calculus
Declarative expression
Specify a retrieval request; nonprocedural language
Any retrieval that can be specified in basic relational algebra can also be specified in relational calculus
First-Order Predicate Logic
Propositional logic is concerned only with sentential connectives such as and, or, not.
Proposition: a statement that affirms or denies something
Peter is tall
First order predicate logic additionally covers predicates and quantification
Predicates
Example: {x | x is a positive integer less than 4} is the set {1, 2, 3}.
An element of the set {x | P(x)} is an object t for which the statement P(t) is true.
With such statements, P(x) is referred to as the predicate, making x the subject of the proposition.
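The set-builder example maps directly onto a Python set comprehension. This sketch assumes a finite universe of integers to iterate over, which the mathematical notation does not need.

```python
def P(x):
    """Predicate: x is a positive integer less than 4."""
    return 0 < x < 4

universe = range(-10, 11)  # assumption: a finite universe to draw x from

# {x | P(x)}: the elements t of the universe for which P(t) is true.
s = {x for x in universe if P(x)}
```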
Quantification
All known human languages make use of quantification.
For example, in English:
Every student in my class is smart.
There was somebody in my class that was able to correctly answer every one of the questions I gave.
Most of the people I talked to didn't have a clue who the candidates were.
Relational Calculus
Comes in two flavors: Tuple relational calculus (TRC) and Domain relational calculus (DRC).
Calculus has variables, constants, comparison ops, logical connectives and quantifiers.
TRC: Variables range over (i.e., get bound to) tuples.
DRC: Variables range over domain elements (= field values).
Both TRC and DRC are simple subsets of first-order predicate logic.
Expressions (predicates) in the calculus are called formulas.
An answer tuple is essentially an assignment of constants to variables that makes the formula evaluate to true.
Tuple Relational Calculus
Query: {T | P(T)}
T is a tuple variable
P(T) is a formula that describes T
Result: the set of all tuples t for which P(t) evaluates to true.
Find all sailors with a rating above 7.
{S | S ∈ Sailors ∧ S.rating > 7}
S ∈ Sailors
in our book: Sailors(S) specifies that the range relation of tuple variable S is Sailors (S may take as its value any individual tuple from Sailors)
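The TRC query reads almost literally as a Python set comprehension in which S ranges over whole tuples. The sketch assumes a hypothetical Sailors relation with attributes sid, name, rating and age, and made-up sample rows.

```python
from collections import namedtuple

# Hypothetical schema and sample rows (assumption, for illustration only).
Sailor = namedtuple("Sailor", ["sid", "name", "rating", "age"])
Sailors = {
    Sailor(22, "Dustin", 7, 45.0),
    Sailor(31, "Lubber", 8, 55.5),
    Sailor(58, "Rusty", 10, 35.0),
}

# {S | S in Sailors and S.rating > 7}: S is bound to one whole tuple at a time.
answer = {S for S in Sailors if S.rating > 7}
```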
Tuple Relational Calculus
Atomic formula
R ∈ Rel, e.g. S ∈ Sailors (in our book: Sailors(S)); Rel: range relation of R
R.a op S.b, where op is one of <, >, =, ≤, ≥, ≠
R.a op constant, e.g. S.rating > 7
TRC
Formula
Any atomic formula
¬p, p ∧ q, p ∨ q (in our book: NOT(p), p AND q, p OR q)
Existential quantifiers: ∃R(p(R))
Universal quantifiers: ∀R(p(R))
Example
Find the names and ages of sailors with a rating above 7
{P | ∃S ∈ Sailors (S.rating > 7 ∧ P.name = S.name ∧ P.age = S.age)}
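In Python the same query can be sketched by building a new tuple P for each qualifying S. The Sailors schema and rows below are hypothetical, and the result type P with fields name and age is introduced only for this illustration.

```python
from collections import namedtuple

# Hypothetical schema and sample rows (assumption).
Sailor = namedtuple("Sailor", ["sid", "name", "rating", "age"])
Sailors = {
    Sailor(22, "Dustin", 7, 45.0),
    Sailor(31, "Lubber", 8, 55.5),
    Sailor(58, "Rusty", 10, 35.0),
}

# P is the free variable; it has exactly the fields mentioned in the formula.
P = namedtuple("P", ["name", "age"])

# There exists S in Sailors with S.rating > 7, P.name = S.name and P.age = S.age.
answer = {P(S.name, S.age) for S in Sailors if S.rating > 7}
```

S is bound by the existential quantifier, so only P appears in the result.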
Free and Bound Variables
The use of quantifiers ∃X and ∀X in a formula is said to bind X.
A variable that is not bound is free.
Revisit Query: {T | P(T)}
T is a tuple variable
P(T) is a formula that describes T
There is an important restriction: the variable T that appears to the left of '|' must be the only free variable in the formula P(...).
Sample Queries in Tuple Relational Calculus
Practice
1. Retrieve the birth date and address of the employee whose first name is John
2. Retrieve all employees whose salary is higher than 500000
3. For each employee, retrieve the employee’s first and last name and the first and last name of his/her immediate supervisor
4. List the name of employees who have at least one dependent
Domain Relational Calculus
Query has the form:
{⟨x1, x2, ..., xn⟩ | p(⟨x1, x2, ..., xn⟩)}
Answer includes all tuples ⟨x1, x2, ..., xn⟩ that make the formula p(⟨x1, x2, ..., xn⟩) true.
Formula is recursively defined, starting with simple atomic formulas (getting tuples from relations or making comparisons of values), and building bigger and better formulas using the logical connectives.
Free and Bound Variables
The use of quantifiers ∃X and ∀X in a formula is said to bind X.
A variable that is not bound is free.
Let us revisit the definition of a query:
{⟨x1, x2, ..., xn⟩ | p(⟨x1, x2, ..., xn⟩)}
There is an important restriction: the variables x1, ..., xn that appear to the left of '|' must be the only free variables in the formula p(...).
DRC Formulas
Atomic formula:
⟨x1, x2, ..., xn⟩ ∈ Rname, or X op Y, or X op constant
op is one of <, >, =, ≤, ≥, ≠
Formula:
an atomic formula, or
¬p, p ∧ q, p ∨ q, where p and q are formulas, or
∃X(p(X)), where variable X is free in p(X), or
∀X(p(X)), where variable X is free in p(X)
Find all sailors with a rating above 7
{⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ T > 7}
The condition ⟨I, N, T, A⟩ ∈ Sailors ensures that the domain variables I, N, T and A are bound to fields of the same Sailors tuple.
The term ⟨I, N, T, A⟩ to the left of '|' (which should be read as such that) says that every tuple ⟨I, N, T, A⟩ that satisfies T > 7 is in the answer.
Modify this query to answer:
Find sailors who are older than 18
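For comparison with TRC, the rating query in DRC style can be sketched in Python with one variable per field. The sample rows are hypothetical, and the field order (id, name, rating, age) is an assumption matching the variables I, N, T, A.

```python
# Hypothetical Sailors rows as plain tuples (id, name, rating, age).
Sailors = {
    (22, "Dustin", 7, 45.0),
    (31, "Lubber", 8, 55.5),
    (58, "Rusty", 10, 35.0),
}

# Unpacking binds I, N, T, A to fields of the same Sailors tuple,
# then the comparison T > 7 filters on the rating field.
answer = {(I, N, T, A) for (I, N, T, A) in Sailors if T > 7}
```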
Find sailors rated > 7 who’ve reserved boat #103
{⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ T > 7 ∧ ∃Ir, Br, D (⟨Ir, Br, D⟩ ∈ Reserves ∧ Ir = I ∧ Br = 103)}
We have used ∃Ir, Br, D (...) as a shorthand for ∃Ir (∃Br (∃D (...)))
Note the use of ∃ to find a tuple in Reserves that 'joins with' the Sailors tuple under consideration.
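The existential quantifier corresponds to Python's built-in any(). The sketch below uses hypothetical Sailors and Reserves rows; the reservation dates are placeholder strings.

```python
# Hypothetical rows: Sailors (id, name, rating, age), Reserves (sailor id, boat id, day).
Sailors = {
    (22, "Dustin", 7, 45.0),
    (31, "Lubber", 8, 55.5),
    (58, "Rusty", 10, 35.0),
}
Reserves = {
    (22, 103, "day1"),
    (31, 103, "day2"),
    (58, 102, "day3"),
}

answer = {
    (I, N, T, A)
    for (I, N, T, A) in Sailors
    # any(...) plays the role of the existential quantifier over Ir, Br, D.
    if T > 7 and any(Ir == I and Br == 103 for (Ir, Br, D) in Reserves)
}
```

Sailor 22 reserved boat 103 but fails T > 7, and sailor 58 passes T > 7 but has no matching Reserves tuple.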
Find sailors rated > 7 who’ve reserved a red boat
{⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ T > 7 ∧ ∃Ir, Br, D (⟨Ir, Br, D⟩ ∈ Reserves ∧ Ir = I ∧ ∃B, BN, C (⟨B, BN, C⟩ ∈ Boats ∧ B = Br ∧ C = 'red'))}
Observe how the parentheses control the scope of each quantifier’s binding.
This may look cumbersome, but with a good user interface, it is very intuitive.
Find sailors who’ve reserved all boats
{⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ ∀B, BN, C (¬(⟨B, BN, C⟩ ∈ Boats) ∨ (∃Ir, Br, D (⟨Ir, Br, D⟩ ∈ Reserves ∧ I = Ir ∧ Br = B)))}
Find all sailors I such that for each 3-tuple ⟨B, BN, C⟩ either it is not a tuple in Boats or there is a tuple in Reserves showing that sailor I has reserved it.
Find sailors who’ve reserved all boats (again!)
{⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ ∀⟨B, BN, C⟩ ∈ Boats (∃⟨Ir, Br, D⟩ ∈ Reserves (I = Ir ∧ Br = B))}
Simpler notation, same query. (Much clearer!)
To find sailors who’ve reserved all red boats:
..... (C ≠ 'red' ∨ ∃⟨Ir, Br, D⟩ ∈ Reserves (I = Ir ∧ Br = B))}
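Universal quantification corresponds to Python's all(), with any() nested inside for the existential part. The sketch assumes hypothetical Sailors, Boats and Reserves rows and a hypothetical helper reserved(); the red-boats variant uses the same "C is not red, or it is reserved" disjunction as the formula.

```python
# Hypothetical rows (assumption, for illustration only).
Sailors = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5)}
Boats = {(101, "Interlake", "blue"), (102, "Interlake", "red"), (103, "Clipper", "green")}
Reserves = {(22, 101, "d1"), (22, 102, "d2"), (22, 103, "d3"), (31, 102, "d4")}

def reserved(I, B):
    """True if some Reserves tuple shows sailor I reserved boat B."""
    return any(Ir == I and Br == B for (Ir, Br, D) in Reserves)

# For all boats: there exists a reservation by this sailor.
all_boats = {
    (I, N, T, A)
    for (I, N, T, A) in Sailors
    if all(reserved(I, B) for (B, BN, C) in Boats)
}

# For all boats: the boat is not red, or this sailor reserved it.
all_red_boats = {
    (I, N, T, A)
    for (I, N, T, A) in Sailors
    if all(C != "red" or reserved(I, B) for (B, BN, C) in Boats)
}
```

With this data only sailor 22 has reserved every boat, while both sailors have reserved the single red boat.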
Unsafe Queries, Expressive Power
It is possible to write syntactically correct calculus queries that have an infinite number of answers! Such queries are called unsafe.
e.g., {S | ¬(S ∈ Sailors)}
It is known that every query that can be expressed in relational algebra can be expressed as a safe query in DRC / TRC; the converse is also true.
Relational Completeness: Query language (e.g., SQL) can express every query that is expressible in relational algebra/calculus.
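A rough way to see why the example query is unsafe: its answer depends on the domain the variable ranges over, not on the database. The sketch simplifies tuples to single integers and uses a hypothetical helper not_in_sailors() over an explicit finite domain.

```python
# Simplification (assumption): each Sailors tuple is represented by one integer.
Sailors = {22, 31}

def not_in_sailors(domain):
    """Evaluate {S | not (S in Sailors)} over an explicit finite domain."""
    return {x for x in domain if x not in Sailors}

small = not_in_sailors(range(0, 100))
large = not_in_sailors(range(0, 1000))
```

The answer keeps growing as the domain grows, so over an infinite domain it has no finite result. A safe query restricts its variables to values that appear in the database.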
Summary
Relational calculus is non-operational, and users define queries in terms of what they want, not in terms of how to compute it. (Declarativeness.)
Algebra and safe calculus have same expressive power, leading to the notion of relational completeness.
Reference
Raghu Ramakrishnan: Database Management Systems