ITCS 3160 DATA BASE DESIGN AND IMPLEMENTATION

9/28/2010
JING YANG, 2010 FALL
Class 11: The Relational Algebra and Relational Calculus (2)

Operations of Relational Algebra


Operations of Relational Algebra (cont’d.)


Notation for Query Trees

Query tree
• Represents the input relations of the query as leaf nodes of the tree
• Represents the relational algebra operations as internal nodes
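A query tree can be sketched as a small recursive data structure that is evaluated bottom-up. This is only an illustrative sketch: the class names (`Leaf`, `Select`, `Project`, `Join`), the `evaluate` function, and the EMPLOYEE/DEPARTMENT sample rows are assumptions made for the example, not part of the lecture.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Leaf:
    rows: list          # an input relation, stored as a list of dicts


@dataclass
class Select:
    pred: Callable      # predicate on one row
    child: object


@dataclass
class Project:
    attrs: list         # attribute names to keep
    child: object


@dataclass
class Join:
    pred: Callable      # predicate on (left_row, right_row)
    left: object
    right: object


def evaluate(node):
    """Evaluate a query tree bottom-up: leaves first, then internal nodes."""
    if isinstance(node, Leaf):
        return node.rows
    if isinstance(node, Select):
        return [r for r in evaluate(node.child) if node.pred(r)]
    if isinstance(node, Project):
        seen, out = set(), []
        for r in evaluate(node.child):
            key = tuple(r[a] for a in node.attrs)
            if key not in seen:          # projection removes duplicates
                seen.add(key)
                out.append(dict(zip(node.attrs, key)))
        return out
    if isinstance(node, Join):
        left, right = evaluate(node.left), evaluate(node.right)
        return [{**l, **r} for l in left for r in right if node.pred(l, r)]
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Hypothetical sample data (not from the slides).
employee = [
    {"lname": "Smith", "dno": 5},
    {"lname": "Wong", "dno": 5},
    {"lname": "Zelaya", "dno": 4},
]
department = [
    {"dname": "Research", "dnumber": 5},
    {"dname": "Administration", "dnumber": 4},
]

# Leaves are the input relations; internal nodes are the algebra operations.
tree = Project(
    ["lname"],
    Select(
        lambda r: r["dname"] == "Research",
        Join(lambda l, r: l["dno"] == r["dnumber"], Leaf(employee), Leaf(department)),
    ),
)
result = evaluate(tree)
```

Here `result` holds the last names of the employees in the Research department.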


Additional Relational Operations

Generalized projection
• Allows functions of attributes to be included in the projection list

Aggregate functions and grouping
• Common functions applied to collections of numeric values
• Include SUM, AVERAGE, MAXIMUM, and MINIMUM


Additional Relational Operations (cont’d.)

• Group tuples by the value of some of their attributes
• Apply aggregate function independently to each group
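Generalized projection and grouping with aggregates can be sketched in a few lines. The relation, its attribute names (`ssn`, `dno`, `salary`, `deduction`), and the helper `group_aggregate` are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE rows (not from the slides).
employee = [
    {"ssn": "1", "dno": 5, "salary": 30000, "deduction": 2000},
    {"ssn": "2", "dno": 5, "salary": 40000, "deduction": 3000},
    {"ssn": "3", "dno": 4, "salary": 25000, "deduction": 1000},
]

# Generalized projection: the projection list contains a function of attributes.
net_pay = [
    {"ssn": e["ssn"], "net_salary": e["salary"] - e["deduction"]}
    for e in employee
]


def group_aggregate(rows, group_attr, agg_attr):
    """Group rows by group_attr, then aggregate agg_attr within each group."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_attr]].append(r[agg_attr])
    return {
        key: {
            "sum": sum(values),
            "average": sum(values) / len(values),
            "maximum": max(values),
            "minimum": min(values),
        }
        for key, values in groups.items()
    }


by_dept = group_aggregate(employee, "dno", "salary")
```

Each department number maps to its own aggregate values, since the functions are applied independently to each group.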

Recursive Closure Operations

Operation applied to a recursive relationship between tuples of same type

Question: what is the purpose of the following expression?

What if we want to find all employees supervised by James Borg?
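Finding every employee supervised by James Borg at any level requires repeating the supervision lookup until no new employees appear. The following is a minimal sketch of that idea; the function name `all_supervisees` and every (employee, supervisor) pair other than the name James Borg are hypothetical sample data.

```python
def all_supervisees(supervision, boss):
    """Return everyone supervised by boss, directly or indirectly.

    supervision is a set of (employee, supervisor) pairs.
    """
    found = set()
    frontier = {boss}
    while frontier:
        # One more level: employees whose supervisor is in the current frontier.
        next_level = {e for (e, s) in supervision if s in frontier} - found
        found |= next_level
        frontier = next_level
    return found


# Hypothetical supervision pairs (employee, supervisor).
supervision = {
    ("Franklin Wong", "James Borg"),
    ("Jennifer Wallace", "James Borg"),
    ("John Smith", "Franklin Wong"),
    ("Alicia Zelaya", "Jennifer Wallace"),
}

under_borg = all_supervisees(supervision, "James Borg")
```

The loop stops when a level adds no new employees, which is why a fixed number of joins in basic relational algebra cannot express this for an unknown number of levels.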


OUTER JOIN Operations

Outer joins
• Keep all tuples in R, or all those in S, or all those in both relations, regardless of whether or not they have matching tuples in the other relation

Types
• LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN
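A left and a full outer join can be sketched as follows, padding unmatched tuples with `None` in place of NULL. The function names, parameters, and sample relations are assumptions for illustration; a right outer join would be the left outer join with the two relations swapped.

```python
def left_outer_join(r, s, r_key, s_key, s_attrs):
    """Keep every tuple of r; pad with None where s has no match."""
    out = []
    for rt in r:
        matches = [st for st in s if st[s_key] == rt[r_key]]
        if matches:
            out.extend({**rt, **st} for st in matches)
        else:
            out.append({**rt, **{a: None for a in s_attrs}})
    return out


def full_outer_join(r, s, r_key, s_key, r_attrs, s_attrs):
    """Keep every tuple of r and every tuple of s."""
    out = left_outer_join(r, s, r_key, s_key, s_attrs)
    r_values = {rt[r_key] for rt in r}
    for st in s:
        if st[s_key] not in r_values:    # s tuple with no partner in r
            out.append({**{a: None for a in r_attrs}, **st})
    return out


# Hypothetical sample relations (not from the slides).
employee = [{"name": "Smith", "ssn": 1}, {"name": "Wong", "ssn": 2}]
department = [
    {"dname": "Research", "mgr_ssn": 2},
    {"dname": "Headquarters", "mgr_ssn": 9},
]

left = left_outer_join(employee, department, "ssn", "mgr_ssn", ["dname", "mgr_ssn"])
full = full_outer_join(
    employee, department, "ssn", "mgr_ssn", ["name", "ssn"], ["dname", "mgr_ssn"]
)
```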

OUTER JOIN Operations

Example:


The OUTER UNION Operation

• Take union of tuples from two relations that have some common attributes
• Not union (type) compatible
• Partially compatible
• All tuples from both relations included in the result
• Tuples with the same value combination will appear only once
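The outer union can be sketched as below. This sketch assumes that the common attributes identify a tuple within each relation, so that tuples agreeing on them can be merged into one result tuple; the function name, the schemas, and the sample rows are hypothetical.

```python
def outer_union(r, r_attrs, s, s_attrs):
    """Outer union of two partially compatible relations.

    Tuples that agree on all common attributes are merged into one tuple;
    attributes missing from a tuple's source relation are padded with None.
    Assumes the common attributes identify a tuple within each relation.
    """
    common = [a for a in r_attrs if a in s_attrs]
    all_attrs = r_attrs + [a for a in s_attrs if a not in r_attrs]
    merged = {}
    for row in r + s:
        key = tuple(row[a] for a in common)
        target = merged.setdefault(key, {a: None for a in all_attrs})
        target.update(row)
    return list(merged.values())


# Hypothetical sample relations (not from the slides).
student = [
    {"name": "Ann", "ssn": 1, "dept": "CS", "advisor": "Lee"},
    {"name": "Bob", "ssn": 2, "dept": "CS", "advisor": "Kim"},
]
instructor = [
    {"name": "Bob", "ssn": 2, "dept": "CS", "rank": "TA"},
    {"name": "Cy", "ssn": 3, "dept": "EE", "rank": "Professor"},
]

result = outer_union(
    student, ["name", "ssn", "dept", "advisor"],
    instructor, ["name", "ssn", "dept", "rank"],
)
```

Bob appears in both relations with the same values for the common attributes, so he appears only once in the result, carrying both an advisor and a rank.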

Summary

• The relational model has rigorously defined query languages that are simple and powerful.
• Relational algebra is more operational; useful as internal representation for query evaluation plans.
• Several ways of expressing a given query; a query optimizer should choose the most efficient version.


RELATIONAL CALCULUS

Relational Calculus

Declarative expression
• Specify a retrieval request; nonprocedural language

Any retrieval that can be specified in basic relational algebra
• Can also be specified in relational calculus


First-Order Predicate Logic

• Propositional logic is concerned only with sentential connectives such as and, or, not.
• Proposition: a statement that affirms or denies something
  • Peter is tall
• First-order predicate logic additionally covers predicates and quantification

Predicates

• Example: {x | x is a positive integer less than 4} is the set {1, 2, 3}.
• An element of the set {x | P(x)} is an object t for which the statement P(t) is true.
• With such statements, P(x) is referred to as the predicate, making x the subject of the proposition.
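Set-builder notation {x | P(x)} maps closely onto a set comprehension. In this sketch the predicate is the one from the example above; the finite candidate range is an assumption, needed because a program cannot enumerate all integers.

```python
def P(x):
    """Predicate: x is a positive integer less than 4."""
    return 0 < x < 4


# Assumed finite range of candidate values for x.
candidates = range(-10, 11)

# {x | P(x)}: keep exactly the objects for which P is true.
result = {x for x in candidates if P(x)}
```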


Quantification

All known human languages make use of quantification. For example, in English:
• Every student in my class is smart.
• There was somebody in my class that was able to correctly answer every one of the questions I gave.
• Most of the people I talked to didn't have a clue who the candidates were.

Relational Calculus

• Comes in two flavors: Tuple relational calculus (TRC) and Domain relational calculus (DRC).
• Calculus has variables, constants, comparison ops, logical connectives and quantifiers.
  • TRC: Variables range over (i.e., get bound to) tuples.
  • DRC: Variables range over domain elements (= field values).
  • Both TRC and DRC are simple subsets of first-order predicate logic.
• Expressions (predicates) in the calculus are called formulas. An answer tuple is essentially an assignment of constants to variables that make the formula evaluate to true.


Tuple Relational Calculus

Query: {T | P(T)}
• T is tuple variable
• P(T) is a formula that describes T

Result: the set of all tuples t for which P(t) evaluates True.

Find all sailors with a rating above 7.

{S | S ∈ Sailors ∧ S.rating > 7}

In our book: Sailors(S) specifies that the range relation of tuple variable S is Sailors (S may take as its value any individual tuple from Sailors), so S ∈ Sailors is written Sailors(S):

{S | Sailors(S) ∧ S.rating > 7}
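The query for sailors with a rating above 7 can be mirrored by a comprehension in which the loop variable plays the role of the tuple variable S. The `Sailor` field names (in particular `sid`) and the sample tuples are assumptions for illustration.

```python
from collections import namedtuple

# Assumed schema: Sailors(sid, name, rating, age).
Sailor = namedtuple("Sailor", "sid name rating age")

# Hypothetical sample instance of Sailors.
sailors = {
    Sailor(22, "Dustin", 7, 45.0),
    Sailor(31, "Lubber", 8, 55.5),
    Sailor(58, "Rusty", 10, 35.0),
}

# {S | S ∈ Sailors ∧ S.rating > 7}: s ranges over whole tuples of Sailors.
high_rated = {s for s in sailors if s.rating > 7}
```

The comprehension states which tuples are wanted, not how to find them, which is the declarative flavor of the calculus.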

Tuple Relational Calculus

Atomic formula
• R ∈ Rel, e.g. S ∈ Sailors (in our book: Sailors(S))
  • Rel: range relation of R
• R.a op S.b, where op is one of <, >, =, ≤, ≥, ≠
• R.a op constant, e.g. S.rating > 7


TRC

Formula
• Any atomic formula
• ¬p, p ∧ q, p ∨ q (in our book: NOT(p), p AND q, p OR q)
• Existential quantifiers: ∃R(p(R))
• Universal quantifiers: ∀R(p(R))

Example
Find the names and ages of sailors with a rating above 7

{P | ∃S ∈ Sailors (S.rating > 7 ∧ P.name = S.name ∧ P.age = S.age)}

Free and Bound Variables

• The use of quantifiers ∃X and ∀X in a formula is said to bind X.
• A variable that is not bound is free.

Revisit Query: {T | P(T)}
• T is tuple variable
• P(T) is a formula that describes T

There is an important restriction: the variable T that appears to the left of `|’ must be the only free variable in the formula P(...).


Sample Queries in Tuple Relational Calculus

Practice

1. Retrieve the birth date and address of the employee whose first name is John
2. Retrieve all employees whose salary is higher than 500000
3. For each employee, retrieve the employee’s first and last name and the first and last name of his/her immediate supervisor
4. List the name of employees who have at least one dependent


Domain Relational Calculus

Query has the form:

{⟨x1, x2, ..., xn⟩ | p(⟨x1, x2, ..., xn⟩)}

Answer includes all tuples ⟨x1, x2, ..., xn⟩ that make the formula p(⟨x1, x2, ..., xn⟩) be true.

Formula is recursively defined, starting with simple atomic formulas (getting tuples from relations or making comparisons of values), and building bigger and better formulas using the logical connectives.

Free and Bound Variables

• The use of quantifiers ∃X and ∀X in a formula is said to bind X.
• A variable that is not bound is free.

Let us revisit the definition of a query:

{⟨x1, x2, ..., xn⟩ | p(⟨x1, x2, ..., xn⟩)}

There is an important restriction: the variables x1, ..., xn that appear to the left of `|’ must be the only free variables in the formula p(...).


DRC Formulas

Atomic formula:
• ⟨x1, x2, ..., xn⟩ ∈ Rname, or X op Y, or X op constant
• op is one of <, >, =, ≤, ≥, ≠

Formula:
• an atomic formula, or
• ¬p, p ∧ q, p ∨ q, where p and q are formulas, or
• ∃X(p(X)), where variable X is free in p(X), or
• ∀X(p(X)), where variable X is free in p(X)

Find all sailors with a rating above 7

{⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ T > 7}

• The condition ⟨I, N, T, A⟩ ∈ Sailors ensures that the domain variables I, N, T and A are bound to fields of the same Sailors tuple.
• The term ⟨I, N, T, A⟩ to the left of `|’ (which should be read as such that) says that every tuple ⟨I, N, T, A⟩ that satisfies T > 7 is in the answer.
• Modify this query to answer: Find sailors who are older than 18


Find sailors rated > 7 who’ve reserved boat #103

{⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ T > 7 ∧ ∃Ir, Br, D (⟨Ir, Br, D⟩ ∈ Reserves ∧ Ir = I ∧ Br = 103)}

• We have used ∃Ir, Br, D (...) as a shorthand for ∃Ir (∃Br (∃D (...)))
• Note the use of ∃ to find a tuple in Reserves that `joins with’ the Sailors tuple under consideration.
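In a domain-calculus style sketch, each variable is bound to a single field value, and the existential quantifier over Reserves becomes a call to `any`. The sample instances of Sailors and Reserves below are hypothetical, and the meaning of the fields follows the variable names I, N, T, A and Ir, Br, D used above.

```python
# Hypothetical instances: Sailors(I, N, T, A) and Reserves(Ir, Br, D).
sailors = {
    (22, "Dustin", 7, 45.0),
    (31, "Lubber", 8, 55.5),
    (58, "Rusty", 10, 35.0),
}
reserves = {
    (31, 103, "day1"),
    (58, 102, "day2"),
    (22, 103, "day3"),
}

# {<I,N,T,A> | <I,N,T,A> ∈ Sailors ∧ T > 7 ∧
#              ∃Ir,Br,D (<Ir,Br,D> ∈ Reserves ∧ Ir = I ∧ Br = 103)}
answer = {
    (i, n, t, a)
    for (i, n, t, a) in sailors
    if t > 7 and any(ir == i and br == 103 for (ir, br, d) in reserves)
}
```

Sailor 22 reserved boat 103 but has rating 7, and sailor 58 has a high rating but reserved a different boat, so only sailor 31 qualifies.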

Find sailors rated > 7 who’ve reserved a red boat

{⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ T > 7 ∧ ∃Ir, Br, D (⟨Ir, Br, D⟩ ∈ Reserves ∧ Ir = I ∧ ∃B, BN, C (⟨B, BN, C⟩ ∈ Boats ∧ B = Br ∧ C = 'red'))}

• Observe how the parentheses control the scope of each quantifier’s binding.
• This may look cumbersome, but with a good user interface, it is very intuitive.


Find sailors who’ve reserved all boats

{⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ ∀B, BN, C (¬(⟨B, BN, C⟩ ∈ Boats) ∨ ∃Ir, Br, D (⟨Ir, Br, D⟩ ∈ Reserves ∧ I = Ir ∧ Br = B))}

• Find all sailors I such that for each 3-tuple ⟨B, BN, C⟩ either it is not a tuple in Boats or there is a tuple in Reserves showing that sailor I has reserved it.

Find sailors who’ve reserved all boats (again!)

{⟨I, N, T, A⟩ | ⟨I, N, T, A⟩ ∈ Sailors ∧ ∀⟨B, BN, C⟩ ∈ Boats (∃⟨Ir, Br, D⟩ ∈ Reserves (I = Ir ∧ Br = B))}

• Simpler notation, same query. (Much clearer!)
• To find sailors who’ve reserved all red boats:

..... (C ≠ 'red' ∨ ∃⟨Ir, Br, D⟩ ∈ Reserves (I = Ir ∧ Br = B))}
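The universally quantified queries can be sketched with `all` for ∀ and `any` for ∃. All relation instances are hypothetical. Because a program cannot range over every possible 3-tuple, the first form checks an assumed finite candidate set that deliberately includes one tuple that is not in Boats.

```python
# Hypothetical instances: Sailors(I, N, T, A), Boats(B, BN, C), Reserves(Ir, Br, D).
sailors = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5)}
boats = {(101, "Interlake", "blue"), (102, "Clipper", "red")}
reserves = {(22, 101, "d1"), (22, 102, "d2"), (31, 102, "d3")}


def reserved(i, b):
    """∃<Ir,Br,D> ∈ Reserves (I = Ir ∧ Br = B)."""
    return any(ir == i and br == b for (ir, br, d) in reserves)


# Form 1: for each 3-tuple, either it is not in Boats or sailor I reserved it.
# Assumed finite candidate set, including one 3-tuple that is not a boat.
candidates = boats | {(999, "Ghost", "green")}
all_boats_v1 = {
    (i, n, t, a)
    for (i, n, t, a) in sailors
    if all((bt not in boats) or reserved(i, bt[0]) for bt in candidates)
}

# Form 2: quantify directly over the tuples of Boats.
all_boats_v2 = {
    (i, n, t, a)
    for (i, n, t, a) in sailors
    if all(reserved(i, b) for (b, bn, c) in boats)
}

# All red boats: a boat that is not red places no requirement on the sailor.
all_red_boats = {
    (i, n, t, a)
    for (i, n, t, a) in sailors
    if all(c != "red" or reserved(i, b) for (b, bn, c) in boats)
}
```

Both forms give the same answer, and relaxing the requirement to red boats admits sailor 31 as well.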


Unsafe Queries, Expressive Power

• It is possible to write syntactically correct calculus queries that have an infinite number of answers! Such queries are called unsafe.
  • e.g., {S | ¬(S ∈ Sailors)}
• It is known that every query that can be expressed in relational algebra can be expressed as a safe query in DRC / TRC; the converse is also true.
• Relational Completeness: Query language (e.g., SQL) can express every query that is expressible in relational algebra/calculus.
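One way to see why {S | ¬(S ∈ Sailors)} is unsafe is that its answer depends on the domain of possible values rather than on the stored data. The sketch below evaluates the complement over two assumed finite domains; the two-attribute Sailors relation and the domains are hypothetical simplifications.

```python
from itertools import product

# Hypothetical, simplified instance of Sailors with two attributes.
sailors = {(22, "Dustin"), (31, "Lubber")}


def complement(domain_ids, domain_names):
    """All tuples over the given domains that are NOT in Sailors."""
    return {t for t in product(domain_ids, domain_names) if t not in sailors}


# The same query over two different assumed domains.
small = complement({22, 31}, {"Dustin", "Lubber"})
larger = complement({22, 31, 58}, {"Dustin", "Lubber", "Rusty"})
```

The answer grows whenever the domain grows, so over an infinite domain it is infinite. A safe query such as "sailors with a rating above 7" returns the same answer no matter how large the domain is.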

Summary

• Relational calculus is non-operational, and users define queries in terms of what they want, not in terms of how to compute it. (Declarativeness.)
• Algebra and safe calculus have same expressive power, leading to the notion of relational completeness.


Reference

Raghu Ramakrishnan: Database Management Systems