Relational Model and Algebra - Duke University

46
Relational Model and Algebra Introduction to Databases CompSci 316 Spring 2020

Transcript of Relational Model and Algebra - Duke University

Page 1: Relational Model and Algebra - Duke University

Relational Model and Algebra

Introduction to DatabasesCompSci 316 Spring 2020

Page 2: Relational Model and Algebra - Duke University

Announcements (Tue. Jan. 14)โ€ข You should be on Piazza and Gradescope

โ€ข Otherwise, let the instructor know after class

โ€ข HW1 has been posted, due next Tuesday 11:59 pmโ€ข Instant feedback, multiple submissions allowed until correct!โ€ข 5% / hour late submission penalty โ€ข Use pgweb from course website to try your queries on small Beers

datasetโ€ข If you join the class after Tuesday 01/14, let the instructor know

โ€ข Office hours posted on course websiteโ€ข There is at least one everyday except Saturday

2

Page 3: Relational Model and Algebra - Duke University

Todayโ€™s plan

โ€ข Revisit relational modelโ€ข Revisit simple SQL queries and its semanticโ€ข Start relational algebra

3

Page 4: Relational Model and Algebra - Duke University

The famous โ€œBeersโ€ database4

BarEach has an address

DrinkerEach has an address

BeerEach has a brewer

Drinker Frequents Barsโ€œXโ€ times a week

Bar Serves BeerAt price โ€œYโ€

Drinker Likes Beer

Your database for HW1

Page 5: Relational Model and Algebra - Duke University

โ€œBeersโ€ as a Relational Database5See online database for more tuples

bar beer price

The Edge Budweiser 2.50

The Edge Corona 3.00

Satisfaction Budweiser 2.25

Name brewer

Budweiser Anheuser-Busch Inc.

Corona Grupo Modelo

Dixie Dixie Brewing

name address

Amy 100 W. Main Street

Ben 101 W. Main Street

Dan 300 N. Duke Street

name address

The Edge108 Morris Street

Satisfaction905 W. Main Street

drinker bar times_a_week

Ben Satisfaction 2

Dan The Edge 1

Dan Satisfaction 2

drinker beer

Amy Corona

Dan Budweiser

Dan Corona

Ben Budweiser

Bar

Beer

Drinker

Likes

Frequents

Serves

What is an example of a

โ€ข Relationโ€ข Attributeโ€ข Tupleโ€ข Schemaโ€ข Instance

What is

โ€ข Set semantic โ€ข in relational model

โ€ข Bag semanticโ€ข In SQL (why)

โ€ข Set semantic โ€ข No duplicates, Order of tuples does not matter

โ€ข Bag semanticโ€ข Duplicates allowed, for efficiency and flexibilityโ€ข Do not want duplicates? Use SELECT DISTINCT โ€ฆ

Page 6: Relational Model and Algebra - Duke University

Basic queries: SFW statement

โ€ข SELECT ๐ด", ๐ด#, โ€ฆ, ๐ด$FROM ๐‘…", ๐‘…#, โ€ฆ, ๐‘…&WHERE ๐‘๐‘œ๐‘›๐‘‘๐‘–๐‘ก๐‘–๐‘œ๐‘›

โ€ข SELECT, FROM, WHERE are often referred to as SELECT, FROM, WHERE โ€œclausesโ€

โ€ข Each query must have a SELECT and a FROM

โ€ข WHERE is optional

6

In HW1, you can only use SFW

Page 7: Relational Model and Algebra - Duke University

Example: reading a table

โ€ข SELECT * FROM Serves

โ€ข Single-table queryโ€ข * is a shorthand for โ€œall columnsโ€

7

bar beer price

The Edge Budweiser 2.50

The Edge Corona 3.00

Satisfaction Budweiser 2.25

Serves

Page 8: Relational Model and Algebra - Duke University

Example: ORDER BY

โ€ข SELECT * FROM ServesORDER BY beer

โ€ข Equivalent to โ€œORDER BY beer ascโ€ (asc is default option)โ€ข For descending order, use โ€œdescโ€โ€ข Can combine multiple ordersโ€ข What does this return?โ€ข ORDER BY beer asc, price desc

8

bar beer price

The Edge Budweiser 2.50

The Edge Corona 3.00

Satisfaction Budweiser 2.25

Serves

Page 9: Relational Model and Algebra - Duke University

Example: some columns and DISTINCT

โ€ข SELECT beer FROM Serves

โ€ข Only want unique values? Use DISTINCT

โ€ข SELECT DISTINCT beer FROM Serves

9

bar beer price

The Edge Budweiser 2.50

The Edge Corona 3.00

Satisfaction Budweiser 2.25

Serves

Returns a set

Returns a bag

Page 10: Relational Model and Algebra - Duke University

Example: selecting few rowsโ€ข SELECT beer AS mybeer

FROM ServesWHERE price < 2.75

โ€ข SELECT S.beerFROM Serves SWHERE bar = โ€˜The Edgeโ€™

10

bar beer price

The Edge Budweiser 2.50

The Edge Corona 3.00

Satisfaction Budweiser 2.25

Serves

โ€ข SELECT list can contain expressionsCan also use built-in functions such as SUBSTR, ABS, etc.

โ€ข NOT EQUAL TO: Use <>โ€ข LIKE matches a string against a pattern

% matches any sequence of zero or more characters

What does these return?

Page 11: Relational Model and Algebra - Duke University

Example: Joinโ€ข Find addresses of all bars that โ€˜Danโ€™ frequents

โ€ข SELECT B.addressFROM Bar B, Frequents FWHERE B.name = F.bar

AND F.drinker = โ€˜Danโ€™

11

name address

The Edge 108 Morris Street

Satisfaction 905 W. Main Street

drinker bar times_a_week

Ben Satisfaction 2

Dan The Edge 1

Dan Satisfaction 2

Bar

Frequents

Page 12: Relational Model and Algebra - Duke University

Semantics of SFWโ€ข SELECT ๐ธ", ๐ธ#, โ€ฆ, ๐ธ$

FROM ๐‘…", ๐‘…#, โ€ฆ, ๐‘…&WHERE ๐‘๐‘œ๐‘›๐‘‘๐‘–๐‘ก๐‘–๐‘œ๐‘›โ€ข For each ๐‘ก" in ๐‘…":

For each ๐‘ก# in ๐‘…#: โ€ฆ โ€ฆFor each ๐‘ก& in ๐‘…&:

If ๐‘๐‘œ๐‘›๐‘‘๐‘–๐‘ก๐‘–๐‘œ๐‘› is true over ๐‘ก", ๐‘ก#, โ€ฆ, ๐‘ก&:

Compute and output ๐ธ", ๐ธ#, โ€ฆ, ๐ธ$ as a row

12

1. Apply โ€œFROMโ€Form โ€œcross-productโ€ of R1, .., Rm

2. Apply โ€œWHEREโ€Only consider satisfying rows

3. Apply โ€œSELECTโ€Output the desired columns

Page 13: Relational Model and Algebra - Duke University

Step 1: Illustration of Semantics of SFW

โ€ข NOTE: This is โ€œNOT HOWโ€ the DBMS outputs the result, but โ€œWHATโ€ it outputs!

13

โ€ข SELECT B.addressFROM Bar B, Frequents F

WHERE B.name = F.barAND F.drinker = โ€˜Danโ€™

name address

The Edge108 Morris Street

Satisfaction 905 W. Main Street

Bar

drinker bar times_a_week

Ben Satisfaction 2

Dan The Edge 1

Dan Satisfaction 2

Frequents

name address drinker bar times_a_week

The Edge 108 Morris Street

Ben Satisfaction 2

The Edge 108 Morris Street

Dan The Edge 1

The Edge 108 Morris Street

Dan Satisfaction 2

Satisfaction 905 W. Main Street

Ben Satisfaction 2

Satisfaction 905 W. Main Street

Dan The Edge 1

Satisfaction 905 W. Main Street

Dan Satisfaction 2

Form a โ€œCross productโ€ of two relations

Page 14: Relational Model and Algebra - Duke University

Step 2: Illustration of Semantics of SFW

โ€ข NOTE: This is โ€œNOT HOWโ€ the DBMS outputs the result, but โ€œWHATโ€ it outputs!

14

โ€ข SELECT B.addressFROM Bar B, Frequents F

WHERE B.name = F.barAND F.drinker = โ€˜Danโ€™

name address

The Edge108 Morris Street

Satisfaction 905 W. Main Street

Bar

drinker bar times_a_week

Ben Satisfaction 2

Dan The Edge 1

Dan Satisfaction 2

Frequents

name address drinker bar times_a_week

The Edge 108 Morris Street

Ben Satisfaction 2

The Edge 108 Morris Street

Dan The Edge 1

The Edge 108 Morris Street

Dan Satisfaction 2

Satisfaction 905 W. Main Street

Ben Satisfaction 2

Satisfaction 905 W. Main Street

Dan The Edge 1

Satisfaction 905 W. Main Street

Dan Satisfaction 2

Discard rows that do not satisfy WHERE condition

Page 15: Relational Model and Algebra - Duke University

Step 3: Illustration of Semantics of SFW

โ€ข NOTE: This is โ€œNOT HOWโ€ the DBMS outputs the result, but โ€œWHATโ€ it outputs!

15

โ€ข SELECT B.addressFROM Bar B, Frequents F

WHERE B.name = F.barAND F.drinker = โ€˜Danโ€™

name address

The Edge108 Morris Street

Satisfaction 905 W. Main Street

Bar

drinker bar times_a_week

Ben Satisfaction 2

Dan The Edge 1

Dan Satisfaction 2

Frequents

name address drinker bar times_a_week

The Edge 108 Morris Street

Ben Satisfaction 2

The Edge 108 Morris Street

Dan The Edge 1

The Edge 108 Morris Street

Dan Satisfaction 2

Satisfaction 905 W. Main Street

Ben Satisfaction 2

Satisfaction 905 W. Main Street

Dan The Edge 1

Satisfaction 905 W. Main Street

Dan Satisfaction 2

Output the โ€œaddressโ€ output of rows that survived

Page 16: Relational Model and Algebra - Duke University

Final output: Illustration of Semantics of SFWโ€ข NOTE: This is โ€œNOT HOWโ€ the DBMS outputs the result, but โ€œWHATโ€ it

outputs!

16

โ€ข SELECT B.addressFROM Bar B, Frequents F

WHERE B.name = F.barAND F.drinker = โ€˜Danโ€™

name address

The Edge108 Morris Street

Satisfaction 905 W. Main Street

Bar

drinker bar times_a_week

Ben Satisfaction 2

Dan The Edge 1

Dan Satisfaction 2

Frequents

address

108 Morris Street

905 W. Main Street

Output the โ€œaddressโ€ output of rows that survived

Page 17: Relational Model and Algebra - Duke University

SQL vs. C++, Java, Pythonโ€ฆ17

Page 18: Relational Model and Algebra - Duke University

SQL vs. C++, Java, Pythonโ€ฆ

SQL is declarativeโ€ข Programmer specifies what answers a query should return, โ€ข but not how the query is executedโ€ข DBMS picks the best execution strategy based on

availability of indexes, data/workload characteristics, etc.โ€ข Not a โ€œProceduralโ€ or โ€œOperationalโ€ language like C++,

Java, Pythonโ€ข There are several ways to write a query, but equivalent

queries always provide the same (equivalent) resultsโ€ข SQL (+ its execution and optimizations) is based on a strong

foundation of โ€œRelational Algebraโ€

18

Page 19: Relational Model and Algebra - Duke University

Relational algebra

A language for querying relational data based on โ€œoperatorsโ€

19

RelOp

RelOp

โ€ข Core operators:โ€ข Selection, projection, cross product, union, difference,

and renaming

โ€ข Additional, derived operators:โ€ข Join, natural join, intersection, etc.

โ€ข Compose operators to make complex queries

Page 20: Relational Model and Algebra - Duke University

Selectionโ€ข Input: a table ๐‘…โ€ข Notation: ๐œŽ/๐‘…

โ€ข ๐‘ is called a selection condition (or predicate)

โ€ข Purpose: filter rows according to some criteriaโ€ข Output: same columns as ๐‘…, but only rows of ๐‘… that satisfy ๐‘

(set!)

20

bar beer price

The Edge Budweiser 2.50

The Edge Corona 3.00

Satisfaction Budweiser 2.25

Servesbar beer price

The Edge Budweiser 2.50

The Edge Corona 3.00

Satisfaction Budweiser 2.25

๐ˆ๐’‘๐’“๐’Š๐’„๐’†7๐Ÿ.๐Ÿ•๐Ÿ“ Serves

Example: Find beers with price < 2.75

Equivalent SQL query?No actual deletion!

Page 21: Relational Model and Algebra - Duke University

More on selectionโ€ข Selection condition can include any column of ๐‘…, constants,

comparison (=, โ‰ค, etc.) and Boolean connectives (โˆง: and, โˆจ: or, ยฌ: not)โ€ข Example: Serves tuples for โ€œThe Edgeโ€ or price >= 2.75

๐œŽABCDEFGH IJKHEโˆจ/CLMH N #.OP ๐‘†๐‘’๐‘Ÿ๐‘ฃ๐‘’๐‘ โ€ข You must be able to evaluate the condition over each single

row of the input table!โ€ข Example: the most expensive beer at any bar

๐œŽ/CLMH N HVHCW /CLMH L$ XHCVHY ๐‘ˆ๐‘ ๐‘’๐‘Ÿ

21

WRONG!

bar beer price

The Edge Budweiser 2.50

The Edge Corona 3.00

Satisfaction Budweiser 2.25

Serves

Page 22: Relational Model and Algebra - Duke University

Projection

โ€ข Input: a table ๐‘…โ€ข Notation: ๐œ‹\๐‘…โ€ข ๐ฟ is a list of columns in ๐‘…

โ€ข Purpose: output chosen columnsโ€ข Output: same rows, but only the columns in ๐ฟ (๐‘ ๐‘’๐‘ก!)

22

bar beer price

The Edge Budweiser 2.50

The Edge Corona 3.00

Satisfaction Budweiser 2.25

Serves ๐…๐’ƒ๐’†๐’†๐’“,๐’‘๐’“๐’Š๐’„๐’† Serves

Example: Find all the prices for each beer

beer price

Budweiser 2.50

Corona 3.00

Budweiser 2.25

Output of ๐œ‹AHHCServes?

Page 23: Relational Model and Algebra - Duke University

Cross productโ€ข Input: two tables ๐‘… and ๐‘†โ€ข Natation: ๐‘…ร—๐‘†โ€ข Purpose: pairs rows from two tablesโ€ข Output: for each row ๐‘Ÿ in ๐‘… and each ๐‘  in ๐‘†, output a row ๐‘Ÿ๐‘ 

(concatenation of ๐‘Ÿ and ๐‘ )

23

name address

The Edge108 Morris Street

Satisfaction 905 W. Main Street

Bar

drinker bar times_a_week

Ben Satisfaction 2

Dan The Edge 1

Dan Satisfaction 2

Frequents

name address drinker bar times_a_week

The Edge 108 Morris Street

Ben Satisfaction 2

The Edge 108 Morris Street

DanThe Edge

1

The Edge 108 Morris Street

DanSatisfaction

2

Satisfaction 905 W. Main Street

Ben Satisfaction 2

Satisfaction 905 W. Main Street

DanThe Edge

1

Satisfaction 905 W. Main Street

Dan Satisfaction 2

Bar x Frequents

Note: orderingof columns does not matter, so R X S = S X R(commutative)

Page 24: Relational Model and Algebra - Duke University

Derived operator: join(A.k.a. โ€œtheta-joinโ€: most general joins)โ€ข Input: two tables ๐‘… and ๐‘†โ€ข Notation: ๐‘… โ‹ˆ/ ๐‘†

โ€ข ๐‘ is called a join condition (or predicate)

โ€ข Purpose: relate rows from two tables according to some criteria

โ€ข Output: for each row ๐‘Ÿ in ๐‘… and each row ๐‘  in ๐‘†, output a row ๐‘Ÿ๐‘  if ๐‘Ÿ and ๐‘  satisfy ๐‘

โ€ข Shorthand for ๐œŽ/ ๐‘…ร—๐‘†

Predicate p only has equality (A = 5 โˆง B = 7) : equijoin

24

One of the most importantoperations!

Page 25: Relational Model and Algebra - Duke University

Join example

โ€ข Extend Frequents relation with addresses of the bars๐น๐‘Ÿ๐‘’๐‘ž๐‘ข๐‘’๐‘›๐‘ก๐‘  โ‹ˆABCD$B&H ๐ต๐‘Ž๐‘Ÿ

25

name address

The Edge108 Morris Street

Satisfaction 905 W. Main Street

Bar

drinker bar times_a_week

Ben Satisfaction 2

Dan The Edge 1

Dan Satisfaction 2

Frequents

name address drinker bar times_a_week

The Edge 108 Morris Street

Ben Satisfaction 2

The Edge 108 Morris Street

Dan The Edge 1

The Edge 108 Morris Street

Dan Satisfaction 2

Satisfaction 905 W. Main Street

Ben Satisfaction 2

Satisfaction 905 W. Main Street

Dan The Edge 1

Satisfaction 905 W. Main Street

Dan Satisfaction 2

Ambiguous attribute?Use Bar.name

Page 26: Relational Model and Algebra - Duke University

(Lecture 3 -- contd)

26

Page 27: Relational Model and Algebra - Duke University

Announcements (Thu. Jan. 16)โ€ข Reminder: HW1 due next Tuesday 01/21โ€ข HW2 on RA to be posted next Tuesday 01/21, due on 01/28

โ€ข HW2-Q1 on gradiance already open if you want to start earlyโ€ข Check gradiance code on Sakai announcements

โ€ข In-class lab on RA next Tuesday 01/21โ€ข Part of HW2 (~ 2 questions) in class to get the set up ready with TAs helpโ€ข Last 30-40 mins of Tuesdayโ€™s lectureโ€ข You can work in groups of size 2 or 3, but would submit your own solutionโ€ข You can submit by the next day -- 10% extra credit for finishing all questions

correctly in class (last timestamp <= 4:20 pm)!

โ€ข In-class quiz next Thursday 01/23โ€ข You can work in groups of size 2 or 3, but would submit your own solutionโ€ข 50% for attempt, 50% for correct answerโ€ข What if you miss a class? We would drop 25% (ceiling) of the lowest grades while

calculating your final score for quiz, i.e. if we have 4 quizzes 1 dropped, 5-8 quizzes 2 dropped, โ€ฆ

โ€ข Quiz or Lab -- you can submit while not being in the class too, but you would miss the fun of discussing with others (+ help from TAs for Labs)!

27

All on course website schedule

Page 28: Relational Model and Algebra - Duke University

Join Types

โ€ข Theta Join

โ€ข Equi-Join

โ€ข Natural Join

โ€ข Later, (left/right) outer join, semi-join

28

Page 29: Relational Model and Algebra - Duke University

Derived operator: natural join

โ€ข Input: two tables ๐‘… and ๐‘†โ€ข Notation: ๐‘… โ‹ˆ ๐‘† (i.e. no subscript)โ€ข Purpose: relate rows from two tables, andโ€ข Enforce equality between identically named columnsโ€ข Eliminate one copy of identically named columns

โ€ข Shorthand for ๐œ‹\ ๐‘… โ‹ˆ/ ๐‘† , whereโ€ข ๐‘ equates each pair of columns common to ๐‘… and ๐‘†โ€ข ๐ฟ is the union of column names from ๐‘… and ๐‘† (with

duplicate columns removed)

29

Page 30: Relational Model and Algebra - Duke University

Natural join example30

Serves โ‹ˆ ๐ฟ๐‘–๐‘˜๐‘’๐‘ = ๐œ‹? ๐‘†๐‘’๐‘Ÿ๐‘ฃ๐‘’๐‘  โ‹ˆ? ๐ฟ๐‘–๐‘˜๐‘’๐‘ = ๐œ‹ABC,AHHC,/CLMH,JCL$mHC ๐‘†๐‘’๐‘Ÿ๐‘ฃ๐‘’๐‘  โ‹ˆXHCVHY.AHHCD

\LmHY.AHHC๐ฟ๐‘–๐‘˜๐‘’๐‘ 

bar beer price

The Edge Budweiser 2.50

The Edge Corona 3.00

Satisfaction Budweiser 2.25

drinker beer

Amy Corona

Dan Budweiser

Dan Corona

Ben Budweiser

LikesServes

bar beer price drinker

The Edge Budweiser 2.50 Dan

The Edge Budweiser 2.50 Ben

The Edge Corona 3.00 Amy

The Edge Corona 3.00 Dan

... โ€ฆ. โ€ฆ..

Natural Join is on beer

Only one column for beer in the output

What happens if the tables have two or more common columns?

Serves โ‹ˆ ๐ฟ๐‘–๐‘˜๐‘’๐‘ 

Page 31: Relational Model and Algebra - Duke University

Union

โ€ข Input: two tables ๐‘… and ๐‘†โ€ข Notation: ๐‘… โˆช ๐‘†โ€ข ๐‘… and ๐‘† must have identical schema

โ€ข Output:โ€ข Has the same schema as ๐‘… and ๐‘†โ€ข Contains all rows in ๐‘… and all rows in ๐‘† (with duplicate

rows removed)

31

Example on board

Important for set operations:

Union Compatibility

Page 32: Relational Model and Algebra - Duke University

Difference

โ€ข Input: two tables ๐‘… and ๐‘†โ€ข Notation: ๐‘… โˆ’ ๐‘†โ€ข ๐‘… and ๐‘† must have identical schema

โ€ข Output:โ€ข Has the same schema as ๐‘… and ๐‘†โ€ข Contains all rows in ๐‘… that are not in ๐‘†

32

Example on board

Important for set operations:

Union Compatibility

Page 33: Relational Model and Algebra - Duke University

Derived operator: intersection

โ€ข Input: two tables ๐‘… and ๐‘†โ€ข Notation: ๐‘… โˆฉ ๐‘†โ€ข ๐‘… and ๐‘† must have identical schema

โ€ข Output:โ€ข Has the same schema as ๐‘… and ๐‘†โ€ข Contains all rows that are in both ๐‘… and ๐‘†

โ€ข How can you write it using other operators?

โ€ข Shorthand forโ€ข Also equivalent to ๐‘† โˆ’ ๐‘† โˆ’ ๐‘…โ€ข And to ๐‘… โ‹ˆ ๐‘†

33

๐‘… โˆ’ ๐‘… โˆ’ ๐‘†

Important for set operations:

Union Compatibility

Page 34: Relational Model and Algebra - Duke University

Expression tree notation34

๐ต๐‘Ž๐‘Ÿ ๐น๐‘Ÿ๐‘’๐‘ž๐‘ข๐‘’๐‘›๐‘ก๐‘ 

โ‹ˆJCL$mHCD$B&H

๐œ‹BJJCHYY

Also called logicalPlan tree

โ€ข Find addresses of all bars that โ€˜Danโ€™ frequents

name address

The Edge108 Morris Street

Satisfaction 905 W. Main Street

Bar

drinker bar times_a_week

Ben Satisfaction 2

Dan The Edge 1

Dan Satisfaction 2

Frequents

๐œ‹ BJJCHYY๐ต๐‘Ž๐‘Ÿ โ‹ˆJCL$mHCD

$B&H(๐œŽJCL$mHCDEqB$r๐น๐‘Ÿ๐‘’๐‘ž๐‘ข๐‘’๐‘›๐‘ก๐‘ )

Equivalent to

๐œŽJCL$mHCDEqB$r

What if you move ๐œŽ to the top?Still correct?

More or less efficient?

Page 35: Relational Model and Algebra - Duke University

Using the same relation multiple times35

drinker bar times_a_week

Ben Satisfaction 2

Dan The Edge 1

Dan Satisfaction 2

Frequentsโ€ข Find drinkers who frequent both โ€œThe Edgeโ€ and โ€œSatisfactionโ€

๐œ‹JCL$mHC ๐น๐‘Ÿ๐‘’๐‘ž๐‘ข๐‘’๐‘›๐‘ก๐‘  โ‹ˆ ABCDEFGH IJKHEโˆงABCDEXBsLYtBMsLu$EโˆงJCL$mHCDJCL$mHC

๐น๐‘Ÿ๐‘’๐‘ž๐‘ข๐‘’๐‘›๐‘ก๐‘ WRONG!

๐œ‹vLJw

๐œŒ J",A",s" ๐น๐‘Ÿ๐‘’๐‘ž๐‘ข๐‘’๐‘›๐‘ก๐‘ โ‹ˆA"DEFGH IJKHEโˆงA#DEXBsLYtBMsLu$EโˆงJ"DJ#

๐œŒ J#,A#,s# ๐น๐‘Ÿ๐‘’๐‘ž๐‘ข๐‘’๐‘›๐‘ก๐‘ 

Rename!

Page 36: Relational Model and Algebra - Duke University

Renamingโ€ข Input: a table ๐‘…โ€ข Notation: ๐œŒX ๐‘…, ๐œŒ yw,yz,โ€ฆ ๐‘…, or ๐œŒX yw,yz,โ€ฆ ๐‘…โ€ข Purpose: โ€œrenameโ€ a table and/or its columnsโ€ข Output: a table with the same rows as ๐‘…, but called

differentlyโ€ข Used to

โ€ข Avoid confusion caused by identical column namesโ€ข Create identical column names for natural joins

โ€ข As with all other relational operators, it doesnโ€™t modify the databaseโ€ข Think of the renamed table as a copy of the original

36

Page 37: Relational Model and Algebra - Duke University

Summary of core operators

โ€ข Selection: ๐œŽ/๐‘…โ€ข Projection: ๐œ‹\๐‘…โ€ข Cross product: ๐‘…ร—๐‘†โ€ข Union: ๐‘… โˆช ๐‘†โ€ข Difference: ๐‘… โˆ’ ๐‘†โ€ข Renaming: ๐œŒX yw,yz,โ€ฆ ๐‘…โ€ข Does not really add โ€œprocessingโ€ power

37

Page 38: Relational Model and Algebra - Duke University

Summary of derived operators

โ€ข Join: ๐‘… โ‹ˆ/ ๐‘†โ€ข Natural join: ๐‘… โ‹ˆ ๐‘†โ€ข Intersection: ๐‘… โˆฉ ๐‘†

โ€ข Many moreโ€ข Semijoin, anti-semijoin, quotient, โ€ฆ

38

Page 39: Relational Model and Algebra - Duke University

Exercise

โ€ข Bars that drinkers in address โ€œ300 N. Duke Streetโ€ do not frequent

39Frequents(drinker, bar, times_of_week)Bar(name, address)Drinker(name, address)

Page 40: Relational Model and Algebra - Duke University

Exercise

โ€ข Bars that drinkers in address โ€œ300 N. Duke Streetโ€ do not frequent

40

Bars that the drinkers at thisaddress frequentAll bars

โˆ’

๐œ‹$B&H

๐ต๐‘Ž๐‘Ÿ ๐น๐‘Ÿ๐‘’๐‘ž๐‘ข๐‘’๐‘›๐‘ก๐‘ 

๐ท๐‘Ÿ๐‘–๐‘›๐‘˜๐‘’๐‘Ÿ

โ‹ˆ ๐‘‘๐‘Ÿ๐‘–๐‘›๐‘˜๐‘’๐‘Ÿ = ๐‘›๐‘Ž๐‘š๐‘’

๐œŽBJJCHYYDโ€๏ฟฝ๏ฟฝ๏ฟฝ ๏ฟฝ.qvmH XsCHHs"

๐œ‹ABC

Frequents(drinker, bar, times_of_week)Bar(name, address)Drinker(name, address)

๐œŒABC

Page 41: Relational Model and Algebra - Duke University

A trickier exerciseโ€ข For each bar, find the drinkers who frequent it max no.

times a week

41Frequents(drinker, bar, times_of_week)Bar(name, address)Drinker(name, address)

Page 42: Relational Model and Algebra - Duke University

A trickier exerciseโ€ข For each bar, find the drinkers who frequent it max no.

times a weekโ€ข Who do NOT visit a bar max no. of times?โ€ข Whose times_of_weeks is lower than somebody elseโ€™s for a given

bar

42

โˆ’

๐น๐‘Ÿ๐‘’๐‘ž๐‘ข๐‘’๐‘›๐‘ก๐‘ ๐น๐‘Ÿ๐‘’๐‘ž๐‘ข๐‘’๐‘›๐‘ก๐‘ 

๐œŒ๏ฟฝ" ๐œŒ๏ฟฝ#

โ‹ˆ๏ฟฝ".sL&HY๏ฟฝut๏ฟฝ๏ฟฝHHm7๏ฟฝ#.sL&HY๏ฟฝut๏ฟฝ๏ฟฝHHmโˆง๏ฟฝ".ABCD๏ฟฝ#.ABC

๐œ‹๏ฟฝ".ABC,๏ฟฝ".JCL$mHC

A deeper question:When (and why) is โ€œโˆ’โ€ needed?

Frequents(drinker, bar, times_of_week)Bar(name, address)Drinker(name, address)

๐œ‹ABC,JCL$mHC

๐น๐‘Ÿ๐‘’๐‘ž๐‘ข๐‘’๐‘›๐‘ก๐‘ 

Page 43: Relational Model and Algebra - Duke University

Monotone operators

โ€ข If some old output rows may need to be removedโ€ข Then the operator is non-monotone

โ€ข Otherwise the operator is monotoneโ€ข That is, old output rows always remain โ€œcorrectโ€ when more rows

are added to the input

โ€ข Formally, for a monotone operator ๐‘œ๐‘:๐‘… โŠ† ๐‘…r implies ๐‘œ๐‘ ๐‘… โŠ† ๐‘œ๐‘ ๐‘…r for any ๐‘…, ๐‘…r

43

RelOpAdd more rows

to the input...

What happensto the output?

Page 44: Relational Model and Algebra - Duke University

Which operators are non-monotone?

โ€ข Selection: ๐œŽ/๐‘…โ€ข Projection: ๐œ‹\๐‘…โ€ข Cross product: ๐‘…ร—๐‘†โ€ข Join: ๐‘… โ‹ˆ/ ๐‘†โ€ข Natural join: ๐‘… โ‹ˆ ๐‘†โ€ข Union: ๐‘… โˆช ๐‘†โ€ข Difference: ๐‘… โˆ’ ๐‘†โ€ข Intersection: ๐‘… โˆฉ ๐‘†

44

MonotoneMonotoneMonotoneMonotoneMonotoneMonotoneMonotone w.r.t. ๐‘…; non-monotone w.r.t ๐‘†

Monotone

Page 45: Relational Model and Algebra - Duke University

Why is โ€œโˆ’โ€ needed for โ€œhighestโ€?

โ€ข Composition of monotone operators produces a monotone queryโ€ข Old output rows remain โ€œcorrectโ€ when more rows are

added to the input

โ€ข Is the โ€œhighestโ€ query monotone?โ€ข No!โ€ข Current highest price 3.0โ€ข Add another row with price 3.01โ€ข Old answer is invalidated

FSo it must use difference!

45

Page 46: Relational Model and Algebra - Duke University

Extensions to relational algebra

โ€ข Duplicate handling (โ€œbag algebraโ€)โ€ข Grouping and aggregationโ€ข โ€œExtensionโ€ (or โ€œextended projectionโ€) to allow

new column values to be computed

โ€ข (Coming later)

46