RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance...

60
RELATIONAL ALGEBRA SECTION 5 An Introduction

Transcript of RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance...

Page 1: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

RELATIONAL ALGEBRA

SECTION 5

An Introduction

Page 2: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Introduction

• E.F. Codd in 1970

• A revolutionary advance in data manipulation

• Very significant. Why?

Page 3: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Manipulation of data solely on their logical characteristics

• Codd proposed two data manipulation languages

• Change in terminology emphasized logical meaning

Page 4: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

– procedural

• Relational calculus– nonprocedural

• Relational algebra

Page 5: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Relational Algebra

• Manipulation of relations

• Creation of new relations

• Consists of nine operations:– union, intersection, difference, product, select, project, join, divide, and assignment

Page 6: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Union ()

• Consider the two following relations:

– Representations of sales person

– SALES_SUBORD subordinate salespeople– SALES_MGR sales managers

• Combine data from two relations

Page 7: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

SALES_SUBORD

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

10 Rodney Jones 27 Chicago 10

14 Masaji Matsu 44 Tokyo 11

23 Francois Moire 35 Brussels 9

37 Elena Hermana 12 B. A. 13

39 Goro Azuma 44 Tokyo 10

44 Albert Ige 27 Tokyo 12

35 Brigit Bovary 27 Brussels 11

12 Buster Sanchez 27 B. A. 10

SALES_MGR

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

27 Terry Cardon Chicago 15

44 Albert Ige 27 Tokyo 12

35 Brigit Bovary 27 Brussels 11

12 Buster Sanchez 27 B. A. 10

Page 8: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

If A = SALES_SUBORD and B = SALES_MGR

What is A B?

A B

Page 9: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

SALESPERSON

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

10 Rodney Jones 27 Chicago 10

14 Masaji Matsu 44 Tokyo 11

23 Francois Moire 35 Brussels 9

37 Elena Hermana 12 B. A. 13

39 Goro Azuma 44 Tokyo 10

27 Terry Cardon Chicago 15

44 Albert Ige 27 Tokyo 12

35 Brigit Bovary 27 Brussels 11

12 Buster Sanchez 27 B. A. 10

• Union compatible

• SALESPERSON:= SALES_SUBORD SALES_MGR

Page 10: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Intersection ()

• Using the original two union-compatible tables

• Allows the identification of rows that are common to two relations

Page 11: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

If A = SALES_SUBORD and B = SALES_MGR

What is A B?

A B

Page 12: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

SALES_SUBORD_MGR

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

44 Albert Ige 27 Tokyo 12

35 Brigit Bovary 27 Brussels 11

12 Buster Sanchez 27 B. A. 10

• If C is the intersection of A and B, C := A B

• What does C consist of?

Page 13: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Difference (–)

• Again using the original two-union compatible tables

• SALES_MGR_MGR := SALES_ MGR – SALES_ SUBORD

• Creates the set difference of two union-compatible relations

Page 14: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

If A = SALES_SUBORD and B = SALES_MGR

What is A - B?

A B

Page 15: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

SALES_MGR_MGR

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

27 Terry Cardon Chicago 15

• If C is the difference between A and B, C := A – B

• Then?

Page 16: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Consider SALES_SUBORD – SALES_ MGR

• What does this give?

• Note:

A – B B – A

A B

Page 17: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Product ()

• Creates the Cartesian product of two relations

C := A B

X Y W Z

10 22 33 54

10 22 37 98

10 22 42 100

11 25 33 54

11 25 37 98

11 25 42 100

A

X Y

10 22

11 25

B

W Z

33 54

37 98

42 100

Page 18: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Consider the following example of a PRODUCT table and a SALE table

• Want PS := PRODUCT SALE

• Note:– Adjoins the attributes of the two relations

– Attaches to each row in A, each of the rows in B

Page 19: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

PRODUCT

PROD ID PROD DESC MANUFACTR ID

COST PRICE

1035 Sweater 210 12.50 20.00

2241 Table Lamp 317 22.50 32.50

2249 Table Lamp 317 35.50 48.00

2518 Brass Sculpture 253 11.00. 22.00

SALE

DATE CUST ID SALPERS ID PROD ID QTY

02/28 100 10 2241 200

02/12 101 23 2518 300

02/15 101 23 1035 150

02/19 100 39 2518 200

02/02 101 23 1035 200

02/05 105 10 2241 100

02/22 110 37 2518 150

02/14 105 10 2249 50

02/01 101 23 2249 75

02/04 101 23 2241 250

Page 20: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Any problems?

• Any obvious application for a product operation?

• How many columns and rows would PS have?

Page 21: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

PROD ID

PROD DESC MANUFACTR ID

COST PRICE

1035 Sweater 210 12.50 20.00

1035 Sweater 210 12.50 20.00

1035 Sweater 210 12.50 20.00

2241 Table Lamp 317 22.50 32.50

2241 Table Lamp 317 22.50 32.50

2241 Table Lamp 317 22.50 32.50

DATE CUST ID SALPERS ID

PROD ID

QTY

02/28 100 10 2241 200

02/12 101 23 2518 300

02/15 101 23 1035 150

02/28 100 10 2241 200

02/12 101 23 2518 300

02/15 101 23 1035 150

Page 22: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Select

• Query:– Give all information for salespeople in the Tokyo office.

• SALES_TOKYO := SELECT (SALESPERSON: OFFICE = ‘TOKYO’)

• Creates a relation from another relation by selecting only those rows that satisfy a given condition

Page 23: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

SALES_TOKYO

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

14 Masaji Matsu 44 Tokyo 11

39 Goro Azuma 44 Tokyo 10

44 Albert Ige 27 Tokyo 12

• Essentially an ‘IF’ statement

• Other examples of a SELECT– SALPERS_ID = 23– SALPERS_NAME = ‘Brigit Bovary’– MANAGER_ID >= 20– OFFICE not = ‘B.A.’– COMM_% < 11

Page 24: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Query:– Which salesperson has ID 23?

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

23 Francois Moire 35 Brussels 9

Solution: SELECT (SALESPERSON: SALPERS_ID = 23)

• Query:– Give all information about salesperson Brigit Bovary

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

35 Brigit Bovary 27 Brussels 11

Solution: SELECT (SALESPERSON: SALPERS_NAME = ‘Brigit Bovary’)

Page 25: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Query:– Who are the sales people working for managers having an ID greater

than 20?

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

10 Rodney Jones 27 Chicago 10

14 Masaji Matsu 44 Tokyo 11

23 Francois Moire 35 Brussels 9

39 Goro Azuma 44 Tokyo 10

44 Albert Ige 27 Tokyo 12

35 Brigit Bovary 27 Brussels 11

12 Buster Sanchez 27 B. A. 10

Solution: SELECT (SALESPERSON:MANAGER_ID > 20)

Page 26: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Query:– Give information on all salespeople except those in the Buenos Aries

office.

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

10 Rodney Jones 27 Chicago 10

14 Masaji Matsu 44 Tokyo 11

23 Francois Moire 35 Brussels 9

39 Goro Azuma 44 Tokyo 10

27 Terry Cardon Chicago 15

44 Albert Ige 27 Tokyo 12

35 Brigit Bovary 27 Brussels 11

Solution: SELECT (SALESPERSON:OFFICE not = ‘B.A.’)

Page 27: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Query:– Who are the salespeople in Tokyo getting more thn 10% commission?

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

14 Masaji Matsu 44 Tokyo 11

44 Albert Ige 27 Tokyo 12

Solution: SELECT (SALESPERSON:OFFICE = ‘Tokyo’ and COMM_% > 10)

Page 28: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Query:– Who is reporting to manager 27 or getting over 10% commission?

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

10 Rodney Jones 27 Chicago 10

14 Masaji Matsu 44 Tokyo 11

37 Elena Hermana 12 B. A. 13

27 Terry Cardon Chicago 15

44 Albert Ige 27 Tokyo 12

35 Brigit Bovary 27 Brussels 11

12 Buster Sanchez 27 B. A. 10

Solution: SELECT (SALESPERSON:MANAGER_ID = 27 or COMM_% > 10)

Page 29: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Project

• A projection

• Used to eliminate unwanted columns

SALES_TOKYO

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

14 Masaji Matsu 44 Tokyo 11

39 Goro Azuma 44 Tokyo 10

44 Albert Ige 27 Tokyo 12

• Who are the salespeople in the Tokyo office?

Page 30: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Solution: SALES_TOKYO [SALPERS_NAME]

SALPERS NAME

Masaji Matsu

Goro Azuma

Albert Ige

• You can project any number of columns

E.g. SALES_TOKYO [SALESPERS_ID, SALPERS_NAME, MANAGER_ID]

SALPERS ID

SALPERS NAME MANAGER ID

14 Masaji Matsu 44

39 Goro Azuma 44

44 Albert Ige 27

Page 31: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Another Example: SALESPERSON [COMM_%]

COMM %

10

11

9

13

15

12

• What is noticeable in this new relation?

Page 32: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Query:– Which salespeople are getting less than 11% commission?

SALPERS NAME

Rodney Jones

Francois Moire

Goro Azuma

Buster Sanchez

Solution: SELECT (SALESPERSON:COMM% < 11) [SALPERS_NAME]

Page 33: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Join

• The most important function in any database language

• Natural join– Connects relations when common columns have equal values

• Used to connect data across a number of relations

Page 34: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

CUSTOMER

CUST ID

CUST NAME ADDRESS COUNTRY BEGINNING

BALANCE

CURRENT

BALANCE

100 Watabe Bros Box 241, Tokyo Japan 45,551 52,113

101 Maltzl Salzburg Austria 75,314 77,200

105 Jefferson Box 918, Chicago USA 49,333 57,811

110 Gomez Santiago Chile 27,400 35,414

An example:

Logical connections between SALESPERSON, PRODUCT, and CUSTOMER

•Query:–What are the names of the customers who have made purchases from salesperson 10?

Page 35: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

SALE

DATE CUST ID SALPERS ID PROD ID QTY

02/28 100 10 2241 200

02/12 101 23 2518 300

02/15 101 23 1035 150

02/19 100 39 2518 200

02/02 101 23 1035 200

02/05 105 10 2241 100

02/22 110 37 2518 150

02/14 105 10 2249 50

02/01 101 23 2249 75

02/04 101 23 2241 250

•The SALE relation

Page 36: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Solution:1. Select from SALES those sales belonging to salesperson 10

SALE _SP10

DATE CUST ID SALPERS ID PROD ID QTY

02/28 100 10 2241 200

02/05 105 10 2241 100

02/14 105 10 2249 50

2. Join SALE_SP10 and CUSTOMER

Syntax JOIN (SALE_SP10, CUSTOMER)

• The operation proceeds as follows

Page 37: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

1. The product of SALE_SP10 and CUSTOMER is created

SALE _SP10

DATE CUST ID SALPERS ID PROD ID QTY

02/28 100 10 2241 200

02/05 105 10 2241 100

02/14 105 10 2249 50

CUSTOMER

CUST ID

CUST NAME ADDRESS COUNTRY BEGINNING

BALANCE

CURRENT

BALANCE

100 Watabe Bros Box 241, Tokyo Japan 45,551 52,113

101 Maltzl Salzburg Austria 75,314 77,200

105 Jefferson Box 918, Chicago USA 49,333 57,811

110 Gomez Santiago Chile 27,400 35,414

• Want SALE_SP10 CUSTOMER

Page 38: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

DATE CUST ID

SALPERS ID

PROD ID

QTY

02/28 100 10 2241 200

02/28 100 10 2241 200

02/28 100 10 2241 200

02/28 100 10 2241 200

02/05 105 10 2241 100

02/05 105 10 2241 100

02/05 105 10 2241 100

02/05 105 10 2241 100

02/14 105 10 2249 50

02/14 105 10 2249 50

02/14 105 10 2249 50

02/14 105 10 2249 50

CUST ID

CUST NAME

ADDRESS COUNTRY BEGINNING

BALANCE

CURRENT

BALANCE

100 Watabe Bros Box 241, Tokyo Japan 45,551 52,113

101 Maltzl Salzburg Austria 75,314 77,200

105 Jefferson Box 918, Chicago USA 49,333 57,811

110 Gomez Santiago Chile 27,400 35,414

100 Watabe Bros Box 241, Tokyo Japan 45,551 52,113

101 Maltzl Salzburg Austria 75,314 77,200

105 Jefferson Box 918, Chicago USA 49,333 57,811

110 Gomez Santiago Chile 27,400 35,414

100 Watabe Bros Box 241, Tokyo Japan 45,551 52,113

101 Maltzl Salzburg Austria 75,314 77,200

105 Jefferson Box 918, Chicago USA 49,333 57,811

110 Gomez Santiago Chile 27,400 35,414

The product of SALE_SP10 and CUSTOMER

Page 39: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

DATE CUST ID

SALPERS ID

PROD ID

QTY

02/28 100 10 2241 200

02/28 100 10 2241 200

02/28 100 10 2241 200

02/28 100 10 2241 200

02/05 105 10 2241 100

02/05 105 10 2241 100

02/05 105 10 2241 100

02/05 105 10 2241 100

02/14 105 10 2249 50

02/14 105 10 2249 50

02/14 105 10 2249 50

02/14 105 10 2249 50

CUST ID

CUST NAME

ADDRESS COUNTRY BEGINNING

BALANCE

CURRENT

BALANCE

100 Watabe Bros Box 241, Tokyo Japan 45,551 52,113

101 Maltzl Salzburg Austria 75,314 77,200

105 Jefferson Box 918, Chicago USA 49,333 57,811

110 Gomez Santiago Chile 27,400 35,414

100 Watabe Bros Box 241, Tokyo Japan 45,551 52,113

101 Maltzl Salzburg Austria 75,314 77,200

105 Jefferson Box 918, Chicago USA 49,333 57,811

110 Gomez Santiago Chile 27,400 35,414

100 Watabe Bros Box 241, Tokyo Japan 45,551 52,113

101 Maltzl Salzburg Austria 75,314 77,200

105 Jefferson Box 918, Chicago USA 49,333 57,811

110 Gomez Santiago Chile 27,400 35,414

Notice that the customer id’s do not match. Thus these are invalid rows and must be deleted. This leads to step 2.

Page 40: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

2. Eliminate all rows except those in which CUST_ID from SALE_SP10 is equal to CUST_ID from CUSTOMER

DATE CUST ID

SALPERS ID

PROD ID

QTY

02/28 100 10 2241 200

02/05 105 10 2241 100

02/14 105 10 2249 50

CUST ID

CUST NAME

ADDRESS COUNTRY BEGINNING

BALANCE

CURRENT

BALANCE

100 Watabe Bros Box 241, Tokyo Japan 45,551 52,113

105 Jefferson Box 918, Chicago USA 49,333 57,811

105 Jefferson Box 918, Chicago USA 49,333 57,811

Page 41: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

3. Eliminate one of the CUST_ID columns from this new relation

DATE CUST ID

SALPERS ID

PROD ID

QTY

02/28 100 10 2241 200

02/05 105 10 2241 100

02/14 105 10 2249 50

CUST NAME

ADDRESS COUNTRY BEGINNING

BALANCE

CURRENT

BALANCE

Watabe Bros Box 241, Tokyo Japan 45,551 52,113

Jefferson Box 918, Chicago USA 49,333 57,811

Jefferson Box 918, Chicago USA 49,333 57,811

Page 42: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

1. Take the product of A and B. The resulting relation will have two columns for each of C1 ….. Cn

2. Eliminate all rows from the product except those on which the values of the columns C1 ….. Cn in A are equal, respectively, to the values of those in columns in B.

3. Eliminate duplicate columns.

4. Project out one copy of the columns C1 ….. Cn

• General definition of a natural join of two relations A and B which have columns C1 ….. Cn in common.

Page 43: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

•Query:–What is the name of the customer involved in each sale?

Solution: A := CUSTOMER [CUST_ID, CUST_NAME]

B := JOIN (SALE, A)

B

DATE SALPERS ID PROD ID QTY CUST ID CUST NAME

02/28 10 2241 200 100 Watabe Bros

02/12 23 2518 300 101 Maltzl

02/15 23 1035 150 101 Maltzl

02/19 39 2518 200 100 Watabe Bros

02/02 23 1035 200 101 Maltzl

02/05 10 2241 100 105 Jefferson

02/22 37 2518 150 110 Gomez

02/14 10 2249 50 105 Jefferson

02/01 23 2249 75 101 Maltzl

02/04 23 2241 250 101 Maltzl

Page 44: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

•Query:

Gives the names of customers who have purchased product 2518.

Solution: A := SELECT (SALE: PROD_ID = 2518)

B := JOIN (A, CUSTOMER) [CUST_NAME]

B

CUST NAME

Maltzl

Watabe Bros

Gomez

Page 45: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

•Query:

Who has bought table lamps?

Solution:

A := SELECT (PRODUCT: PROD_DESC = ‘Table Lamp’)

B := JOIN (A, SALE)

C := JOIN (B, CUSTOMER) [CUST_NAME]

C

CUST NAME

Watabe Bros

Jefferson

Maltzl

Page 46: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

MANUFACTURER

MANUFACTR ID MANUFACTR NAME

ADDRESS COUNTRY

210 Kiwi Klothes Auckland New Zealand

253 Brass Works Lagos Nigeria

317 Llama Llamps Lima Peru

• Add another table: The manufacturer of the products

Page 47: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

•Query:

Which salespeople have sold products manufactured in Peru?

Solution:

A := SELECT (MANUFACTURER: COUNTRY = ‘Peru’)

B := JOIN (A, PRODUCT)

C := JOIN (B, SALE)

D := JOIN (C, SALESPERSON) [SALPERS_NAME]

D

SALPERS NAME

Rodney Jones

Francois Moire

Page 48: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Theta Join

– Query: Identify salespeople whose manager gets a commission rate exceeding 11%.

• Consider the following:

SALESPERSON

SALPERS ID

SALPERS NAME MANAGER ID OFFICE COMM %

10 Rodney Jones 27 Chicago 10

14 Masaji Matsu 44 Tokyo 11

23 Francois Moire 35 Brussels 9

37 Elena Hermana 12 B. A. 13

39 Goro Azuma 44 Tokyo 10

27 Terry Cardon Chicago 15

44 Albert Ige 27 Tokyo 12

35 Brigit Bovary 27 Brussels 11

12 Buster Sanchez 27 B. A. 10

Page 49: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Cannot be solved by a simple select operation

• Have to join a relation to itself. How?

• Have to match SALPERS_ID to MANAGER_ID

Page 50: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Create two copies of SALESPERSON– SP1 := SALESPERSON– SP2 := SALESPERSON

• Then:A := JOIN (SP1, SP2: SP1.MANAGER_ID = SP2.SALESPERS_ID)

• Have a new version of a JOIN– The Theta Join

Page 51: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• The result

SP1.

SALPERS ID

SP1.SALPERS NAME

SP1.

MANAGER ID

SP1.

OFFICE

SP1.

COMM %

10 Rodney Jones 27 Chicago 10

14 Masaji Matsu 44 Tokyo 11

23 Francois Moire 35 Brussels 9

37 Elena Hermana 12 B. A. 13

39 Goro Azuma 44 Tokyo 10

44 Albert Ige 27 Tokyo 12

35 Brigit Bovary 27 Brussels 11

12 Buster Sanchez 27 B. A. 10

SP2.

SALPERS ID

SP2.SALPERS NAME

SP2.

MANAGER ID

SP2.

OFFICE

SP2.

COMM %

27 Terry Cardon Chicago 15

44 Albert Ige 27 Tokyo 12

35 Brigit Bovary 27 Brussels 11

12 Buster Sanchez 27 B. A. 10

44 Albert Ige 27 Tokyo 12

27 Terry Cardon Chicago 15

27 Terry Cardon Chicago 15

27 Terry Cardon Chicago 15

A

Page 52: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Whose manager gets a commission rate exceeding 11%?B := SELECT (A: SP2.COMM_% > 11) [SP1.SALPERS_NAME]

• Not yet finished

B

SP1.SALPERS NAME

Rodney Jones

Masaji Matsu

Goro Azuma

Albert Ige

Brigit Bovary

Buster Sanchez

Page 53: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• The comparison operators:=, not =, <, >, <=, >=

• General form:– JOIN (A, B: X δ Y)

• The Theta Join is a join with a specified condition involving a column from each relation

Page 54: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Divide

• Query:– List salespeople who have sold every product

• Thus for each product there must be at least one row in SALE containing the SALE_ID of that salesperson

• Selects rows in one relation that match every row in another relation

Page 55: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Different requirement here

• Have to look at the whole relation

• Key word is every

Page 56: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Using the example relations

• Obtain a relation consisting of all product ids– PI := PRODUCT [PROD_ID]

• Need instances of salesperson and product– PISI := SALE [PROD_ID, SALPERS_ID]

Page 57: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• Result: Two projections

PI

PROD ID

1035

2241

2249

2518

PISI

SALPERS

ID

PROD ID

10 2241

23 2518

23 1035

39 2518

37 2518

10 2249

23 2249

23 2241

Page 58: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• A := PISI / PI

• What do we want from these relations now?

A

SALPERS

ID

23

Page 59: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

• General Description of a Divide

• Assume A, B, and C are relations where B/C = A

1. Columns of C must be a subset of B

2. The columns of A are all and only those columns of B that are not columns of C

Page 60: RELATIONAL ALGEBRA SECTION 5 An Introduction. Introduction E.F. Codd in 1970 A revolutionary advance in data manipulation Very significant. Why?

Assignment

• Have been using it all the time

A := SELECT (SALESPERSON: COMM_% > 11)

• An operation that gives a name to a relation