Translation of ERD to Relational Scheme, Relational Algebra Prof. Sin-Min LEE Department of Computer...

Post on 26-Dec-2015

223 views 0 download

Tags:

Transcript of Translation of ERD to Relational Scheme, Relational Algebra Prof. Sin-Min LEE Department of Computer...

Translation of ERD to Relational Scheme,Relational Algebra

Prof. Sin-Min LEE

Department of Computer Science

•Relation schema–Named relation defined by a set of attribute and domain name pairs.

•Relational database schema–Set of relation schemas, each with a distinct name.

•Each tuple is distinct; there are no duplicate tuples.•Order of attributes has no significance.•Order of tuples has no significance, theoretically.•Relation name is distinct from all other relation names in relational schema.•Each cell of relation contains exactly one atomic (single) value.•Each attribute has a distinct name.•Values of an attribute are all from the same domain.

Database Scheme

A relational database scheme, or schema, corresponds to a set of table definitions.

Eg: product(p_id, name, category, description) supply(p_id, s_id, qnty_per_month) supplier(s_id, name, address, ph#)

* remember the difference between a DB instance and a DB scheme.

SAMPLE SCHEMAS AND INSTANCESThe Schemas:

Sailors(sid: integer, sname: string, rating: integer, age: real)

Boats(bid: integer, bname: string, color: string)Reserves(sid: integer, bid: integer, day: date)

The Instances:

The entity-relationship model

The entity-relationship (ER) data model was developed by Peter Chen in the 1970s.

As its name implies, the ER model provides a systematic representation of the application entities and their relationships.

ER diagrams provide a notation for documenting a tentative database design. You capture the important application

The entity-relationship model

features with ER diagrams, which you then translate into a specific database schema.

ER notation represents application entities as rectangles, surrounded by their attributes. An alternative format lists the attributes inside the rectangle, with certain conventions for special kinds of attributes. For example, a key attribute is underlined,

The entity-relationship model

and a derived attribute is marked with an asterisk. ER diagrams express relationships as diamonds with connecting lines to the participating entities. The lines can be single, indicating partial participation, or double, indicating total participation of the connected entity.

The entity-relationship model is useful because, it facilitates communication between the database designer and the end user during the requirements analysis. To facilitate such communication the model has to be simple and readable for the database designer as well as the end user. It enables an abstract global view (that is, the enterprise conceptual schema) of

the enterprise to be developed without any consideration of efficiency of the ultimate database.

The entity-relationship model

A weak entity concept for a situation where the entity’s objects aren’t meaningful as isolated in the database. It is meaningful only when related to a dominant object, through a many-to-one or one-to-one relationship. The participation of a weak entity must be total.

Class hierarchies and inheritance

Although the original entity-relationship model didn’t include class hierarchies and lattices, later enhancement added these facilities. These features are useful for database deployment to an object-oriented setting.

A superclass-subclass arrangement among the application entities captures more of the application’s meaning.

Class specialization and generalization

The process that derives subclasses from a superclass is specialization.

The process that abstracts from a collection of specialized subclasses to a superclass is generalization.

Passing from a superclass to a subclass is a specialization to a more detailed description. Rising from a subclass to a

Class specialization and generalization

superclass is a generalization to a more generic description.

Subclasses inherit the attributes and operations of their superclasses. A subclass specialization can be total, meaning that you always view a superclass object as a generalized view of one or more subclasses. Or it can be partial, meaning that a

Class specialization and generalization

superclass object can exist without further specialization. To indicate total specialization, you double the connecting line impinging on the superclass.

Overlapping subclasses and multiple inheritance

Independent of its total-partial status, a specialization can be disjoint or overlapping.

In a disjoint specialization, an object that is further classified must belong to exactly one of the subclasses.

Overlapping specialization occurs when an object belongs simultaneously to two or

Overlapping subclasses and multiple inheritance

more subclasses.A thorny issue, which causes difficulties

when mapping an ER diagram to a database schema, is multiple inheritance.

In multiple inheritance, a class inherits attributes from several superclasses. Multiple inheritance and overlapping specialization frequently occur together

Overlapping subclasses and multiple inheritance

because the subclasses of the overlapping specialization can rejoin deeper in the lattice in a multiple inheritance situation.

Mapping from ER diagrams to database schemas

You can map an ER diagram to any of the five database schemas (relational, network, hierarchical, object-oriented, and deductive database).

However, the mappings frequently compromise distinctions documented in the diagram. ER-to-relational mapping occurs most frequently.

Mapping from ER diagrams to a relational schema

Transforming an ER diagram into a relational database schema proceeds as follows:- You examine each attribute, x, of each entity, E, for the following special cases

- If x is a composite attribute, you flatten it. You retain only the leaf entries of the hierarchical structure.

- If x is a derived attribute, you remove it. You can make it available through a view after

Mapping from ER diagrams to a relational schema

you have established the database schema.

- If x is a multivalued attribute, you create a new weak entity, E x, and a one-to-many relationship between E and E x. You move the multivalued attribute to E x.- You decompose many-to-many and higher-degree relationships into collections of one-to-many relationships. You transfer any relationship attributes to the intersection entity.

Mapping from ER diagrams to a relational schema

- You fold any relationship attributes attached to one-to-many relationships into the dependent entity.- You give key attributes to all dominant entities if they aren’t already present.- You give a foreign key to the dependent entity of each one-to-many relationship; you use the same name as the key in its dominant partner.

Mapping from ER diagrams to a relational schema

- You prepare a create-table SQL statement for each entity of the transformed diagram and use the appropriate clauses to enforce key constraints, referential integrity, total participation, and weak entities.- Anticipating frequent joins across the established relationships, you index all primary and foreign keys--if the database product allows explicit index creation.

Mapping from ER diagrams to a relational schema

In summary, you translate an ER diagram into a relational schema by: 1. Transforming multivalued and composite attributes.2. Decomposing many-to-many and higher-degree relationships.3. Introducing keys and foreign keys. Each entity then becomes a relation.

Mapping from ER diagrams to a relational schema

So, if there is no subclass structure, the mapping proceeds by flattening composite attributes, removing derived attributes, exporting multivalued attributes to weak entities, and factoring all many-to-many and higher-degree relationships through an intersection entity. You then transfer relationship attributes to the intersection entity, or you fold them into the dependent side of a one-to-X relationship.

Mapping from ER diagrams to a relational schema

All dominant entities receive primary keys, and all dependent entities receive foreign keys of the same name. Each entity then becomes a relation.

If subclass specializations are present, you must remove them before the transformation.Some methods are available to reduce superclass-subclass specializations from an ER diagram before mapping to a relational schema.

Mapping from ER diagrams to a relational schema

1. The most semantically damaging approach collapses a specialization into the superclass, providing slots for all possible attributes and flags to mark the suppressed subclasses.2. A less drastic alternative repeats the superclass attributes in the subclasses, but this approach is available only when the specialization is total.

Mapping from ER diagrams to a relational schema

3. A final possibility converts subclasses to full-fledged entities and connects them to the superclass with restricted relationships.

- All three techniques share the disadvantage of masking semantic nuances in the ER diagram.

What is Relational Algebra?

Relational algebra is a procedural query language.

It consists of the select, project, union, set difference, Cartesian product, and rename operations.

Set intersection, division, natural join, and assignment combine the fundamental operations.

SQL is based on relational algebra

Query languages

procedural vs. non-proceduralcommercial languages have some

of bothwe will study:

relational algebra (which is procedural, i.e. tells you how to process a query)

relational calculus (which is non-procedural i.e. tells what you want)

Introduction

one of the two formal query languages of the relational model

collection of operators for manipulating relations

Operators: two types of operators Set Operators: Union(),Intersection(),

Difference(-), Cartesian Product (x) New Operators: Select (), Project (),

Join (⋈)

Introduction – cont’d

A Relational Algebra Expression: a sequence of relational algebra operators and operands (relations), formed according to a set of rules.

The result of evaluating a relational algebra expression is a relation.

Selection

Denoted by c(R)

Selects the tuples (rows) from a relation R that satisfy a certain selection condition c.

It is a unary operatorThe resulting relation has the same

attributes as those in R.

Example 1:

SNO SNAME AGE STATE

S1 MIKE 21 IL

S2 STEVE 20 LA

S3 MARY 18 CA

S4 MING 19 NY

S5 OLGA 21 NY

S:

state=‘IL’(S)

Example 2:

CNOCNAME

CREDIT

DEPT

C1Database

3 CS

C2Statistics

3 MATH

C3 Tennis 1SPORTS

C4 Violin 4 MUSIC

C5 Golf 2SPORTS

C6 Piano 5 MUSIC

C:

CREDIT 3(C)

Example 3

SNO CNO Grade

S1 C1 90

S1 C2 80

S1 C3 75

S1 C4 70

S1 C5 100

S1 C6 60

S2 C1 90

S2 C2 80

S3 C2 90

S4 C2 80

S4 C4 85

S4 C5 100

E:

SNO=‘S1’and CNO=‘C1’(E)

Selection - Properties

Selection Operator is commutative C1(C2 (R)) = C2(C1 (R))

The Selection is an unary operator, it cannot be used to select tuples from more than one relations.

Projection

Denoted by L(R), where L is list of attribute names and R is a relation name or some other relational algebra expression.

The resulting relation has only those attributes of R specified in L.

The projection is also an unary operation.  Duplicate rows are not permitted in relational

algebra. Duplication is removed from the result.

Duplicate rows can occur in SQL, though they may be controlled by explicit keywords.

Projection - Example

Example 1: STATE (S)

SNO SNAME AGE STATE

S1 MIKE 21 IL

S2 STEVE 20 LA

S3 MARY 18 CA

S4 MING 19 NY

S5 OLGA 21 NY

STATE

IL

LA

CA

NY

Projection - Example

Example 2: CNAME, DEPT(C)

CNOCNAME

CREDIT

DEPT

C1Database

3 CS

C2Statistics

3 MATH

C3 Tennis 1SPORTS

C4 Violin 4 MUSIC

C5 Golf 2SPORTS

C6 Piano 5 MUSIC

CNAME

DEPT

Database

CS

Statistics

MATH

TennisSPORTS

Violin MUSIC

GolfSPORTS

Piano MUSIC

Projection - Example

Example 3: S#(STATE=‘NY'(S))

SNO SNAME AGE STATE

S1 MIKE 21 IL

S2 STEVE 20 LA

S3 MARY 18 CA

S4 MING 19 NY

S5 OLGA 21 NY

SNO

S4

S5

SET Operations

UNION: R1 R2

INTERSECTION: R1 R2

DIFFERENCE: R1 - R2

CARTESIAN PRODUCT: R1 R2

Union Compatibility

For operators , , -, the operand relations R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) must have the same number of attributes, and the domains of the corresponding attributes must be compatible; that is, dom(Ai)=dom(Bi) for i=1,2,...,n.

The resulting relation for , , or - has the same attribute names as the first operand relation R1 (by convention).

Union Compatibility - Examples

Are S(SNO, SNAME, AGE, STATE) and C(CNO, CNAME, CREDIT, DEPT) union compatible?

Are S(S#, SNAME, AGE, STATE) and C(CNO, CNAME, CREDIT_HOURS, DEPT_NAME) union compatible?

UNION, SET DIFFERENCE & SET INTERSECT

Union puts all tuples of two relations in one relation. To use this operator, two conditions must hold:1. The two relations must be of the same arity.2. The domain of ith attribute of the two participating relation must

be the same.

Set difference operator computes tuples that are in one relation, but not in another.

Set intersect operator computes tuples that are common in two relations:

The five fundamental operations of the relational algebra are: select, project, cartesian product, Union, and set difference

All other operators can be constructed using these operators

EXAMPLE

Assume a database with the following three relations:Sailors (sid, sname, rating)Boats (bid, bname, color)Reserve (sid, bid, date)

Query 1: Find the bid of red colored boats:

EXAMPLE

Assume a database with the following three relations:Sailors (sid, sname, rating)Boats (bid, bname, color)Reserve (sid, bid, date)

Query 1: Find the bid of red colored boats: ∏bid(бcolor=red(Boats))

EXAMPLE

Assume a database with the following three relations:Sailors (sid, sname, rating)Boats (bid, bname, color)Reserve (sid, bid, date)

Query 1: Find the name of sailors who have reserved Boat number 2.

EXAMPLE

Assume a database with the following three relations:Sailors (sid, sname, rating)Boats (bid, bname, color)Reserve (sid, bid, date)

Query 1: Find the name of sailors who have reserved Boat number 2. ∏sname(бbid=2(Sailors (sid)Reserve))

EXAMPLE

Assume a database with the following three relations:Sailors (sid, sname, rating)Boats (bid, bname, color)Reserve (sid, bid, date)

Query 1: Find the name of sailors who have reserved both a red and a green boat.

Union, Intersection, Difference

T= R U S : A tuple t is in relation T if and only if t is in relation R or t is in relation S

T = R S: A tuple t is in relation T if and only if t is in both relations R and S

T= R - S :A tuple t is in relation T if and only if t is in R but not in S

Set-Intersection

Denoted by the symbol . Results in a relation that contains

only the tuples that appear in both relations.

R S = R – (R – S)Since set-intersection can be written

in terms of set-difference, it is not a fundamental operation.

Examples

A1 A2

1 Red

3 White

4 green

B1 B2

3 White

2 Blue

R S

Examples

A1 A2

1 Red

3 White

4 Green

2 Blue

A1 A2

3 White

R SR S

S - R

B1 B2

2 Blue

A1 A2

1 Red

4 Green

R - S

RENAME OPERATOR

Rename operator changes the name of its input table to its subscript, ρe2(Emp) Changes the name of Emp table to e2

EXAMPLE

Emp table:

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

EXAMPLEEmp table:

бSalary=30,000(Employee)

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

EXAMPLEEmp table:

бSalary=30,000(Employee)

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

SS# Name Age Salary dno

4 Kathy 30 30000 5

EXAMPLEEmp table:

бAge>22(Employee)

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

EXAMPLEEmp table:

бAge>22(Employee)

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

SS# Name Age Salary dno

1 Joe 24 20000 2

4 Kathy 30 30000 5

PROJECT OPERATOR

Project (∏) retrieves a column. It is a unary operator that eliminate duplicates.

e.g., name of employees: ∏ name(Employee)

e.g., name of employees earning more than 30,000:

∏ name(бSalary>30,000(Employee))

EXAMPLE

Emp table:

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

EXAMPLEEmp table:

∏ age(Emp)

SS# Name Age Salary

dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

Age

24

20

22

30

4

EXAMPLEEmp table:

∏ name,age(бSalary=4000 (Emp) )

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

EXAMPLEEmp table:

∏ name,age(бSalary=4000 (Emp) )

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

SS# Name Age Salary dno

5 Shideh 4 4000 1

EXAMPLEEmp table:

∏ name,age(бSalary=4000 (Emp) )

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

Name Age

Shideh 4