Ch 4: Relational Database Design

Page 1: Ch 4: Relational Database Design

Ch 4: Relational Database Design

Page 2: Ch 4: Relational Database Design

4.1 Features of Good Relational Designs

Four Informal measures

•Semantics of the relation attributes

•Reducing the redundant values in tuples.

•Reducing the null values

•Disallowing the possibility of generating spurious (wrong) tuples.

Page 3: Ch 4: Relational Database Design

1. Semantics of the relation attributes

• Design a relation schema so that its meaning is easy to explain. Do not combine attributes from multiple entity types and relationship types into a single relation.

• In general, the easier it is to explain the semantics of a relation, the better the relation schema design.

Page 4: Ch 4: Relational Database Design

2. Redundant Information in Tuples & Update Anomalies

• One goal of schema design is to minimize the storage space used by the relations.

• Grouping attributes into relation schemas has a significant effect on storage space.

• For example, combining Employee with Department, and Works_on with Project, results in the relations EMP_DEPT and EMP_PRJ.

• The resulting relations repeat several values in many tuples, which leads to higher storage use.

• Another serious problem is update anomalies, which are classified as insertion, deletion and modification anomalies.

Page 5: Ch 4: Relational Database Design

Insertion Anomalies

• To insert a new employee into EMP_DEPT, we must either include the values for the department the employee works for or place NULLs (if the employee does not work for a department yet). The department values must be entered consistently with the other tuples of that department, otherwise consistency problems occur.

• It is difficult to enter a new department that has no employees yet, because we cannot insert NULL into ENO, which is the primary key.

Page 6: Ch 4: Relational Database Design

Deletion Anomalies

• If we delete from EMP_DEPT the tuple of an employee who happens to be the last employee in a department, the information about that department is also lost from the database.

Page 7: Ch 4: Relational Database Design

Modification Anomalies

• If we change the value of one attribute, say the manager of department 5, we must change it in every tuple where the department number is 5; otherwise the database becomes inconsistent.

• Design the database so that no insertion, deletion or modification anomalies are present.
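The modification and deletion anomalies can be illustrated with a small sketch. The EMP_DEPT rows, the column order and the helper name `managers_by_dept` are assumptions made up for this illustration; they do not come from the slides.

```python
# Hypothetical EMP_DEPT rows: (eno, ename, dno, dname, mgr_eno)
emp_dept = [
    (1, "Asha", 5, "Research", 10),
    (2, "Ravi", 5, "Research", 10),
    (3, "Meena", 4, "Admin", 20),
]

def managers_by_dept(rows):
    """Map each department number to the set of manager values stored for it."""
    result = {}
    for _eno, _ename, dno, _dname, mgr in rows:
        result.setdefault(dno, set()).add(mgr)
    return result

# Modification anomaly: the manager of dept 5 is changed in only one tuple.
partially_updated = [
    (eno, ename, dno, dname, 30 if eno == 1 else mgr)
    for eno, ename, dno, dname, mgr in emp_dept
]
inconsistent = {
    dno for dno, mgrs in managers_by_dept(partially_updated).items() if len(mgrs) > 1
}

# Deletion anomaly: deleting the last employee of dept 4 also removes dept 4.
after_delete = [row for row in emp_dept if row[0] != 3]
remaining_depts = {row[2] for row in after_delete}

print(inconsistent)     # departments that now store conflicting manager values
print(remaining_depts)  # department 4 no longer appears anywhere
```

Storing department data once, in its own relation, removes both problems, because the manager value then exists in exactly one tuple.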

Page 8: Ch 4: Relational Database Design

3. Null values in tuples

• NULLs can have multiple interpretations, such as:

– The attribute does not apply to this tuple.

– The attribute value for this tuple is unknown.

– The value is known but absent, i.e. it has not been recorded yet.

• Problems may occur in JOIN and aggregate operations.
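A short sketch of how NULLs affect aggregates and joins, using Python's built-in `sqlite3` module. The `emp` and `dept` tables and their rows are hypothetical and serve only as an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE emp  (eno INTEGER PRIMARY KEY, ename TEXT, salary INTEGER, dno INTEGER);
    INSERT INTO dept VALUES (5, 'Research');
    INSERT INTO emp VALUES (1, 'Asha', 100, 5);
    INSERT INTO emp VALUES (2, 'Ravi', NULL, 5);
    INSERT INTO emp VALUES (3, 'Meena', 200, NULL);
""")

# Aggregates: COUNT(*) counts rows, while COUNT(salary) and AVG(salary) skip NULLs.
count_all, count_salary, avg_salary = conn.execute(
    "SELECT COUNT(*), COUNT(salary), AVG(salary) FROM emp"
).fetchone()

# Join: the employee whose dno is NULL matches no department and drops out.
joined = conn.execute(
    "SELECT ename FROM emp JOIN dept ON emp.dno = dept.dno ORDER BY eno"
).fetchall()

print(count_all, count_salary, avg_salary)
print(joined)
```

The average is taken over two salaries rather than three, and the inner join silently loses one employee, which is why a design that needs few NULLs is easier to query correctly.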

Page 9: Ch 4: Relational Database Design

4. Generation of Spurious Tuples

• Design relation schemas so that they can be joined with equality conditions on attributes that are primary keys or foreign keys, in a way that guarantees that no spurious tuples are generated.
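A minimal sketch of how a join on a non-key attribute produces spurious tuples. The relation contents and the attribute names (ename, pname, plocation) are assumptions chosen for illustration.

```python
# Hypothetical relation with tuples (ename, pname, plocation)
emp_proj = {
    ("Asha", "ProductX", "Delhi"),
    ("Ravi", "ProductY", "Delhi"),
}

# Decompose on plocation, which is neither a primary key nor a foreign key.
emp_locs = {(ename, ploc) for ename, _pname, ploc in emp_proj}
proj_locs = {(pname, ploc) for _ename, pname, ploc in emp_proj}

# Natural join of the two projections on plocation.
rejoined = {
    (ename, pname, ploc)
    for ename, ploc in emp_locs
    for pname, ploc2 in proj_locs
    if ploc == ploc2
}

# Tuples that appear after the join but were never in the original relation.
spurious = rejoined - emp_proj
print(sorted(spurious))
```

The join returns every original tuple plus combinations that were never recorded, so the original information cannot be recovered from the two projections.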

Page 10: Ch 4: Relational Database Design

4.2 Functional Dependencies (FD)

• A functional dependency, denoted by X → Y (read: X functionally determines Y), between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuples that can form a relation state r of R.

• The constraint is that for all pairs of tuples t1 and t2 in r such that t1[X] = t2[X], they must also have t1[Y] = t2[Y].

• In other words, whenever two tuples of r agree on their X value, they also agree on their Y value.

Page 11: Ch 4: Relational Database Design

Functional Dependency

• Functional dependency is the main concept associated with normalization.

• Functional dependency

– Describes the relationship between attributes in a relation.

– If A and B are attributes of relation R, B is functionally dependent on A (denoted A → B) if each value of A in R is associated with exactly one value of B in R.

Page 12: Ch 4: Relational Database Design

Functional Dependency

• Diagrammatic representation:

The determinant of a functional dependency is the attribute or group of attributes on the left-hand side of the arrow.

Page 13: Ch 4: Relational Database Design

Example - Functional Dependency

Page 14: Ch 4: Relational Database Design

Functional Dependencies (Cont.)

• K is a superkey for relation schema R if and only if K → R

• K is a candidate key for R if and only if

– K → R, and

– for no α ⊂ K, α → R

• Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema:

bor_loan = (customer_id, loan_number, amount).

We expect this functional dependency to hold:

loan_number → amount

but would not expect the following to hold:

amount → customer_name

Page 15: Ch 4: Relational Database Design

Functional Dependencies (Cont.)

• The main use of FDs is to describe a relation schema R further by specifying constraints on its attributes that must hold at all times.

• Certain FDs can be specified without referring to a specific relation:

• {state, driving_licence} → ENO

• {pincode} → area

• {telephone code} → city

Page 16: Ch 4: Relational Database Design

Functional Dependencies (Cont.)

ENO → Ename

Pnumber → {pname, plocation}

{eno, pnumber} → hours

• Eno uniquely determines the employee name.

• Pnumber uniquely determines the project name and location.

• The combination of eno and pnumber uniquely determines the number of hours that the employee has worked on that project.

• FDs play a key role in differentiating a good database design from a bad one.

Page 17: Ch 4: Relational Database Design

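EXAMPLE: TEACH

TEACHER   COURSE      TEXT
GUPTA     CHEMISTRY   SAHANI
GUPTA     MATHS       NAVATHE
KUMAR     BIOLOGY     HOFFMAN
GOYAL     CHEMISTRY   KAHATE

Possible FDs

TEXT → COURSE holds in this state, since every text appears with a single course.

But TEACHER → COURSE is ruled out, because GUPTA appears with both CHEMISTRY and MATHS.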
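The definition of a functional dependency can be checked mechanically against the TEACH state above. The following is a minimal sketch; the function name `fd_holds` and the dictionary representation of tuples are assumptions made for this illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs while differing on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value.
        if seen.setdefault(x, y) != y:
            return False
    return True

teach = [
    {"TEACHER": "GUPTA", "COURSE": "CHEMISTRY", "TEXT": "SAHANI"},
    {"TEACHER": "GUPTA", "COURSE": "MATHS", "TEXT": "NAVATHE"},
    {"TEACHER": "KUMAR", "COURSE": "BIOLOGY", "TEXT": "HOFFMAN"},
    {"TEACHER": "GOYAL", "COURSE": "CHEMISTRY", "TEXT": "KAHATE"},
]

text_to_course = fd_holds(teach, ["TEXT"], ["COURSE"])
teacher_to_course = fd_holds(teach, ["TEACHER"], ["COURSE"])
print(text_to_course, teacher_to_course)
```

A check like this can only refute a dependency. A single relation state that satisfies TEXT → COURSE does not prove that the dependency is a constraint of the schema; that follows from the meaning of the attributes.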

Page 18: Ch 4: Relational Database Design

Use of Functional Dependency

• To test relations to see whether they are legal under a given set of functional dependencies. If a relation r is legal under a set F of FDs, we say that r satisfies F.

• To specify constraints on the set of legal relations. If we wish to constrain ourselves to relations on schema R that satisfy a set F of functional dependencies, we say that F holds on R.

Page 19: Ch 4: Relational Database Design

FD – A Few More Examples

• Suppose one is designing a system to track vehicles and the capacity of their engines. Each vehicle has a unique vehicle identification number (VIN). One would write VIN → EngineCapacity because it would be inappropriate for a vehicle's engine to have more than one capacity. (Assuming, in this case, that vehicles only have one engine.)

• However, EngineCapacity → VIN is incorrect, because there could be many vehicles with the same engine capacity.

Page 20: Ch 4: Relational Database Design

Trivial and Non Trivial FD

• Trivial FD: An FD X → Y is trivial if Y, the right-hand side of the functional dependency, is a subset of X.

• E.g.: the FD {EmpID, EmpAddress} → {EmpAddress} is trivial, as {EmpAddress} is a subset of {EmpID, EmpAddress}.

• Non-trivial FD: An FD X → Y is called non-trivial if Y is not a subset of X.

• E.g.: the FD {EmpID, EmpAddress} → {EmpPhone} is non-trivial, as {EmpPhone} is not a subset of {EmpID, EmpAddress}.

Page 21: Ch 4: Relational Database Design

Closure

• The set of all dependencies that includes F together with all dependencies that can be inferred from F is called the closure of F, denoted by F+.

• F = { ENO → {ENAME, DOB, ADDRESS, DNUMBER},

DNUMBER → {DNAME, MGRNO} }

Some additional FDs that can be inferred from F are:

ENO → {DNAME, MGRNO}

DNUMBER → DNAME

Page 22: Ch 4: Relational Database Design

Inference Rules for FDs (Axioms)

1. Reflexivity: if B is a subset of A, then A → B.

2. Augmentation: if A → B, then AC → BC.

3. Transitivity: if A → B and B → C, then A → C.

4. Self-determination: A → A.

5. Decomposition: if A → BC, then A → B and A → C.

6. Union: if A → B and A → C, then A → BC.

7. Composition: if A → B and C → D, then AC → BD.

8. Pseudo-transitivity: if A → B and γB → C, then Aγ → C.

Page 23: Ch 4: Relational Database Design

ARMSTRONG'S AXIOMS

• The first three rules

1. Reflexivity: if B is a subset of A, then A → B.

2. Augmentation: if A → B, then AC → BC.

3. Transitivity: if A → B and B → C, then A → C.

are sound and complete.

By sound, we mean that given a set F of FDs on a relation schema R, any dependency that can be inferred from F using these rules holds in every relation state of R that satisfies the dependencies in F. The rules do not generate incorrect FDs.

By complete, we mean that applying the three rules repeatedly generates the complete set of all dependencies that can be inferred from F.

Page 24: Ch 4: Relational Database Design

Example: find the closure of F

• R = (A, B, C, G, H, I)

F = { A → B,
      A → C,
      CG → H,
      CG → I,
      B → H }

• Some extra members of F+:

– A → H

• by transitivity from A → B and B → H

– AG → I

• by augmenting A → C with G to get AG → CG, and then transitivity with CG → I

– CG → HI

• by union of CG → I and CG → H

• OR

• by augmenting CG → I to infer CG → CGI, augmenting CG → H to infer CGI → HI, and then transitivity

Page 25: Ch 4: Relational Database Design

Closure of Attribute Sets

• Given a set of attributes α, define the closure of α under F (denoted by α+) as the set of attributes that are functionally determined by α under F.

• Algorithm to compute α+, the closure of α under F:

result := α;
while (changes to result) do
    for each β → γ in F do
    begin
        if β ⊆ result then result := result ∪ γ
    end

Page 26: Ch 4: Relational Database Design

Example of Attribute Set Closure

• R = (A, B, C, G, H, I)

• F = { A → B,
        A → C,
        CG → H,
        CG → I,
        B → H }

• (AG)+

1. result = AG

2. result = ABCG (A → C and A → B)

3. result = ABCGH (CG → H and CG ⊆ AGBC)

4. result = ABCGHI (CG → I and CG ⊆ AGBCH)
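The closure algorithm and the (AG)+ example above can be sketched in Python. This is an illustration under stated assumptions: attribute names are single letters, so a string such as "CG" stands for the set {C, G}, and the function names `closure` and `implies` are made up for this sketch.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs when lhs is inside result and rhs adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """lhs -> rhs is in F+ exactly when rhs is contained in the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)

F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

ag_closure = closure("AG", F)
print("".join(sorted(ag_closure)))
print(implies(F, "A", "H"), implies(F, "AG", "I"), implies(F, "CG", "HI"))
```

Because (AG)+ contains every attribute of R, AG is a superkey of R. The same closure test also decides membership in F+ without enumerating F+, which is how the three derived dependencies from the earlier example are confirmed here.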

Page 27: Ch 4: Relational Database Design

Canonical Cover

• Sets of functional dependencies may have redundant dependencies that can be inferred from the others

– For example: A → C is redundant in {A → B, B → C, A → C}

– Parts of a functional dependency may be redundant

• E.g. on RHS: {A → B, B → C, A → CD} can be simplified to {A → B, B → C, A → D}

• E.g. on LHS: {A → B, B → C, AC → D} can be simplified to {A → B, B → C, A → D}

• Intuitively, a canonical cover of F is a "minimal" set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies

Page 28: Ch 4: Relational Database Design

Extraneous Attributes

• Consider a set F of functional dependencies and the functional dependency α → β in F.

– Attribute A is extraneous in α if A ∈ α and F logically implies (F – {α → β}) ∪ {(α – A) → β}.

– Attribute A is extraneous in β if A ∈ β and the set of functional dependencies (F – {α → β}) ∪ {α → (β – A)} logically implies F.

• Note: implication in the opposite direction is trivial in each of the cases above, since a "stronger" functional dependency always implies a weaker one

• Example: Given F = {A → C, AB → C}

– B is extraneous in AB → C because {A → C, AB → C} logically implies A → C (i.e. the result of dropping B from AB → C).

• Example: Given F = {A → C, AB → CD}

– C is extraneous in AB → CD since AB → C can be inferred even after deleting C

Page 29: Ch 4: Relational Database Design

Testing if an Attribute is Extraneous

• Consider a set F of functional dependencies and the functional dependency α → β in F.

• To test if attribute A ∈ α is extraneous in α

1. compute ({α} – A)+ using the dependencies in F

2. check that ({α} – A)+ contains β; if it does, A is extraneous in α

• To test if attribute A ∈ β is extraneous in β

1. compute α+ using only the dependencies in F' = (F – {α → β}) ∪ {α → (β – A)},

2. check that α+ contains A; if it does, A is extraneous in β
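The two tests can be sketched in Python on top of an attribute-closure helper. As assumptions for this illustration, attributes are single letters, each dependency is a pair of strings (lhs, rhs), and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def extraneous_in_lhs(attr, lhs, rhs, fds):
    """attr in lhs is extraneous if (lhs - attr)+ under F contains rhs."""
    return set(rhs) <= closure(set(lhs) - {attr}, fds)

def extraneous_in_rhs(attr, lhs, rhs, fds):
    """attr in rhs is extraneous if lhs+ under F' contains attr,
    where F' replaces lhs -> rhs by lhs -> (rhs - attr)."""
    reduced = [fd for fd in fds if fd != (lhs, rhs)]
    reduced.append((lhs, "".join(a for a in rhs if a != attr)))
    return attr in closure(lhs, reduced)

F1 = [("A", "C"), ("AB", "C")]
F2 = [("A", "C"), ("AB", "CD")]

b_extraneous = extraneous_in_lhs("B", "AB", "C", F1)
c_extraneous = extraneous_in_rhs("C", "AB", "CD", F2)
print(b_extraneous, c_extraneous)
```

Note the asymmetry: the left-side test uses F unchanged, while the right-side test must first remove the attribute from the dependency, otherwise the dependency would trivially supply the attribute being tested.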

Page 30: Ch 4: Relational Database Design

Canonical Cover

• A canonical cover for F is a set of dependencies Fc such that

– F logically implies all dependencies in Fc, and

– Fc logically implies all dependencies in F, and

– No functional dependency in Fc contains an extraneous attribute, and

– Each left side of a functional dependency in Fc is unique.

• To compute a canonical cover for F:

repeat
    Use the union rule to replace any dependencies in F of the form
        α1 → β1 and α1 → β2 with α1 → β1 β2
    Find a functional dependency α → β with an extraneous attribute either in α or in β
    If an extraneous attribute is found, delete it from α → β
until F does not change

• Note: the union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied

Page 31: Ch 4: Relational Database Design

Computing a Canonical Cover

• R = (A, B, C)

F = { A → BC,
      B → C,
      A → B,
      AB → C }

• Combine A → BC and A → B into A → BC

– Set is now {A → BC, B → C, AB → C}

• A is extraneous in AB → C

– Check if the result of deleting A from AB → C is implied by the other dependencies

• Yes: in fact, B → C is already present!

– Set is now {A → BC, B → C}

• C is extraneous in A → BC

– Check if A → C is logically implied by A → B and the other dependencies

• Yes: using transitivity on A → B and B → C.

– Can use attribute closure of A in more complex cases

• The canonical cover is:

A → B

B → C
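The repeat/until procedure can be sketched in Python and run on the example above. This is one possible arrangement, not a definitive implementation: attributes are assumed to be single letters, dependencies are (lhs, rhs) string pairs, and the names `closure` and `canonical_cover` are hypothetical. The sketch may delete extraneous attributes in a different order than the slide does, and a canonical cover is not unique in general.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (frozenset, frozenset) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def canonical_cover(fds):
    cover = [(frozenset(l), frozenset(r)) for l, r in fds]
    changed = True
    while changed:
        changed = False
        # Union rule: merge dependencies that share the same left side.
        merged = {}
        for lhs, rhs in cover:
            merged[lhs] = merged.get(lhs, frozenset()) | rhs
        if len(merged) != len(cover):
            changed = True
        # Dependencies whose right side became empty are dropped here.
        cover = [(l, r) for l, r in merged.items() if r]
        # Find one extraneous attribute and delete it, then start over.
        for i, (lhs, rhs) in enumerate(cover):
            found = None
            for a in sorted(lhs):
                if len(lhs) > 1 and rhs <= closure(lhs - {a}, cover):
                    found = (lhs - {a}, rhs)
                    break
            if found is None:
                for a in sorted(rhs):
                    rest = cover[:i] + cover[i + 1:] + [(lhs, rhs - {a})]
                    if a in closure(lhs, rest):
                        found = (lhs, rhs - {a})
                        break
            if found is not None:
                cover[i] = found
                changed = True
                break
    return {("".join(sorted(l)), "".join(sorted(r))) for l, r in cover}

F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
Fc = canonical_cover(F)
print(sorted(Fc))
```

Restarting the loop after every deletion mirrors the note that the union rule may become applicable again once an extraneous attribute has been removed.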

Page 32: Ch 4: Relational Database Design

e.g.

• We now define a set S of FDs to be irreducible (minimal) if and only if it satisfies the following three properties.

(1) The right-hand side of every FD in S involves just one attribute (i.e., it is a singleton set).

(2) The left-hand side of every FD in S is irreducible in turn, meaning that no attribute can be discarded from the determinant without changing the closure S+.

(3) No FD in S can be discarded from S without changing the closure S+.

Page 33: Ch 4: Relational Database Design

Example

• A → BC

• B → C

• A → B

• AB → C

• AC → D

Compute an irreducible set of FDs that is equivalent to this given set.

Page 34: Ch 4: Relational Database Design

Solution

(1) The first step is to rewrite the FDs so that each has a singleton right-hand side.

• A → B

• A → C

• B → C

• A → B

• AB → C

• AC → D

We observe that the FD A → B occurs twice, so one occurrence can be eliminated.

Set at the start of this step: A → BC, B → C, A → B, AB → C, AC → D

Page 35: Ch 4: Relational Database Design

Solution

2. Next, attribute C can be eliminated from the left-hand side of the FD AC → D.

• Because we have A → C,

• by augmentation AA → AC,

• that is, A → AC

(Augmentation: if X → Y then XZ → YZ)

• And we are given AC → D,

• so from A → AC and AC → D

• we get A → D by transitivity.

Thus C on the left-hand side is redundant.

Set at the start of this step: A → C, B → C, A → B, AB → C, AC → D

Page 36: Ch 4: Relational Database Design

Solution

3. Next, we observe that the FD AB → C can be eliminated, because again we have

A → C

By augmentation, AB → CB

By decomposition, AB → C and AB → B

4. Finally, the FD A → C is implied by the FDs A → B and B → C by transitivity, so it can be eliminated.

Now we have

A → B

B → C

A → D

This set is irreducible.

Set at the start of these steps: A → C, B → C, A → B, AB → C, A → D
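The result of the worked solution can be checked with attribute closures: two sets of FDs are equivalent when each one implies every dependency of the other. The sketch below assumes single-letter attributes and (lhs, rhs) string pairs; the function names `covers` and `equivalent` are hypothetical.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every dependency in g can be inferred from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """Two sets of FDs are equivalent if each covers the other."""
    return covers(f, g) and covers(g, f)

given = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C"), ("AC", "D")]
irreducible = [("A", "B"), ("B", "C"), ("A", "D")]

same_closure = equivalent(given, irreducible)

# Dropping any single FD from the result should break the equivalence.
nothing_redundant = all(
    not equivalent(given, irreducible[:i] + irreducible[i + 1:])
    for i in range(len(irreducible))
)
print(same_closure, nothing_redundant)
```

The second check confirms that no dependency in the final set can be discarded. Together with the singleton right-hand sides and single-attribute left-hand sides, this is what makes the set irreducible.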
