Ch 4: Relational Database Design

4.1 Features of Good 4.1 Features of Good Relational designsRelational designs

Four Informal measures

•Semantics of the relation attributes

•Reducing the redundant values in tuples.

•Reducing the null values

•Disallowing the possibility of generating spurious(wrong) tuples

1. 1. Semantics of the relation attributes

• Design a realm schema so that it is easy to explain its meaning .Do not combine attributes from multiple entity types & relationship into single relation.

• In general ,the easier it is to explain the semantics of relation, the better the relation schema design will be.

2. Redundant Information in Tuples & 2. Redundant Information in Tuples & Update AnomaliesUpdate Anomalies

• Goal of dbase is to reduce storage space used by relationship.

• Grouping attributes into relation schemas has a significant effect on storage space.

• For e.g. if we combine Employee with department & project , works_on will result into EMP_DEPT & EMP_PRJ.

• Resultant relation shows repetition of several values leading to higher storage.

• Other serious problems is of Update anomalies which is classified as insert, delete & modification anomalies.

Insertion Anomalies Insertion Anomalies

• Insert a new emp , we must include value for dept or need to place NULL (if emp doesnot work for dept yet.) & need to enter correctly so consistency problem donot occur.

• It is difficult to enter new dept that has no employee as we cannot insert Null in ENO as its is primary key

Deletion Anomalies Deletion Anomalies

• If we delete from EMP_DEPT an employee that happens to represent the last employee in that dept , the info abt dept is also lost from dbase

Modification Anomalies Modification Anomalies

• If we change the value of one of attributes say that of manager of dept 5 , we need to change in each tuple where dept no is 5, else it will lead to inconsistent.

• Design Dbase so that no insertion, deletion & modification anomalies are present .

3. Null values in tuples3. Null values in tuples

• NULL have multiple interpretations such as:

• Attributes that do not apply to this tuple.

• Attribute value for this tuple is unknown.

• Value is known but absent, i.e. it has not been recorded yet.

• Problem may occur in JOIN , and aggregrate operations.

4. Generation of Spurious Tuples4. Generation of Spurious Tuples

• Design relation schemas so that they can be joined with equality condition on attributes that are primary or foreign key.

4.2 Functional Dependencies (FD)

• A functional dependency, denoted by X Y (Read X functionally determines Y), between two sets of attributes X and Y that are subsets of R specifies a constraint on possible tuples that can be form a relation state r of R.

• The constraint is that for for all pairs of tuples t1 and t2 in r such that

t1 [X ] = t2 [X ] , they must also have,

t1 [Y] = t2 [Y]. In other words

Whenever two tuples of r agree on their X value, they also agree on their Y value.

Functional Dependency

• Main concept associated with normalization.• Functional Dependency

– Describes relationship between attributes in a relation.

– If A and B are attributes of relation R, B is functionally dependent on A (denoted A B), if each value of A in R is associated with exactly one value of B in R.

Functional Dependency

• Diagrammatic representation:

Determinant of a functional dependency refers to attribute or group of attributes on left-hand side of the arrow.

Example - Functional Dependency

Functional Dependencies (Cont.)

• K is a superkey for relation schema R if and only if K R

• K is a candidate key for R if and only if

– K R, and

– for no K, R

• Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema:

bor_loan = (customer_id, loan_number, amount ).

We expect this functional dependency to hold:

loan_number amount

but would not expect the following to hold:

amount customer_name


• Main use of FD is to describe further relation schema R by specifying constraints on its attributes that must hold all times.

• Certain FD can be specified without referring to specific relation.

• {state, driving_licence} ENO• {pincode) area• {telephone code}city


ENOEname

Pnumber{pname,plocation}

{eno,pnumber}hours• Eno uniquely determines emp name• Pnumber uniquely determines project name & location• Combination of eno , pnumber uniquely determines

numbers of hours that employee had worked on that project.

• FD plays a key role in differentiating good DB design from bad DB design.

EXAMPLE :: TEACH

TEACHER COURSE TEXT

GUPTA CHEMISTRY SAHANI

GUPTA MATHS NAVATHE

KUMAR BIOLOGY HOFFMAN

GOYAL CHEMISTRY KAHATE

Possible FD’s

TEXTCOURSE hold

But,

TEACHERCOURSE is ruled out.

Use of Functional Dependency

• To test relations to see whether they are legal under a given set of functional dependencies. If a relation r is legal under a set F of FD’s, we say that r satisfies F.

• To specify constraints on the set of legal relations. If we wish to constrain ourselves to relations on schema R that satisfy a set F of functional dependencies, we say that F holds on K.

FD – A Few More Examples

• Suppose one is designing a system to track vehicles and the capacity of their engines. Each vehicle has a unique vehicle identification number (VIN). One would write VIN → EngineCapacity because it would be inappropriate for a vehicle's engine to have more than one capacity. (Assuming, in this case, that vehicles only have one engine.)

• However, EngineCapacity → VIN, is incorrect because there could be many vehicles with the same engine capacity.

http://en.wikipedia.org/wiki/Vehicle_identification_number

Trivial and Non Trivial FD

• Trivial FD : A FD X->Y is trivial if Y, the right hand side of the functional dependency is a subset of X.

• Eg. : A FD • {EmpID, EmpAddress}->{EmpAddress} is trivial, as

{EmpAddress} is a subset of {EmpID,EmpAddress}.• Non Trivial FD : A FD is called Nontrivial if Y is not a

subset of X.• Eg. : A FD• {EmpID,EmpAddress}->{EmpPhone} is non trivial, as

{EmpPhone} is not subset of {EmpID,EmpAddress}

Closure

• The set of all FDs that include F as well as all dependencies that are implied by a given set F of FDs is called the closure of F, denoted by F+.

• F= ENO{ ENAME,DOB,ADDRESS,DNUM}

DNUMBER{DNAME,MGRNO}

SOME ADDITIONAL FD’S ARE

ENO DNAME,MGRNO

DUMBERDNAME

Inference rules for FD Or AXIOMS

1. Reflexive: if B is a subset of A, then A B.

2. Augmentation: if A B then AC BC

3. Transitivity: it A B and B C then A C.

4. Self – determination: A A.

5. Decomposition: If A BC, then AB, AC.

6. Union: it A B and A C, then A BC

7. Composition: if A B, C D then AC BD.

8. Pseudo transitive : if A B and rBC then Ar C.

ARMSTRONG AXIOMS

• First three axioms

1. Reflexive: if B is a subset of A, then A B.

2. Augmentation: if A B then AC BC

3. Transitivity: it A B and B C then A C.

are sound & complete

By sound, we mean that given a set of FD on relation R, any dependency that can infer from F holds in every reln satisfies the dependencies. They do not generate incorrect FD.

By complete, we mean that using 3 FC repeatedly to a complete set of all possible dependencies that can be inferred from F.

Example: find closure of F

• R = (A, B, C, G, H, I)F = { A B

A CCG HCG I B H}

• some extra members of F+

– A H

• by transitivity from A B and B H

– AG I

• by augmenting A C with G, to get AG CG and then transitivity with CG I

– CG HI

• by union CG I & CG H

• OR

• by augmenting CG I to infer CG CGI,

and augmenting of CG H to infer CGI HI,

and then transitivity

Closure of Attribute Sets

• Given a set of attributes define the closure of under F (denoted by +) as the set of attributes that are functionally determined by under F

• Algorithm to compute +, the closure of under F

result := ;while (changes to result) do

for each in F dobegin

if result then result := result end

Example of Attribute Set Closure

• R = (A, B, C, G, H, I)• F = {A B

A C CG HCG IB H}

• (AG)+

1. result = AG2. result = ABCG (A C and A B)3. result = ABCGH (CG H and CG AGBC)4. result = ABCGHI (CG I and CG AGBCH)

Canonical Cover

• Sets of functional dependencies may have redundant dependencies that can be inferred from the others– For example: A C is redundant in: {A B, B C}– Parts of a functional dependency may be redundant

• E.g.: on RHS: {A B, B C, A CD} can be simplified to {A B, B C, A D}

• E.g.: on LHS: {A B, B C, AC D} can be simplified to {A B, B C, A D}

• Intuitively, a canonical cover of F is a “minimal” set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies

Extraneous Attributes

• Consider a set F of functional dependencies and the functional dependency in F.

– Attribute A is extraneous in if A and F logically implies (F – { }) {( – A) }.

– Attribute A is extraneous in if A and the set of functional dependencies (F – { }) { ( – A)} logically implies F.

• Note: implication in the opposite direction is trivial in each of the cases above, since a “stronger” functional dependency always implies a weaker one

• Example: Given F = {A C, AB C }

– B is extraneous in AB C because {A C, AB C} logically implies A C (I.e. the result of dropping B from AB C).

• Example: Given F = {A C, AB CD}

– C is extraneous in AB CD since AB C can be inferred even after deleting C

Testing if an Attribute is Extraneous

• Consider a set F of functional dependencies and the functional dependency in F.

• To test if attribute A is extraneous in

1. compute ({} – A)+ using the dependencies in F

2. check that ({} – A)+ contains ; if it does, A is extraneous in

• To test if attribute A is extraneous in

1. compute + using only the dependencies in F’ = (F – { }) { ( – A)},

2. check that + contains A; if it does, A is extraneous in

Canonical Cover

• A canonical cover for F is a set of dependencies Fc such that – F logically implies all dependencies in Fc, and – Fc logically implies all dependencies in F, and

– No functional dependency in Fc contains an extraneous attribute, and

– Each left side of functional dependency in Fc is unique.• To compute a canonical cover for F:

repeatUse the union rule to replace any dependencies in F

1 1 and 1 2 with 1 1 2 Find a functional dependency with an

extraneous attribute either in or in If an extraneous attribute is found, delete it from

until F does not change• Note: Union rule may become applicable after some extraneous

attributes have been deleted, so it has to be re-applied

Computing a Canonical Cover• R = (A, B, C)

F = {A BC B C A BAB C}

• Combine A BC and A B into A BC– Set is now {A BC, B C, AB C}

• A is extraneous in AB C– Check if the result of deleting A from AB C is implied by the other

dependencies• Yes: in fact, B C is already present!

– Set is now {A BC, B C}• C is extraneous in A BC

– Check if A C is logically implied by A B and the other dependencies• Yes: using transitivity on A B and B C.

– Can use attribute closure of A in more complex cases• The canonical cover is: A B

B C

e.g.

• Now we define a set of FD to be irreducible as minimal; if and only if it satisfies the following two properties.

(1) The right hand side of every FD in S involve just one attribute (i.e., it is a singleton set)

(2) The left hand side of every FD in S is irreducible in turn meaning that no attribute can be discarded from the determinant without changing the CLOSURE S+.

Example

• A BC,• B C• A B• AB C• AC D

Compute an irreducible set of FD that is equivalent to this given set.

Bring answer in reducible form

Solution

(1) The step is to rewrite the FD such that each has a singleton right hand side.

• A B• A C• B C• A B• AB C• AC DWe observe that the FD A B occurs twice. So one

occurrence will be eliminated.

A BC,B CA B

AB CAC D

Solution

2. Next, attributed C can be eliminated from the left hand side of the FD AC D

• Because we have A C,• By augmentation AA AC

• A AC

(Augmentation: if X Y then XZ YZ)

• And we are given AC D,• A AC AC D

• So A D by transitivity;

Thus C on the left hand side is redundant.

A CB CA B

AB CAC D

Solution

3. Next, we observe that the FD AB C can be eliminated, because again we have

A CBy augmentation AB CB

By decomposition AB C AB B

4. Finally, the FD A C is implied by the FD A B and B C, by transitivity so it can be eliminated.Now we have A B

B CA D

This set is irreducible.

A CB CA B

AB CA D

Ch 4: Relational Database Design

Documents

Transcript of Ch 4: Relational Database Design