Ch 4: Relational Database Design
4.1 Features of Good Relational Designs
Four Informal measures
•Semantics of the relation attributes
•Reducing the redundant values in tuples.
•Reducing the null values
•Disallowing the possibility of generating spurious (wrong) tuples
1. Semantics of the relation attributes
• Design a relation schema so that its meaning is easy to explain. Do not combine attributes from multiple entity types and relationship types into a single relation.
• In general, the easier it is to explain the semantics of a relation, the better the relation schema design will be.
2. Redundant Information in Tuples & Update Anomalies
• One goal of database design is to minimize the storage space used by the base relations.
• Grouping attributes into relation schemas has a significant effect on storage space.
• For example, combining EMPLOYEE with DEPARTMENT gives EMP_DEPT, and combining EMPLOYEE and PROJECT with WORKS_ON gives EMP_PRJ.
• The resulting relations repeat several values in many tuples, which increases storage.
• Another serious problem is update anomalies, which are classified as insertion, deletion, and modification anomalies.
Insertion Anomalies
• To insert a new employee into EMP_DEPT, we must either include the values for the employee's department or place NULLs (if the employee does not work for a department yet). The department values must match the other tuples for that department, or a consistency problem occurs.
• It is difficult to insert a new department that has no employees yet, because we cannot place NULL in ENO, which is the primary key.
Deletion Anomalies
• If we delete from EMP_DEPT the tuple of an employee who happens to be the last employee in that department, the information about the department is also lost from the database.
Modification Anomalies
• If we change the value of one attribute of a department, say the manager of department 5, we must change it in every tuple where the department number is 5; otherwise the database becomes inconsistent.
• Design the database so that no insertion, deletion, or modification anomalies are present.
3. Null values in tuples
• NULLs have multiple interpretations, such as:
• The attribute does not apply to this tuple.
• The attribute value for this tuple is unknown.
• The value is known but absent, i.e. it has not been recorded yet.
• NULLs can cause problems in JOIN and aggregate operations.
4. Generation of Spurious Tuples
• Design relation schemas so that they can be joined with equality conditions on attributes that are primary keys or foreign keys, so that no spurious tuples are generated.
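A minimal Python sketch of how joining on a non-key attribute can generate spurious tuples. The tuples, the attribute values, and the two projections (emp_locs, emp_proj1) are hypothetical and chosen only for illustration.

```python
# Hypothetical EMP_PROJ tuples: (eno, pnumber, hours, ename, plocation).
emp_proj = {
    (1, 10, 20, "Smith", "Delhi"),
    (2, 20, 15, "Wong", "Delhi"),
}

# Decompose into two projections that share only plocation, which is
# neither a primary key nor a foreign key in either projection.
emp_locs = {(ename, ploc) for (_, _, _, ename, ploc) in emp_proj}
emp_proj1 = {(eno, pno, hrs, ploc) for (eno, pno, hrs, _, ploc) in emp_proj}

# Natural join of the two projections on plocation.
joined = {
    (eno, pno, hrs, ename, ploc)
    for (eno, pno, hrs, ploc) in emp_proj1
    for (ename, ploc2) in emp_locs
    if ploc == ploc2
}

# Tuples produced by the join that were never in the original relation.
spurious = joined - emp_proj
print(sorted(spurious))
```

The join returns every original tuple plus two extra ones that pair each employee's work record with the other employee's name, which is exactly the kind of wrong information this guideline is meant to prevent.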
4.2 Functional Dependencies (FD)
• A functional dependency, denoted by X → Y (read: X functionally determines Y), between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuples that can form a relation state r of R.
• The constraint is that for all pairs of tuples t1 and t2 in r such that
t1[X] = t2[X], they must also have
t1[Y] = t2[Y].
In other words, whenever two tuples of r agree on their X value, they also agree on their Y value.
Functional Dependency
• Main concept associated with normalization.
• Functional Dependency
– Describes the relationship between attributes in a relation.
– If A and B are attributes of relation R, B is functionally dependent on A (denoted A → B) if each value of A in R is associated with exactly one value of B in R.
• Diagrammatic representation: A → B, read as "B is functionally dependent on A".
• The determinant of a functional dependency is the attribute or group of attributes on the left-hand side of the arrow.
Functional Dependencies (Cont.)
• K is a superkey for relation schema R if and only if K → R
• K is a candidate key for R if and only if
– K → R, and
– for no α ⊂ K, α → R
• Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema:
bor_loan = (customer_id, loan_number, amount).
We expect this functional dependency to hold:
loan_number → amount
but would not expect the following to hold:
amount → customer_id
Functional Dependencies (Cont.)
• The main use of FDs is to describe a relation schema R further by specifying constraints on its attributes that must hold at all times.
• Certain FDs can be specified without referring to a specific relation:
• {state, driving_licence} → ENO
• {pincode} → area
• {telephone code} → city
Functional Dependencies (Cont.)
ENO → Ename
Pnumber → {pname, plocation}
{eno, pnumber} → hours
• Eno uniquely determines the employee name.
• Pnumber uniquely determines the project name and location.
• The combination of eno and pnumber uniquely determines the number of hours that the employee has worked on that project.
• FDs play a key role in differentiating good DB design from bad DB design.
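EXAMPLE: TEACH
TEACHER   COURSE      TEXT
GUPTA     CHEMISTRY   SAHANI
GUPTA     MATHS       NAVATHE
KUMAR     BIOLOGY     HOFFMAN
GOYAL     CHEMISTRY   KAHATE
Possible FDs:
TEXT → COURSE holds in this relation state, since no two tuples share a TEXT value.
But TEACHER → COURSE is ruled out, because GUPTA appears with both CHEMISTRY and MATHS.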
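The check on the TEACH relation can be sketched in Python as follows. The function name fd_holds and the dictionary layout of the rows are assumptions made for this illustration; the data is the TEACH table from the slide.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    for t1, t2 in combinations(rows, 2):
        agree_on_lhs = all(t1[a] == t2[a] for a in lhs)
        differ_on_rhs = any(t1[b] != t2[b] for b in rhs)
        if agree_on_lhs and differ_on_rhs:
            return False
    return True

teach = [
    {"TEACHER": "GUPTA", "COURSE": "CHEMISTRY", "TEXT": "SAHANI"},
    {"TEACHER": "GUPTA", "COURSE": "MATHS", "TEXT": "NAVATHE"},
    {"TEACHER": "KUMAR", "COURSE": "BIOLOGY", "TEXT": "HOFFMAN"},
    {"TEACHER": "GOYAL", "COURSE": "CHEMISTRY", "TEXT": "KAHATE"},
]

print(fd_holds(teach, ["TEXT"], ["COURSE"]))     # True
print(fd_holds(teach, ["TEACHER"], ["COURSE"]))  # False
```

A single relation state can only rule an FD out. A True result means the state does not violate the FD, not that the FD is a constraint of the schema.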
Use of Functional Dependency
• To test relations to see whether they are legal under a given set of functional dependencies. If a relation r is legal under a set F of FD’s, we say that r satisfies F.
• To specify constraints on the set of legal relations. If we wish to constrain ourselves to relations on schema R that satisfy a set F of functional dependencies, we say that F holds on R.
FD – A Few More Examples
• Suppose one is designing a system to track vehicles and the capacity of their engines. Each vehicle has a unique vehicle identification number (VIN). One would write VIN → EngineCapacity because it would be inappropriate for a vehicle's engine to have more than one capacity. (Assuming, in this case, that vehicles only have one engine.)
• However, EngineCapacity → VIN, is incorrect because there could be many vehicles with the same engine capacity.
Trivial and Non Trivial FD
• Trivial FD: An FD X → Y is trivial if Y, the right-hand side of the functional dependency, is a subset of X.
• E.g.: the FD {EmpID, EmpAddress} → {EmpAddress} is trivial, as {EmpAddress} is a subset of {EmpID, EmpAddress}.
• Non-trivial FD: An FD X → Y is non-trivial if Y is not a subset of X.
• E.g.: the FD {EmpID, EmpAddress} → {EmpPhone} is non-trivial, as {EmpPhone} is not a subset of {EmpID, EmpAddress}.
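The subset test behind this definition is a one-liner. A small Python sketch, where the function name is_trivial is an assumption for illustration:

```python
def is_trivial(lhs, rhs):
    """An FD lhs -> rhs is trivial when rhs is a subset of lhs."""
    return set(rhs) <= set(lhs)

print(is_trivial({"EmpID", "EmpAddress"}, {"EmpAddress"}))  # True
print(is_trivial({"EmpID", "EmpAddress"}, {"EmpPhone"}))    # False
```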
Closure
• The set of all FDs that includes F together with all dependencies implied by F is called the closure of F, denoted by F+.
• F = { ENO → {ENAME, DOB, ADDRESS, DNUMBER},
DNUMBER → {DNAME, MGRNO} }
Some additional FDs that can be inferred from F are:
ENO → {DNAME, MGRNO}
DNUMBER → DNAME
Inference Rules for FDs (Axioms)
1. Reflexive: if B is a subset of A, then A → B.
2. Augmentation: if A → B, then AC → BC.
3. Transitivity: if A → B and B → C, then A → C.
4. Self-determination: A → A.
5. Decomposition: if A → BC, then A → B and A → C.
6. Union: if A → B and A → C, then A → BC.
7. Composition: if A → B and C → D, then AC → BD.
8. Pseudo-transitivity: if A → B and γB → C, then Aγ → C.
ARMSTRONG AXIOMS
• The first three rules
1. Reflexive: if B is a subset of A, then A → B.
2. Augmentation: if A → B, then AC → BC.
3. Transitivity: if A → B and B → C, then A → C.
are sound and complete.
By sound, we mean that, given a set F of FDs on relation R, any dependency that can be inferred from F using these rules holds in every relation state that satisfies the dependencies in F. The rules do not generate incorrect FDs.
By complete, we mean that applying these three rules repeatedly generates the complete set of all dependencies that can be inferred from F.
Example: find closure of F
• R = (A, B, C, G, H, I)
F = { A → B,
A → C,
CG → H,
CG → I,
B → H }
• Some extra members of F+:
– A → H
• by transitivity from A → B and B → H
– AG → I
• by augmenting A → C with G, to get AG → CG, and then transitivity with CG → I
– CG → HI
• by union of CG → I and CG → H
• OR
• by augmenting CG → I to infer CG → CGI, augmenting CG → H to infer CGI → HI, and then transitivity
Closure of Attribute Sets
• Given a set of attributes α, define the closure of α under F (denoted by α+) as the set of attributes that are functionally determined by α under F.
• Algorithm to compute α+, the closure of α under F:
result := α;
while (changes to result) do
    for each β → γ in F do
    begin
        if β ⊆ result then result := result ∪ γ
    end
Example of Attribute Set Closure
• R = (A, B, C, G, H, I)
• F = {A → B, A → C, CG → H, CG → I, B → H}
• (AG)+
1. result = AG
2. result = ABCG (A → C and A → B)
3. result = ABCGH (CG → H and CG ⊆ AGBC)
4. result = ABCGHI (CG → I and CG ⊆ AGBCH)
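The attribute-closure algorithm can be sketched in Python as below. The function name closure and the encoding of each FD as a pair of strings (one character per attribute) are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# F from the example: A -> B, A -> C, CG -> H, CG -> I, B -> H.
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print("".join(sorted(closure("AG", F))))  # ABCGHI
```

The same function answers two related questions: X → Y is in F+ exactly when Y is contained in the closure of X, and X is a superkey exactly when its closure is all of R. Here (AG)+ covers every attribute of R, so AG is a superkey.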
Canonical Cover
• Sets of functional dependencies may have redundant dependencies that can be inferred from the others
– For example: A → C is redundant in {A → B, B → C, A → C}
– Parts of a functional dependency may be redundant
• E.g. on RHS: {A → B, B → C, A → CD} can be simplified to {A → B, B → C, A → D}
• E.g. on LHS: {A → B, B → C, AC → D} can be simplified to {A → B, B → C, A → D}
• Intuitively, a canonical cover of F is a “minimal” set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies
Extraneous Attributes
• Consider a set F of functional dependencies and the functional dependency α → β in F.
– Attribute A is extraneous in α if A ∈ α and F logically implies (F – {α → β}) ∪ {(α – A) → β}.
– Attribute A is extraneous in β if A ∈ β and the set of functional dependencies (F – {α → β}) ∪ {α → (β – A)} logically implies F.
• Note: implication in the opposite direction is trivial in each of the cases above, since a "stronger" functional dependency always implies a weaker one
• Example: Given F = {A → C, AB → C}
– B is extraneous in AB → C because {A → C, AB → C} logically implies A → C (i.e. the result of dropping B from AB → C).
• Example: Given F = {A → C, AB → CD}
– C is extraneous in AB → CD since AB → C can be inferred even after deleting C
Testing if an Attribute is Extraneous
• Consider a set F of functional dependencies and the functional dependency α → β in F.
• To test if attribute A ∈ α is extraneous in α
1. compute ({α} – A)+ using the dependencies in F
2. check that ({α} – A)+ contains β; if it does, A is extraneous in α
• To test if attribute A ∈ β is extraneous in β
1. compute α+ using only the dependencies in F' = (F – {α → β}) ∪ {α → (β – A)},
2. check that α+ contains A; if it does, A is extraneous in β
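Both tests reduce to an attribute-closure computation. A minimal Python sketch, assuming FDs are encoded as pairs of strings with one character per attribute; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def extraneous_in_lhs(a, lhs, rhs, fds):
    """a is extraneous in lhs if (lhs - a)+ under F contains rhs."""
    if a not in lhs:
        return False
    return set(rhs) <= closure(set(lhs) - {a}, fds)

def extraneous_in_rhs(a, lhs, rhs, fds):
    """a is extraneous in rhs if lhs+ under F' contains a, where F'
    replaces lhs -> rhs by lhs -> (rhs - a)."""
    if a not in rhs:
        return False
    f_prime = [fd for fd in fds if fd != (lhs, rhs)]
    f_prime.append((lhs, "".join(sorted(set(rhs) - {a}))))
    return a in closure(lhs, f_prime)

F1 = [("A", "C"), ("AB", "C")]
F2 = [("A", "C"), ("AB", "CD")]

print(extraneous_in_lhs("B", "AB", "C", F1))   # True
print(extraneous_in_rhs("C", "AB", "CD", F2))  # True
print(extraneous_in_rhs("D", "AB", "CD", F2))  # False
```

The first two calls reproduce the two slide examples. The third shows that D cannot be dropped from AB → CD, because nothing else in F determines D.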
Canonical Cover
• A canonical cover for F is a set of dependencies Fc such that
– F logically implies all dependencies in Fc, and
– Fc logically implies all dependencies in F, and
– No functional dependency in Fc contains an extraneous attribute, and
– Each left side of a functional dependency in Fc is unique.
• To compute a canonical cover for F:
repeat
    Use the union rule to replace any dependencies in F of the form
        α1 → β1 and α1 → β2 with α1 → β1β2
    Find a functional dependency α → β with an extraneous attribute either in α or in β
    If an extraneous attribute is found, delete it from α → β
until F does not change
• Note: The union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied.
Computing a Canonical Cover
• R = (A, B, C)
F = {A → BC, B → C, A → B, AB → C}
• Combine A → BC and A → B into A → BC
– Set is now {A → BC, B → C, AB → C}
• A is extraneous in AB → C
– Check if the result of deleting A from AB → C is implied by the other dependencies
• Yes: in fact, B → C is already present!
– Set is now {A → BC, B → C}
• C is extraneous in A → BC
– Check if A → C is logically implied by A → B and the other dependencies
• Yes: using transitivity on A → B and B → C.
– Can use attribute closure of A in more complex cases
• The canonical cover is:
A → B
B → C
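The repeat/until procedure can be sketched in Python as follows. This is one possible implementation under stated assumptions: FDs are pairs of strings with one character per attribute, the helper names are hypothetical, and extraneous attributes are removed in a different order than on the slide. Canonical covers are not unique in general, so another removal order may give a different but equivalent result.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def remove_one_extraneous(fds):
    """Return fds with one extraneous attribute deleted, or None."""
    for i, (lhs, rhs) in enumerate(fds):
        for a in sorted(lhs):
            # Left side: (lhs - a)+ under F must still contain rhs.
            if len(lhs) > 1 and rhs <= closure(lhs - {a}, fds):
                return fds[:i] + [(lhs - {a}, rhs)] + fds[i + 1:]
        for a in sorted(rhs):
            # Right side: lhs+ under F' must still contain a.
            f_prime = fds[:i] + [(lhs, rhs - {a})] + fds[i + 1:]
            if a in closure(lhs, f_prime):
                return f_prime
    return None

def canonical_cover(fds):
    fds = [(frozenset(l), frozenset(r)) for l, r in fds]
    while True:
        # Union rule: merge dependencies that share a left side.
        merged = {}
        for lhs, rhs in fds:
            merged[lhs] = merged.get(lhs, frozenset()) | rhs
        fds = [(l, r) for l, r in merged.items() if r]
        step = remove_one_extraneous(fds)
        if step is None:
            return fds
        fds = step

F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
cover = canonical_cover(F)
for lhs, rhs in sorted(cover, key=lambda fd: sorted(fd[0])):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

For the example set this prints A -> B and B -> C, matching the cover derived by hand.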
• We now define a set S of FDs to be irreducible (or minimal) if and only if it satisfies the following three properties.
(1) The right-hand side of every FD in S involves just one attribute (i.e., it is a singleton set).
(2) The left-hand side of every FD in S is irreducible in turn, meaning that no attribute can be discarded from the determinant without changing the closure S+.
(3) No FD in S can be discarded from S without changing the closure S+.
Example
• A → BC
• B → C
• A → B
• AB → C
• AC → D
Compute an irreducible set of FDs that is equivalent to this given set.
Give the answer in irreducible form.
Solution
Set at the start of this step: A → BC, B → C, A → B, AB → C, AC → D
(1) The first step is to rewrite the FDs so that each has a singleton right-hand side:
• A → B
• A → C
• B → C
• A → B
• AB → C
• AC → D
We observe that the FD A → B occurs twice, so one occurrence can be eliminated.
Solution
Set at the start of this step: A → C, B → C, A → B, AB → C, AC → D
(2) Next, attribute C can be eliminated from the left-hand side of the FD AC → D:
• We have A → C.
• By augmentation with A, AA → AC, i.e. A → AC.
(Augmentation: if X → Y then XZ → YZ)
• We are given AC → D, so we have A → AC and AC → D.
• So A → D by transitivity.
Thus C on the left-hand side is redundant.
Solution
Set at the start of this step: A → C, B → C, A → B, AB → C, A → D
(3) Next, we observe that the FD AB → C can be eliminated, because again we have A → C:
• By augmentation, AB → CB.
• By decomposition, AB → C and AB → B.
(4) Finally, the FD A → C is implied by the FDs A → B and B → C by transitivity, so it can be eliminated. Now we have:
A → B
B → C
A → D
This set is irreducible.
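The worked solution can be cross-checked with a short Python sketch. It is a sketch under stated assumptions, not the method of the slides: FDs are pairs of strings with one character per attribute, the function names are hypothetical, and left-hand sides are reduced before redundant FDs are dropped, so intermediate sets differ from the hand derivation even though the final set is the same. An irreducible set is not unique in general.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def irreducible_set(fds):
    # Step 1: singleton right-hand sides.
    work = [(frozenset(lhs), frozenset(b)) for lhs, rhs in fds for b in rhs]
    # Step 2: drop attributes from left-hand sides where possible.
    for i, (lhs, rhs) in enumerate(work):
        for a in sorted(lhs):
            if len(lhs) > 1 and rhs <= closure(lhs - {a}, work):
                lhs = lhs - {a}
                work[i] = (lhs, rhs)
    # Remove duplicates while keeping the original order.
    work = list(dict.fromkeys(work))
    # Step 3: drop every FD that the remaining ones already imply.
    result = list(work)
    for fd in work:
        rest = [g for g in result if g != fd]
        if fd[1] <= closure(fd[0], rest):
            result = rest
    return result

F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C"), ("AC", "D")]
minimal = irreducible_set(F)
for lhs, rhs in minimal:
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

For the given set this yields A -> B, B -> C and A -> D, the same irreducible set as the hand solution.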