Relational Design Theory Assess the quality of a schema redundancy integrity constraints Quality...

82
Relational Design Theory Assess the quality of a schema redundancy integrity constraints Quality seal: normal forms (1-4, BCNF) Improve the quality of a schema synthesis algorithm decomposition algorithm Construct a (high-quality) schema start with universal relation apply synthesis or decomposition algorithms 1

Transcript of Relational Design Theory Assess the quality of a schema redundancy integrity constraints Quality...

Page 1: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Relational Design TheoryAssess the quality of a schema redundancy integrity constraintsQuality seal: normal forms (1-4, BCNF)

Improve the quality of a schemasynthesis algorithmdecomposition algorithm

Construct a (high-quality) schemastart with universal relationapply synthesis or decomposition algorithms

1

Page 2: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

What is wrong with redundancy? Waste of storage space importance is diminishing as storage gets cheaper (disk density will even increase in the future)

Additional work to keep multiple copies of data consistentmultiple updates in order to accomodate one event

Additional code to keep multiple copies of data consistentSomebody needs to implement the logic

2

Page 3: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Bad Schemas

Update-AnomalyWhat happens when Sokrates moves to a different

room? Insert-AnomalyWhat happens if Roscoe is elected as a new professor?

Delete-AnomalyWhat happens if Popper does not teach this semester?

ProfLecturePersNr Name Level Roo

mNr Title CP

2125 Sokrates

FP 226 5041 Ethik 4

2125 Sokrates

FP 226 5049 Mäeutik 2

2125 Sokrates

FP 226 4052 Logik 4

... ... ... ... ... ... ...2132 Popper AP 52 5259 Der Wiener Kreis 22137 Kant FP 7 4630 Die 3 Kritiken 4

3

Page 4: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Multi-version Databases Storage becomes cheaper -> never throw anything away It is more expensive to think about what to keep than

simply to keep everything.

Consequence 1: No delete Instead, set a status flag to „deleted“No delete anomalies (only wasted storage)

Consequence 2: No update in place Instead, create a new version of the tupleNo update anomalies (only wasted storage)

Insert anomalies still exist, but not a big problemResult in multiple NULL values, but no inconsistencies

NoSQL Movement: Denormalized data (XML is great!) 4

Page 5: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Functional Dependencies Schema: R = {A:DA, B:DB, C:DC, D:DD} Instance: R

Let R, R iff r, s R: r. = s. r. = s. (There is a function f: X D X D )

RA B C Da4 b2 c4 d3

a1 b1 c1 d1

a1 b1 c1 d2

a2 b2 c3 d2

a3 b2 c4 d3

{A} {B}

{C, D } {B}

Not: {B} {C}

Convention:

CD B 5

Page 6: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

ExampleFamily Tree

Child Father Mother Grandma GrandpaSofie Alfons Sabine Lothar LindeSofie Alfons Sabine Hubert LisaNiklas Alfons Sabine Lothar LindeNiklas Alfons Sabine Hubert Lisa

... ... ... Lothar Martha… … … … …

6

Page 7: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

ExampleFamily Tree

Child Father Mother Grandma GrandpaSofie Alfons Sabine Lothar LindeSofie Alfons Sabine Hubert LisaNiklas Alfons Sabine Lothar LindeNiklas Alfons Sabine Hubert Lisa

... ... ... Lothar Martha… … … … …

Child Father,Mother Child, Grandpa Grandma Child, Grandma Grandpa

7

Page 8: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Analogy to functions f1 : Child FatherE.g., f1(Niklas) = Alfons

f2: Child MotherE.g., f2(Niklas) = Sabine

f3: Child x Grandpa Grandma

FD: Child Father, Mother represents two functions (f1, f2) Komma on right side indicates multiple functions

FD: Child, Grandpa GrandmaKomma on the left side indicates Cartesian product 8

Page 9: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

KeysR is a superkey iffR

is minimal iffA (

Notation for minimal functional dependencies:

R is a key (or candidate key) iffR

9

Page 10: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Determining Keys

Keys of Town {Name, Canton} {Name, AreaCode}

N.B. Two small towns may have the same area code.

TownName Canton AreaCod

ePopulatio

nBuchs AG 081 6500Buchs SG 071 8000Zurich ZH 044 300000

Lausanne

VD 021 60000

... ... ... ...

10

Page 11: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Determining Functional Dependencies Professor: {[PersNr, Name, Level, Room, City, Address,

Zip, AreaCode, Canton, Population, Direktion]}{PersNr} {PersNr, Name, Level, Room, City,

Address, Zip, AreaCode, Canton, Population, Direktion}

{City, Canton} {Population, AreaCode}{Zip} {Canton, City, Population}{Canton, City, Address} {Zip}{Canton} {Direktion}{Room} {PersNr}

Additional functional dependencies (inferred):{Room} {PersNr, Name, Level, Room, City,

Address, Zip, AreaCode, Canton, Population, Direktion}

{Zip} {Direktion} 11

Page 12: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Visualization of Funct. Dependencies

Direktion

Level

Name

Address

City

Canton

PersNr

Room

AreaCode

Zip

Population

12

Page 13: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Armstrong Axioms: Inference of FDsReflexivity ( ) Special case:

Augmentation (Notation :=

Transitivity .

These three axioms are complete. All possible other rules can be implied from these axioms.

13

Page 14: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Other rulesUnion of FDs:

Decomposition:

Pseudo transitivity:

14

Page 15: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Correctness of Union rule Premise: and Claim: Proof:1 (Premise)2 (Augmentation)3 (Premise)4 (Augmentation)5 (Transitivity of (4) and (2)) qed

15

Page 16: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Closure of Attributes Input: F: a set of FDs: a set of attributes

Output: + such that +

Closure(F, )result := // Reflexivitywhile (result has changed) do

foreach FD: in F do // Transitivityif result then result := result

output(result)

Exercise: Proof that Closure is deterministic and terminates.

16

Page 17: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Example: Closure of ZIP (Slide 8)

17

Page 18: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Minimal BasisFc is a minimal basis of F iff:

1. Fc F The closure of all attribute set is the same in Fc and

F

2. All functional dependencies in Fc are minimal: A Fc - ((( Fc B Fc - (( Fc

3. In Fc, there are no two functional dependencies with the same left side. Can be achieved by applying the Union rule.

18

Page 19: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Computing the Minimum Basis1. Reduction of left sides of FDs. Let F, A :if Closure(F, – A)then replace with (- A)in F

2. Reduction of right sides of FDs. Let F, B :if B Closure(F – () (), )

then replace with –B) in F

1. Remove FDs: (clean-up of Step 2)

2. Apply Union rule to FDs with the same left side. 19

Page 20: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Determining Functional Dependencies Professor: {[PersNr, Name, Level, Room, City, Address,

Zip, AreaCode, Canton, Population, Direktion]}{PersNr} {PersNr, Name, Level, Room, City,

Address, Zip, AreaCode, Canton, Population, Direktion}

{City, Canton} {Population, AreaCode}{Zip} {Canton, City, Population}{Canton, City, Address} {Zip}{Canton} {Direktion}{Room} {PersNr}

Additional functional dependencies (inferred):{Room} {PersNr, Name, Level, Room, City,

Address, Zip, AreaCode, Canton, Population, Direktion}

{Zip} {Direktion} 20

Page 21: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Correctness of the Algorithm(Left Reduction)

Premise: Closure(F, - A)

Claim: Closure(F-{}(- A) , - A) Closure(F, - A)

Proof:

let Closure(F-{}(- A) , - A) Closure(F, - A ) (Apply FD (- A))

Closure(F, - A) (Premise) qed

21

Page 22: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Bad Schemas

Update-AnomalyWhat happens when Sokrates moves to a different

room? Insert-AnomalyWhat happens if Roscoe is elected as a new professor?

Delete-AnomalyWhat happens if Popper does not teach this semester?

ProfLecturePersNr Name Level Roo

mNr Title CP

2125 Sokrates

FP 226 5041 Ethik 4

2125 Sokrates

FP 226 5049 Mäeutik 2

2125 Sokrates

FP 226 4052 Logik 4

... ... ... ... ... ... ...2132 Popper AP 52 5259 Der Wiener Kreis 22137 Kant FP 7 4630 Die 3 Kritiken 4

22

Page 23: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Decomposition of Relations Bad relations combine several concepts decompose them so that each concept in one

relation R R1, .., Rn

1. Lossless Decomposition

R =R1 A R2 A ... A Rn

2. Preservation of Dependencies

FD(R)+ = (FD(R1) ... FD(Rn))+

23

Page 24: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

When is a decomposition lossless? Let R = R1 R2

R1 := ΠR1 (R) R2 := ΠR2 (R)

Lemma: The decomposition is lossless if (R1 R2) R1 or (R1 R2) R2

Exercise: Proof of this Lemma.

RR1

R2

24

Page 25: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Example

DrinkerPub Guest Beer

Kowalski Kemper Pils

Kowalski Eickler Hefeweizen

Innsteg Kemper Hefeweizen

25

Page 26: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Lossy DecompositionDrinker

Pub Guest BeerKowalsk

iKemper Pils

Kowalski

Eickler Hefeweizen

Innsteg Kemper Hefeweizen

VisitorPub Guest

Kowalski

Kemper

Kowalski

Eickler

Innsteg Kemper

DrinksGuest Beer

Kemper PilsEickler Hefeweiz

enKemper Hefeweiz

en

Guest, BeerPub, Guest

26

Page 27: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

DrinkerKneipe Gast Bier

Kowalski Kemper PilsKowalski Eickler HefeweizenInnsteg Kemper Hefeweizen

VisitorPub Guest

Kowalski KemperKowalski EicklerInnsteg Kemper

DrinksGuest Beer

Kemper PilsEickler Hefeweize

nKemper Hefeweize

n

....

VisitorA DrinksPub Guest Beer

Kowalski Kemper PilsKowalski Kemper HefeweizenKowalski Eickler HefeweizenInnsteg Kemper PilsInnsteg Kemper Hefeweizen

A≠

27

Page 28: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Comments on the ExampleDrinker has one (non-trivial) functional

dependency{Pub,Guest}{Beer}

But none of the criteria of the Lemma hold{Guest}{Beer}{Guest}{Pub}

The problem is that Kemper likes different beer in different pubs.

28

Page 29: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Lossless DecompositionParents

Father Mother ChildJohann Martha ElseJohann Maria TheoHeinz Martha Cleo

FatherFather ChildJohann ElseJohann TheoHeinz Cleo

MotherMother ChildMartha ElseMaria Theo

Martha Cleo

Mother, ChildFather, Child

29

Page 30: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Comments on Example Parents: {[Father, Mother, Child]} Father: {[Father, Child]} Mother: {[Mother, Child]}

Actually, both criteria of the lemma are met: {Child}{Mother}{Child}{Father}

{Child} is a key of all three relationsWrt loss of info, it never hurts to decompose with a

keyHowever, it is never beneficial either. Why?

30

Page 31: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Preservation of Dependencies Let R be decomposed into R1, ..., Rn FR = (FR1 ... FRn)

ZipCodes: {[Street, City, Canton, Zip]}

Functional dependencies in ZipCodes{Zip} {City, Canton}{Street, City, Canton} {Zip}

What about this decomposition?Streets: {[Zip, Street]}Cities: {[Zip, City, Canton]}

Is it lossless? Does it preserve functional depend.?

31

Page 32: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Decomposition of ZipCodesZipCodes

City Canton Street ZipBuchs AG Goethestr. 5033Buchs AG Schillerstr. 5034Buchs SG Goethestr. 8107

StreetsZip Street

8107 Goethestr.5033 Goethestr.5034 Schillerstr.

CitiesCity Canton Zip

Buchs AG 5033Buchs AG 5034Buchs SG 8107

City,Canton,ZipZip,Street

{Street, City, Canton} {Zip} not checkable in decomp. schema It is possible to insert inconsistent tuples 32

Page 33: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Violation of City,Canton,StreetZip

ZipCodesCity Canton Street Zip

Buchs AG Goethestr. 5033Buchs AG Schillerstr. 5034Buchs SG Goethestr. 8107

StreetsZip Street

8107 Goethestr.5033 Goethestr.5034 Schillerstr.8108 Goethestr.

CitiesCity Canton Zip

Buchs AG 5033Buchs AG 5034Buchs SG 8107Buchs SG 8108

City,Canton,ZipZip,Street

33

Page 34: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Violation of City,Canton,StreetZip

ZipCodesCity Canton Street Zip

Buchs AG Goethestr. 5033Buchs AG Schillerstr. 5034Buchs SG Goethestr. 8107Buchs SG Goethestr. 8108

StreetsZip Street

8107 Goethestr.5033 Goethestr.5034 Schillerstr.8108 Goethestr.

CitiesCity Cantion Zip

Buchs AG 5033Buchs AG 5034Buchs SG 8107Buchs SG 8108

A

34

Page 35: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

First Normal Form Only atomic domains (as in SQL 92)

vs.

ParentsFather Mother ChildrenJohann Martha {Else, Lucie}Johann Maria {Theo, Josef}Heinz Martha {Cleo}

ParentsFather Mother ChildJohann Martha ElseJohann Martha LucieJohann Maria TheoJohann Maria JosefHeinz Martha Cleo

35

Page 36: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Second Normal FormR is in 2NF iff every non-key attribute is

minimally dependent on every key.

StudentAttends is not in 2NF!!! {Legi} {Name, Semester}

StundentAttendsLegi Nr Name Semest

er26120 5001 Fichte 1027550 5001 Schopenhaue

r6

27550 4052 Schopenhauer

6

28106 5041 Carnap 328106 5052 Carnap 328106 5216 Carnap 328106 5259 Carnap 3

... ... ... ...

36

Page 37: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Second Normal Form

Insert Anomaly: What about students who attend no lecture?

Update Anomaly: Promotion of Carnab to the 4th semester. Delete Anomaly: Fichte drops his last course?

Solution: Decompose into two relationsattends: {[Legi, Nr]}Student: {[Legi, Name, Semester]}

Student, attends are in 2NF. The decompostion is lossless and preserves dependencies.

Legi

Nr

Name

Semester

37

Page 38: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

2NF and ER ModellingViolation of 2NFmixing an entity with an N:M (or 1:N) relationshipE.g., mixing Student (entity) with attends (N:M)

SolutionSeparate: entity and relationship i.e., implement entity and relationship in separate

relations

However, okay to mix entity and 1:1/N:1 relationship

Professor Lecturegives1 N

Not okay Okay

38

Page 39: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Third Normal Form R is in 3NF iff for all in R at least one condition

holds:B (i.e., is trivial)is an attribute of at least one keyis a superkey of R

If does not fulfill any of these conditionsis a concept in its own right.

39

Page 40: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Example: 2NF but not 3NF

Direktion

Level

Name

Address

City

Canton

PersNr

Room

AreaCode

Zip

Population

40

Page 41: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

3NF and ER ModellingViolation of 3NFmixing several entities (maybe connected by

relationships)e.g., Professor, City, Canton

Solution implement each entity in a separate relation (implement N:M relationships in separate relation)

ER Modelling and Rules of ER -> relationalAutomatically create 3NF

Professor Citylives1N

Canton1

belongsN

OkayOkayNot okay

41

Page 42: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

3NF implies 2NF Premise: R is in 3NF Claim: R is in 2NF Proof:assume R is not in 2NFBy definition of 2NF: exists such that(1) B is not part of any key(2) is a key

is evilit is not trivial (otherwise B would be part of a key)B is not part of any key (1) is not a superkey (2)

R is not in 3NF. qed

42

Page 43: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Synthesis Algorithm Input: Relation R, FDs F

Output: R1, ..., Rn such that

R1, ..., Rn is a lossless decomposition of R.

R1, ..., Rn preserves dependencies.

All R1, ..., Rn are in 3NF.

43

Page 44: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Synthesis Algorithm

1. Compute the minimal basis Fc of F.

2. For all Fc create: R :=

3. If exists R such that is a key of R create: R (N.B.: R has no non-trivial functional dependencies.)

4. Eliminate R if exists R` such that: R R`

44

Page 45: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Example: Synthesis Algorithm Professor: {[PersNr, Name, Level, Room, City, Street,

Zip, AreaCode, Canton, Population, Direktion]}1. {PersNr} {Name, Level, Room, Canton, Street,

Canton}2. {Room} {PersNr}3. {Street, Canton, City} {Zip}4. {City, Canton} {Population, AreaCode}5. {Canton} {Direktion} 6. {Zip} {Canton, City}

Professor: {[PersNr, Name, Level, Room, City, Street, Canton]}

ZipCodes: {[Street, Canton, City, Zip]} Cities: {[City, Canton, Population, AreaCode]} Administration: {[Canton, Direktion]} 45

Page 46: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Example why Step 3 is needed StudentAttends(Legi, Nr, Name, Semester)

Minimum Basis (Step 1){Legi} {Name, Semester}

Relation generated from minimum basis (Step 2)Student(Legi, Name, Semester)

Relation generated from Step 3 attends(Legi, Nr)

The attends relation is needed!46

Page 47: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Corner Case: Step 3R(A, B, C, D) B -> C, DD -> B

Keys of RA, BA, D

Decomposition into 3NF (Synthesis Algorithm)R1(B, C, D)R2(A, B)

N.B. R3(A,D) is not needed!!!Needs to be cleaned up in Step 4!

47

Page 48: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

ZipCodes(Street, Canton, City, Zip) Is ZipCodes in 3NF?Keys: {Street,Canton,City}, {Zip,Street}All attributes are part of keys. There are no evil FDs!

Does the decomposition preserve dependencies?Yes!

Is the decomposition lossless?Professor ZipCodes = {Street,Canton,City}{Street,Canton,City} ZipCodesCriterion of Lemma is fullfilled!

Is ZipCode free of redundancy?48

Page 49: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

ExercisesProof for the following lemmas: The synthesis algorithm preserves dependencies.The synthesis algorithm creates lossless

decompositions.The synthesis algorithm creates relations in 3NF only.The synthesis algorithm creates relations in 2NF only.

49

Page 50: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Synthesis Algo produces 3NF only Let Ri be a relation created by the Synthesis AlgoCase 1: Ri was created in Step 3 of the algoRi contains a key of R there are no non-trivial FDs in Ri Ri is in 3NF

Case 2: Ri was created in Step 2 by an FD: (1) Ri := (2) is a key of Ri

is minimal because of left reduction of minimal basisRi by construction of Ri

(3) is not evil because is a superkey of Ri

(4) Let be any other non-trivial FD () because of right reduction in minimal basis and because

contains only attributes of a key; is not evil qed50

Page 51: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Boyce-Codd-Normal Form (BCNF ) R is in BCNF iff for all in R at least one condition

holds:B (i.e., is trivial)is a superkey of R

R in BCNF implies R in 3NFProof trivial from definition

Resultany schema can be decomposed losslessly into BCNFbut, preservation of dependencies cannot be

guaranteedneed to trade „correctness“ for „efficiency“ that is why 3NF is so important in practice 51

Page 52: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

ZipCodes(Street, Canton, City, Zip)ZipCodes is not in BCNF{Zip} {Canton, City} // evil{Street, Canton, City} {Zip} // okay

Redundancy in ZipCodes (Rämistr., Zürich, Zürich, 8006) (Universitätsstr., Zürich, Zürich, 8006) (Schmid-Str., Zürich, Zürich, 8006)stores several times that 8006 belongs to Zürich

Exercise: How would you model ZipCodes in ER? What would the relational schema look like?

52

Page 53: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Decomposition Algorithm (BCNF) Input: R Output: R1, ..., Rn such that

R1, ..., Rn is a lossless decomposition of R.

R1, ..., Rn are in BCNF.

(Preservation of dependencies is not guaranteed.)

53

Page 54: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Decomposition Algorithm Input: R Output: R1, ..., Rn

result = {R}while ( Ri Z: Ri is not in BCNF))

let be evil in Ri Ri1 = Ri2 = Ri - result = (result – {Ri}) {Ri1} {Ri2}

output(result)

54

Page 55: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Visualization of Decomposition Algo

Ri

Ri1

Ri2

Ri –()

55

Page 56: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Decomposition of ZipCodesZipCodes: {[Street, City, Canton, Zip]}{Zip} {City, Canton} // evil{Street, City, Canton} {Zip} // okay

Applying the decomposition algorithm...Street: {[Zip, Street]}Cities: {[Zip, City, Canton]}

Assessmentdecomposition is losslessdecomposition does not preserve dependencies

56

Page 57: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Cities is not in BCNFCities: {[City, Canton, Direktion, Population]}FDs of Cities:

{City, Canton} {Population} {Canton} {Direktion} {Direktion} {Canton}

Keys: {City, Canton}

In which highest NF is Cities?N.B. decomposition algo can also be applied to non

3NF!

Köln NRW Rütgers

1mio

Bonn NRW Rütgers

200K

Aachen

NRW Rütgers

200K

57

Page 58: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Decomposition of CitiesCities: {[City, Canton, Direktion, Population]}{Canton} {Direktion} // evil{Direktion} {Canton} // evil{City, Canton} {Population} // okay

Ri1: Administration: {[Canton, Direktion]}

Ri2: Cities: {[City, Canton, Population]}

Is this decomposition lossless? Preserves depend.?

58

Page 59: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Fourth Normal Form

Give FDs, keys. Which highest normal form?Does „Skills“ have redundancy? What happens if 3002 learns a new language?

How would you model Skills in ER? How translated?

SkillsPersNr Languag

eProgrammin

g3002 Greek C3002 Latin Pascal3002 Greek Pascal3002 Latin C3005 German Ada

59

Page 60: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

NFNF

This model would work: no redundancy, no anomolies

But, unfortunately, not implementable in SQL 2 It is implementable in SQL 3 and XMLThat is why design theory needs to be adapted to XML

SkillsPersNr Languag

eProgrammin

g3002 {Greek,

Latin}{C, Pascal}

3005 {German} {Ada}

60

Page 61: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

What is wrong with this?

Who knows Greek and Pascal?Anomolies? What happens if 3002 learns a new language?

SkillsPersNr Languag

eProgrammin

g3002 Greek C3002 Latin Pascal3005 German Ada

61

Page 62: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Multi-value Dependencies

A B A C

R

A B Ca b c

a bb cc

a bb c

a b cc

62

Page 63: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Multi-value Dependencies (MVD)

iff t1, t2 R: t1. = t2. t3, t4 R:t3. = t4. = t1. = t2.t3. = t1.t4. = t2.t3. = t2.t4. = t1.

R

A1 ... Ai

Ai+1 ... Aj

Aj+1 ...

Ana1 ... ai ai+1 ... aj aj+1 ... ana1 ... ai bi+1 ... bj bj+1 ... bna1 ... ai bi+1 ... bj aj+1 ... ana1 ... ai ai+1 ... aj bj+1 ... bn

63

Page 64: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

MVD: Example

MVDs of Skills{PersNr}{Language}{PersNr}{Programming}{Language} {PersNr, Programming} (???)

MVDs can result in anomalies and redundancy

SkillsPersNr Languag

eProgrammin

g3002 Greek C3002 Latin Pascal3002 Greek Pascal3002 Latin C3005 German Ada

64

Page 65: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Are MVDs symmetric?NO!NOT {Language}{PersNr}

Exercise: Find examples for symmetric MVDs!

SkillsPersNr Languag

eProgrammin

g3002 Greek C3002 Latin Pascal3002 Greek Pascal3002 Latin C3005 German Ada3007 Greek XQuery

65

Page 66: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

MVD: Example

SkillsPersNr Languag

eProgrammin

g3002 Greek C3002 Latin Pascal3002 Greek Pascal3002 Latin C3005 German Ada

LanguagePersNr Languag

e3002 Greek3002 Latin3005 German

ProgrammingPersNr Programmin

g3002 C3002 Pascal3005 Ada

PersNr, Language PersNr, Programming

66

Page 67: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

MVD: Example

SkillsPersNr Languag

eProgrammin

g3002 Greek C3002 Latin Pascal3002 Greek Pascal3002 Latin C3005 German Ada

LanguagePersNr Languag

e3002 Greek3002 Latin3005 German

ProgrammingPersNr Programmin

g3002 C3002 Pascal3005 Ada

JoinA

67

Page 68: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Example: Drinkersdrinkers: {[name, addr, phones, beersLiked]}some people have several phonessome people like several kind of beers

FDs and MVDsname addrname phonesname beersLiked

Again, MVDs indicate redundancyphones and beersLiked are independent concepts (Would work if you could store them as sets: NFNF.)

68

Page 69: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Let R = R1 R2 R1 := ΠR1 (R) R2 := ΠR2 (R)

Lemma: The decomposition is lossless iff (R1 R2) R1 or (R1 R2) R2

Exercise: Proof of this Lemma.Which direction is easier?

RR1

R2

Lossless Decompositions with MVDs

69

Page 70: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Laws of MVDsTrivial MVDs: RCheck criterion of MVDs ( = R, )

t1, t2 R: t1. = t2. t3, t4 R:t3. = t4. = t1. = t2.t3. = t1.t4. = t2.t3. = t2.t4. = t1.

let t1, t2 R: t1. = t2. set t3=t1, t4=t2 t3. = t4. = t1. = t2. (by def. of t3, t4)t3. = t1.(by def. of t3)t4. = t2.(by def. of t4)t3. = t2.()t4. = t1.() qed

Adapt proof for: R - 70

Page 71: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Laws of MVDsPromotion:

let t1, t2 R: t1. = t2. (1) t1. = t2. ()

set t3=t2, t4=t1 t3. = t4. = t1. = t2. (by def. of t3, t4)t3. = t1.(1)t4. = t2.(1)t3. = t2.(by def. of t3)t4. = t1.(by def. of t4) qed

does not always hold!e.g., PersNr Language, but PersNr Language

71

Page 72: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Laws of MVDs Reflexivity: ( )

Augmentation:

Transitivity:

Complement: R

Multi-value Augmentation: (

Multi-value Transitivity:

Generalization (Promotion): 72

Page 73: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Laws of MVDs (ctd.) Coalesce: ( ) ( = )

Multi-value Union:

Intersection:

Minus: –

Warning: NOT ( Not all rules of FDs apply to MVDs! Exercise: find an example for which this law does not

hold

73

Page 74: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

WARNINGNOT ( Not all rules of FDs apply to MVDs!E.g., {Language} {PersNr, Programming}But, NOT {Language} {PersNr}And NOT {Language} {Programming}

Be careful with MVDs Not a totally intuitive concept

74

Page 75: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Trivial MVDs is trivial iff or = R -

Proof: a trivial MVD holds for any relation.

75

Page 76: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Fourth Normal Form (4NF ) R is in 4NF iff for all at least one condition holds: is trivialis a superkey of R

R in 4NF implies R in BCNFProof is based on . (Can you prove that?)

76

Page 77: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Decomposition Algorithm for 4NF Input: R Output: R1, ..., Rn

result = {R}while ( Ri Z: Ri is not in 4NF))

let be evil in Ri Ri1 = Ri2 = Ri - result = (result – {Ri}) {Ri1} {Ri2}

output(result)

77

Page 78: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Decomposition into 4 NF

78

Page 79: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Example Assistant: {[PersNr, Name, Area, Boss, Language,

Progr.]}

Synthesis Algorithm (3NF)Assistant: {[PersNr, Name, Area, Boss]}Skills: {[PersNr, Language, Programming]}

Decomposition Algorithm (4NF)Assistant: {[PersNr, Name, Area, Boss]}Languages: {[PersNr, Language]}Programming: {[PersNr, Programming]}

79

Page 80: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Summary

Lossless decomposition up to 4NFPreserving dependencies up to 3NF

(implementable in SQL92)

SynthesisAlgorithm

DecompositionAlgorithm

lossless &preserv. dep.

lossless

80

Page 81: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Exercise

Find FDs, MVDs, keys Decompose into 3NF, BCNF, 4NF

Family TreeChild Father Mother Grandpa GrandmaSofie Alfons Sabine Lothar LindeSofie Alfons Sabine Hubert LisaNiklas Alfons Sabine Lothar LindeNiklas Alfons Sabine Hubert LisaTobias Leo Bertha Hubert Martha

… … … … …

81

Page 82: Relational Design Theory  Assess the quality of a schema  redundancy  integrity constraints  Quality seal: normal forms (1-4, BCNF)  Improve the quality.

Exercise: 1:N & N:M Relationships

R

A B

S

C D

T

E F

V

G H

1

N

N M

N

M

UR: {[ A , B , C , D , E , F , G , H ]}82