Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...•...
Transcript of Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...•...
![Page 1: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/1.jpg)
Schema Refinement and Normal Forms
V2.0 2011/09/09 Jeffrey D. Ullman – Jean-Michel Busca
![Page 2: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/2.jpg)
Outline
1. Problem statment
• Redundancy, Anomalies
• Refinement Process, Properties
2. Functional Dependencies
3. Normal Forms
4. Decomposition Algorithm
1
![Page 3: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/3.jpg)
Why Refine Schemas?
Problems:
• Sally's address is recorded multiple times
• the same for Miller's manufacturer
• Bob does not like beer: NULL values
2
name addr beer manf Sally 123 Maple Bud A.B. Sally 123 Maple Miller Pete's Tom 45 Bank Miller Pete's Bob 10 River NULL NULL
![Page 4: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/4.jpg)
Why Refine Schemas? (2)
Consequences:
1. redundancy and NULL values waste space
in the database
2. redundancy can lead to several anomalies
when updating data
3
![Page 5: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/5.jpg)
name addr beer manf Sally 10 Bank Bud A.B. Sally 123 Maple Miller Pete's Tom 45 Bank Miller Pete's Bob 10 River NULL NULL
Update Anomalies
If Sally moves:
• we need to change all her tuples
• otherwise an inconsistency is created
4
![Page 6: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/6.jpg)
If Sally does not like Bud anymore:
• we loose track of Bud's manufacturer
(if Sally was the only one to like Bud)
name addr beer manf Sally 10 Bank Bud A.B. Sally 123 Maple Miller Pete's Tom 45 Bank Miller Pete's Bob 10 River NULL NULL
Delete Anomalies
5
![Page 7: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/7.jpg)
If we now state that Bob likes Bud Lite:
• we need to know Bud Lite's manufacturer
(if Bob is the first to like Bud Lite)
name addr beer manf Sally 10 Bank Bud A.B. Sally 123 Maple Miller Pete's Tom 45 Bank Miller Pete's Bob 10 River Bud Lite ???
Insert Anomalies
6
![Page 8: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/8.jpg)
Schema Refinement
The goal of schema refinement is to redesign the database schema so as to eliminate:
• redundancy and NULL values
• update, delete, insert anomalies
Schema refinement relies on the theory of Functional Dependencies.
7
![Page 9: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/9.jpg)
Example: Functional Dependencies
Two functional dependencies here:
• name determines addr (name -> addr)
• beer determines manf (beer -> manf)
• ??? can be deduced from name/beer
• this is why data is redundant
8
name addr beer manf Sally 123 Maple Bud A.B. Sally ??? Miller Pete's Tom 45 Bank Miller ???
![Page 10: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/10.jpg)
Refinement Process
How do we know refinement is needed?
• Redundancy arises when a relation schema forces an association between attributes that is not natural.
• Functional dependencies help identify such situations and suggest one or more refinements.
9
![Page 11: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/11.jpg)
Refinement Process (2)
Redundant relations are then decomposed into smaller non-redundant ones.
• Non-redundant relations are projections of the original relation.
• Example: BadDesign(name, addr, beer, manuf) is decomposed into:
- Drinkers(name, addr)
- Beers(beer, manf)
- Likes(name, beer)
10
![Page 12: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/12.jpg)
Decomposition Properties
There are two important properties of a decomposition:
1. Lossless Join : it should be possible to project the original relations onto the decomposed schema, and then reconstruct the original.
2. Dependency Preservation : it should be possible to check in the projected relations whether all the given FD’s are satisfied. 11
![Page 13: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/13.jpg)
Outline
1. Problem statment
2. Functional Dependencies
• Definition, Axiomes, Closure Set
• Super Keys, Keys
3. Normal Forms
4. Decomposition Algorithm
12
![Page 14: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/14.jpg)
13
Functional Dependencies
X ->Y reads "Y functionally depends on X" or "X determines Y"
X ->Y is an assertion about a relation R that whenever two tuples of R agree on all the attributes of X, then they also agree on all attributes of Y:
• For any tuples t1, t2 of R
• If t1[X] = t2[X] then t1[Y] = t2[Y]
• Say "X -> Y holds in R"
![Page 15: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/15.jpg)
14
Functional Dependencies (2)
Conventions:
• X, Y, Z,… represent sets of attributes.
• A, B, C,… represent single attributes.
• no set formers in sets of attributes:
- just ABC, rather than {A,B,C }.
![Page 16: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/16.jpg)
Armstrong's Axioms
Reflexivity
If Y ⊆ X then X -> Y
Augmentation
If X -> Y then XZ -> YZ
Transitivity
If X -> Y and Y -> Z then X -> Z
15
![Page 17: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/17.jpg)
Armstrong's Axioms (2)
Derived properties:
Union
If X -> Y and X -> Z then X -> YZ
Decomposition
If X -> YZ then X -> Y and X -> Z
Pseudo-transitivity
If X -> Y and V,Y -> Z then VX -> Z
16
![Page 18: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/18.jpg)
Elementary FD
X -> A is an elementary FD iif:
1. A is a single attribute
2. A does not belong to X
3. There is no X' X such that X' -> A
When looking for FDs, we are only concerned by elementary FDs.
17
![Page 19: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/19.jpg)
18
Example: FD’s
Drinkers(name, addr, beersLiked, manf, favBeer)
Reasonable FD’s to assert:
1. name -> addr favBeer
Note this FD is the same as
name -> addr
name -> favBeer
2. beersLiked -> manf
![Page 20: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/20.jpg)
19
Example: Possible Data
name addr beersLiked manf favBeer Janeway Voyager Bud A.B. WickedAle Janeway Voyager WickedAle Pete’s WickedAle Spock Enterprise Bud A.B. Bud
Because name -> addr Because name -> favBeer
Because beersLiked -> manf
![Page 21: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/21.jpg)
20
Keys of Relations
FDs generalize the notion of key.
K is a superkey for relation R if
K determines all attributes of R.
K is a key for R if K is a superkey,
but no proper subset of K is a
superkey: K is a minimal superkey.
![Page 22: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/22.jpg)
21
Example: Superkey
Drinkers(name, addr, beersLiked, manf, favBeer)
name, beersLiked is a superkey because together these attributes determine all the other attributes.
• name -> addr favBeer
• beersLiked -> manf
![Page 23: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/23.jpg)
22
Example: Key
name, beersLiked is a key because neither name nor beersLiked is a superkey.
• name doesn’t -> manf
• beersLiked doesn’t -> addr
There are no other keys, but lots of superkeys.
• Any superset of name, beersLiked.
![Page 24: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/24.jpg)
23
How to find relevant FDs?
1. Look for IDs:
• SSn is an ID for People.
• Thus SSn->name, SSn->firstName, etc.
2. Look for properties:
• "No two courses can meet in the same room at the same time."
• Thus: hour room -> course
![Page 25: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/25.jpg)
24
How to find relevant FDs? (2)
3. Infer other elementary FDs using
Armstrong's axioms and properties.
• If A -> B and B -> C,
• Then A -> C also holds.
![Page 26: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/26.jpg)
Outline
1. Problem statment
2. Functional Dependencies
3. Normal Forms
• Definition
• 1NF, 2NF, 3NF
4. Decomposition Algorithm
25
![Page 27: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/27.jpg)
Normal Forms
A Normal Form defines the quality of a database schema with respect to redundancy.
Several Normal Forms have been proposed, based on FDs:
• 1NF, 2NF, 3NF
• Boyce-Codd NF, 4NF, 5NF, …
26
![Page 28: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/28.jpg)
Normal Forms (2)
Note: Other NFs exist that are based on more sophisticated integrity constraints, such as multivalued dependencies and join dependencies.
NFs have increasingly restrictive requirements, i.e.
• a schema in 2NF is also in 1NF
• a schema in 3NF is also in 2NF
27
![Page 29: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/29.jpg)
First Normal Form
A schema is 1FN if all attributes are atomic, i.e. neither compound or multi-valued.
Every relational schema is 1NF by definition.
28
![Page 30: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/30.jpg)
Second Normal Form
A schema is in 2NF iif:
1. It is in 1NF
2. For every non-key attribute A, for every candidate key K, A depends on the whole key K rather than just a part of it.
Counter example:
R(A, B, C, D) with A -> C
R is not in 2NF
29
![Page 31: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/31.jpg)
30
Example: 2NF?
Drinkers(name, addr, beersLiked, manf, favBeer)
FD’s: name->addr, name->favBeer, beersLiked->manf
The only key is name, beersLiked.
adr depends on a subset of the key.
Conclusion: Drinkers is not in 2NF and thus not in 3NF either.
![Page 32: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/32.jpg)
Third Normal Form
A schema is in 3NF iif:
1. It is in 2NF
2. For every non-key attribute A, A does not depend on another non-key attribute.
Counter example:
R(A, B, C) with B -> C
R is not in 3NF
31
![Page 33: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/33.jpg)
32
Example: 3NF?
Beers(name, manf, manfAddr) FD’s: name->manf, manf->manfAddr, name->manfAddr
Only key is name.
The key has only one attribute: Beers is in 2NF.
manf->manfAddr: there is a FD between two non-key attributes.
Conclusion: Beers is not in 3NF.
![Page 34: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/34.jpg)
Outline
1. Problem statment
2. Functional Dependencies
3. Normal Forms
4. Decomposition Algorithm
33
![Page 35: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/35.jpg)
2FN Decomposition Algorithm
Start: a relation R in 1NF
For every key subset X that determines non-key attributes A, B, C, …:
• create a new relation Rx with primary key X and A, B, C, … as non-key attributes
• remove attributes A, B, C, … from original relation R
34
![Page 36: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/36.jpg)
3FN Decomposition Algorithm
Start: a relation R in 2NF
For every non-key attribute set X that determines non-key attributes A, B, C, …:
• create a new relation Rx with primary key X and A, B, C, … as non-key attributes
• remove attributes A, B, C, … from original relation R
35
![Page 37: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm](https://reader030.fdocuments.net/reader030/viewer/2022041001/5ea2e6194c5de86e5d48b4fe/html5/thumbnails/37.jpg)
3FN Decomposition Algorithm (2)
Note that Rx is not necessarily in 3NF.
If so, repeat the process on Rx.
36