Normalization
Sridhar Narayan (narayans@uncw.edu)

EMP_PROJ

| SSN | PNUMBER | HOURS | ENAME | PNAME | PLOC |
|---|---|---|---|---|---|
| E1 | P1 | 20 | Joe | CIS Roof | UNCW |
| E1 | P2 | 20 | Joe | Restaurant | Mayfaire |
| E2 | P1 | 40 | Joe | CIS Roof | UNCW |

• Something feels wrong about this design
• Try adding a row – Insertion anomaly
• Try deleting a row – Deletion anomaly
• Try updating a row – Update anomaly
• Need a formal way to reason about what is wrong with it and how to fix it
Functional Dependency
• Constraints between attribute sets in a relation
• If X and Y are sets of attributes of a relation R, and whenever two tuples in R have the same X-values they also have the same Y-values, we say that X functionally determines Y.

Functional Dependency
• Written as X -> Y
– X functionally determines Y
– Y is functionally determined by X
– X is the determinant, Y is the dependent
• Examples
– SSN -> SSN (trivial dependency)
– PNUMBER -> PNAME
– SSN -> ENAME
– SSN, PNUMBER -> HOURS
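
The definition can be tried out against the three EMP_PROJ rows with a short Python sketch. The helper name `holds_in_instance` is an assumption made for illustration. A check like this can only show that one snapshot does not contradict X -> Y; it cannot prove that the dependency is a rule of the relation.

```python
COLS = ("SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOC")
EMP_PROJ = [dict(zip(COLS, t)) for t in [
    ("E1", "P1", 20, "Joe", "CIS Roof", "UNCW"),
    ("E1", "P2", 20, "Joe", "Restaurant", "Mayfaire"),
    ("E2", "P1", 40, "Joe", "CIS Roof", "UNCW"),
]]


def holds_in_instance(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y-value seen for each X-value; a different one is a violation.
        if seen.setdefault(x, y) != y:
            return False
    return True


ssn_to_ename = holds_in_instance(EMP_PROJ, ["SSN"], ["ENAME"])
ename_to_ssn = holds_in_instance(EMP_PROJ, ["ENAME"], ["SSN"])
hours_to_ssn = holds_in_instance(EMP_PROJ, ["HOURS"], ["SSN"])
```

Here `ename_to_ssn` comes out false because two employees share the name Joe. `hours_to_ssn` comes out true for this snapshot, although nothing in the slides lists HOURS -> SSN as a dependency, which is why an instance alone is not enough.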

Functional Dependency
• Between sets of attributes, not just single attributes
• Holds for all time, not just for a particular instance (snapshot) of a relation
• Formally states constraints that exist for the relation
– These constraints are in addition to those imposed by primary keys and foreign keys

Functional dependencies and keys
• If X functionally determines all attributes of R, then X is a super key
• If a super key X is irreducible, i.e. no attribute can be removed from X without X ceasing to determine all attributes of R, then X is a candidate key
• Attributes that are part of a candidate key are key attributes

Examples
Super key:
– SSN, PNUMBER, PNAME -> SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC
Candidate key:
– SSN, PNUMBER -> SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC

| SSN | PNUMBER | HOURS | ENAME | PNAME | PLOC |
|---|---|---|---|---|---|
| E1 | P1 | 20 | Joe | CIS Roof | UNCW |
| E1 | P2 | 20 | Joe | Restaurant | Mayfaire |
| E2 | P1 | 40 | Joe | CIS Roof | UNCW |
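
One way to test these claims mechanically is the attribute-closure computation, sketched below in Python. The dependency list is the one the slides give for EMP_PROJ (SSN -> ENAME; PNUMBER -> PNAME, PLOC; SSN, PNUMBER -> HOURS). The function names are assumptions, and the brute-force search is only sensible for small schemas.

```python
from itertools import combinations

ATTRS = {"SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOC"}
FDS = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOC"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]


def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_super_key(attrs, all_attrs, fds):
    return closure(attrs, fds) == set(all_attrs)


def candidate_keys(all_attrs, fds):
    """Irreducible super keys, found by trying smaller attribute sets first."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            # Skip any set that contains an already-found key: it is reducible.
            if is_super_key(candidate, all_attrs, fds) and not any(k <= candidate for k in keys):
                keys.append(candidate)
    return keys


keys = candidate_keys(ATTRS, FDS)
```

With these dependencies, `{SSN, PNUMBER, PNAME}` is a super key but not a candidate key, because removing PNAME still leaves a super key.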

Redundancy
• If in a relation R, A -> B is a non-trivial dependency and A is not a candidate key for R, then R will involve some redundancy.
• Example: in EMP_PROJ(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC), SSN -> ENAME holds but SSN alone is not a candidate key, so ENAME is repeated for every project an employee works on.
• Intuitively, every functional dependency in a relation should have a candidate key as its determinant to eliminate redundancy

Normalization
• A process that uses functional dependencies to identify relation schemas that have an undesirable form (redundancy) and decomposes them into smaller schemas in which the redundancy has been eliminated.

Decomposition
• Decomposition should be
– Lossless join
• Allow exact recovery of the original relation (without spurious tuples)
– Dependency preserving
• Allow dependencies to be checked without requiring a join

Lossy decomposition

| SSN | PNUMBER | HOURS | ENAME |
|---|---|---|---|
| E1 | P1 | 20 | Joe |
| E1 | P2 | 20 | Joe |
| E2 | P1 | 40 | Joe |

| ENAME | PNAME | PLOC |
|---|---|---|
| Joe | CIS Roof | UNCW |
| Joe | Restaurant | Mayfaire |
| Joe | CIS Roof | UNCW |

Natural join to recover original

| SSN | PNUMBER | HOURS | ENAME | PNAME | PLOC |
|---|---|---|---|---|---|
| E1 | P1 | 20 | Joe | CIS Roof | UNCW |
| E1 | P2 | 20 | Joe | Restaurant | Mayfaire |
| E2 | P1 | 40 | Joe | CIS Roof | UNCW |
| E2 | P1 | 40 | Joe | Restaurant | Mayfaire |

• The last row is a spurious tuple: it is not in the original EMP_PROJ. Because every row has ENAME = Joe, the full join on ENAME also pairs E1/P1 with Restaurant and E1/P2 with CIS Roof, so the original cannot be recovered.
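
The effect of joining on ENAME can be reproduced with a small Python sketch. The helpers `project` and `natural_join` are assumptions written for this illustration; both treat a relation as a list of rows without duplicates.

```python
COLS = ("SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOC")
emp_proj = [dict(zip(COLS, t)) for t in [
    ("E1", "P1", 20, "Joe", "CIS Roof", "UNCW"),
    ("E1", "P2", 20, "Joe", "Restaurant", "Mayfaire"),
    ("E2", "P1", 40, "Joe", "CIS Roof", "UNCW"),
]]


def project(rows, attrs):
    """Keep only attrs from each row and drop duplicate rows."""
    out = []
    for row in rows:
        p = {a: row[a] for a in attrs}
        if p not in out:
            out.append(p)
    return out


def natural_join(r, s):
    """Join two lists of rows on every attribute name they share."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    out = []
    for a in r:
        for b in s:
            if all(a[c] == b[c] for c in common):
                merged = {**a, **b}
                if merged not in out:
                    out.append(merged)
    return out


left = project(emp_proj, ["SSN", "PNUMBER", "HOURS", "ENAME"])
right = project(emp_proj, ["ENAME", "PNAME", "PLOC"])
joined = natural_join(left, right)
spurious = [row for row in joined if row not in emp_proj]
```

Every original row is still present in `joined`, but so are rows that never existed, which is what makes the decomposition lossy.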

Heath’s Theorem
• If relation R = {A, B, C}, where A, B, C are attribute sets,
• and A -> B,
• then R1 = {A, B} and R2 = {A, C} represent a lossless decomposition
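
As an illustration of the theorem, the sketch below splits EMP_PROJ on SSN -> ENAME, taking A = {SSN}, B = {ENAME} and C = the remaining attributes, then rejoins the two projections. The helper names are assumptions, and a check on one instance illustrates the theorem rather than proving it.

```python
COLS = ("SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOC")
emp_proj = [dict(zip(COLS, t)) for t in [
    ("E1", "P1", 20, "Joe", "CIS Roof", "UNCW"),
    ("E1", "P2", 20, "Joe", "Restaurant", "Mayfaire"),
    ("E2", "P1", 40, "Joe", "CIS Roof", "UNCW"),
]]


def project(rows, attrs):
    """Keep only attrs from each row and drop duplicate rows."""
    out = []
    for row in rows:
        p = {a: row[a] for a in attrs}
        if p not in out:
            out.append(p)
    return out


def natural_join(r, s):
    """Join two lists of rows on every attribute name they share."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    out = []
    for a in r:
        for b in s:
            if all(a[c] == b[c] for c in common):
                merged = {**a, **b}
                if merged not in out:
                    out.append(merged)
    return out


def same_rows(r, s):
    """True if two duplicate-free row lists contain exactly the same rows."""
    return len(r) == len(s) and all(row in s for row in r)


r1 = project(emp_proj, ["SSN", "ENAME"])                              # {A, B}
r2 = project(emp_proj, ["SSN", "PNUMBER", "HOURS", "PNAME", "PLOC"])  # {A, C}
rejoined = natural_join(r1, r2)
lossless = same_rows(rejoined, emp_proj)
```

The shared attribute here is SSN, the determinant of the dependency, so the join reproduces exactly the original three rows.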

Levels of normalization
• First normal form – 1NF
• Second normal form – 2NF
• Third normal form – 3NF
• Boyce-Codd Normal Form – BCNF
Increasingly stringent requirements

Normal Forms
[Figure: diagram relating 1NF, 2NF, 3NF, and BCNF]

First normal form
• Relation is in 1NF if all attribute values are atomic (By definition, all relations are in 1NF)

| D_NAME | D_NUM | MGR_SSN | D_LOCATIONS |
|---|---|---|---|
| RESEARCH | 5 | 334619276 | {Lumberton, Red Springs, Raeford} |

• Assume that a department can have multiple locations, like {Lumberton, Red Springs, Raeford}
• Relation not in 1NF

Resolution?

| D_NAME | D_NUM | MGR_SSN | D_LOCATIONS |
|---|---|---|---|
| RESEARCH | 5 | 334619276 | Lumberton |
| RESEARCH | 5 | 334619276 | Red Springs |
| RESEARCH | 5 | 334619276 | Raeford |

Decomposition
(D_NAME, D_NUM, MGR_SSN, D_LOCATIONS)
decomposes into
(D_NAME, D_NUM, MGR_SSN)
(D_NUM, D_LOCATIONS)
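
A minimal Python sketch of this decomposition follows, assuming the set-valued D_LOCATIONS is held as a Python set. The function name `split_multivalued` is hypothetical.

```python
department = {
    "D_NAME": "RESEARCH",
    "D_NUM": 5,
    "MGR_SSN": "334619276",
    "D_LOCATIONS": {"Lumberton", "Red Springs", "Raeford"},
}


def split_multivalued(row, key, multi):
    """Split one row into a base row and one (key, value) row per element of the set-valued attribute."""
    base = {k: v for k, v in row.items() if k != multi}
    # Sorting only makes the output order deterministic.
    detail = [{key: row[key], multi: value} for value in sorted(row[multi])]
    return base, detail


dept_row, location_rows = split_multivalued(department, "D_NUM", "D_LOCATIONS")
```

Each resulting row holds only atomic values, and the department's name and manager are stored once instead of once per location.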

Second Normal Form: 2NF
• A relation is in 2NF if
– It is in 1NF, and
– Every non-key attribute is fully (irreducibly) dependent on the primary key

Example: EMP_PROJ
EMP_PROJ(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC)
• Functional Dependencies?
– SSN -> ENAME
– PNUMBER -> PNAME, PLOC
– {SSN, PNUMBER} -> HOURS
• Relation not in 2NF
– Non-key attributes ENAME, PNAME, and PLOC are not fully dependent on the primary key
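
The partial dependencies can be found by checking which non-key attributes are already determined by a proper subset of the primary key. The Python sketch below does this with attribute closure. The function names are assumptions, and the dependency list is the one given for EMP_PROJ.

```python
from itertools import combinations

FDS = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOC"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]


def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, non_key, fds):
    """Map each non-key attribute to the proper subsets of key that determine it."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & set(non_key)
            for attr in sorted(determined):
                found.setdefault(attr, []).append(set(subset))
    return found


violations = partial_dependencies(
    {"SSN", "PNUMBER"}, {"HOURS", "ENAME", "PNAME", "PLOC"}, FDS
)
```

HOURS does not appear in `violations` because it needs the whole key. ENAME, PNAME, and PLOC do, which is why EMP_PROJ is not in 2NF.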

Solution? Decompose
1a: (SSN, PNUMBER, HOURS) 2NF
1b: (SSN, PNUMBER, ENAME, PNAME, PLOC) 2NF?

Decompose further…
2a: (SSN, ENAME) 2NF
2b: (SSN, PNUMBER, PNAME, PLOC) 2NF?

And a little more…
3a: (PNUMBER, PNAME, PLOC) 2NF
3b: (SSN, PNUMBER). 3b is a part of 1a, so drop it.

2NF Normalization
1a: (SSN, PNUMBER, HOURS) 2NF
2a: (SSN, ENAME) 2NF
3a: (PNUMBER, PNAME, PLOC) 2NF

More than one way to get here
(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC)
1a: (PNUMBER, PNAME, PLOC) 2NF
1b: (SSN, PNUMBER, HOURS, ENAME) Not 2NF

Decompose further…
2a: (SSN, PNUMBER, HOURS) 2NF
2b: (SSN, PNUMBER, ENAME) Not 2NF

And a little bit more
3a: (SSN, PNUMBER) Redundant
3b: (SSN, ENAME) 2NF

3NF Normalization
• A relation is in 3NF if
– It is in 2NF, and
– The non-key attributes are mutually independent. That is, no functional dependencies exist between non-key attributes.

Example: EMP_DEPT
EMP_DEPT(SSN, ENAME, DOB, ADDRESS, DNUM, DNAME, DMGRSSN)
• Functional Dependencies?
– SSN -> {ENAME, DOB, ADDRESS, DNUM}
– DNUM -> {DNAME, DMGRSSN}
• Redundancy?
• Relation in 1NF?
• 2NF?
• 3NF?

3NF Normalization
1a: (SSN, ENAME, DOB, ADDRESS, DNUM)
1b: (DNUM, DNAME, DMGRSSN)
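
The 3NF condition used in these slides (no dependencies between non-key attributes) can be checked against a list of dependencies with the Python sketch below. The function name is an assumption, and the check only inspects the dependencies it is given, not every dependency they imply.

```python
def non_key_dependencies(key, all_attrs, fds):
    """Return the listed FDs in which non-key attributes determine other non-key attributes."""
    non_key = set(all_attrs) - set(key)
    return [
        (lhs, rhs & non_key)
        for lhs, rhs in fds
        if lhs <= non_key and (rhs & non_key) - lhs
    ]


EMP_DEPT = {"SSN", "ENAME", "DOB", "ADDRESS", "DNUM", "DNAME", "DMGRSSN"}
EMP_DEPT_FDS = [
    ({"SSN"}, {"ENAME", "DOB", "ADDRESS", "DNUM"}),
    ({"DNUM"}, {"DNAME", "DMGRSSN"}),
]

before = non_key_dependencies({"SSN"}, EMP_DEPT, EMP_DEPT_FDS)

# After decomposition, each relation keeps only the FD that applies to it.
employee = non_key_dependencies(
    {"SSN"}, {"SSN", "ENAME", "DOB", "ADDRESS", "DNUM"}, [EMP_DEPT_FDS[0]]
)
dept = non_key_dependencies(
    {"DNUM"}, {"DNUM", "DNAME", "DMGRSSN"}, [EMP_DEPT_FDS[1]]
)
```

In the original EMP_DEPT, DNUM is a non-key attribute that determines DNAME and DMGRSSN, so the relation is in 2NF but not 3NF. In the two decomposed relations no such dependency remains.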

BCNF Normalization
• S# and SNAME – Supplier# and Supplier Name are unique
• FDs
– S# -> SNAME
– SNAME -> S#
– S#, P# -> QTY
– SNAME, P# -> QTY
• Candidate keys
– S#, P# and SNAME, P#

| S# | SNAME | P# | QTY |
|---|---|---|---|
| S1 | Acme Supply | P1 | 100 |
| S2 | Gem Mfg | P1 | 200 |
| S1 | Acme Supply | P2 | 400 |

BCNF Normalization
• Redundancy?
• 1NF?
• 2NF?
• 3NF?

| S# | SNAME | P# | QTY |
|---|---|---|---|
| S1 | Acme Supply | P1 | 100 |
| S2 | Gem Mfg | P1 | 200 |
| S1 | Acme Supply | P2 | 400 |

BCNF
• Relation is in BCNF if and only if the only determinants are candidate keys
• FDs
– S# -> SNAME
– SNAME -> S#
– S#, P# -> QTY
– SNAME, P# -> QTY
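
The BCNF condition can be tested on these dependencies with a short Python sketch: a non-trivial dependency violates BCNF when the closure of its determinant is not the whole relation. The function names are assumptions.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose determinant does not determine every attribute."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(all_attrs)
    ]


SUPPLY = {"S#", "SNAME", "P#", "QTY"}
SUPPLY_FDS = [
    ({"S#"}, {"SNAME"}),
    ({"SNAME"}, {"S#"}),
    ({"S#", "P#"}, {"QTY"}),
    ({"SNAME", "P#"}, {"QTY"}),
]

before = bcnf_violations(SUPPLY, SUPPLY_FDS)

# Decomposed relations, each with the FDs that mention only its own attributes.
shipments = bcnf_violations({"S#", "P#", "QTY"}, [({"S#", "P#"}, {"QTY"})])
suppliers = bcnf_violations({"S#", "SNAME"}, SUPPLY_FDS[:2])
```

In the original relation, S# and SNAME are determinants without being candidate keys, so both dependencies between them are reported. Splitting into (S#, P#, QTY) and (S#, SNAME) leaves no violations.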

BCNF Normalization

| S# | P# | QTY |
|---|---|---|
| S1 | P1 | 100 |
| S2 | P1 | 200 |
| S1 | P2 | 400 |

| S# | SNAME |
|---|---|
| S1 | Acme Supply |
| S2 | Gem Mfg |

Two candidate keys:
• S#
• SNAME