Post on 11-Jan-2016
Normalization
• Also called “loss-less decomposition”
• Process of optimizing table structures to eliminate redundancy and avoid anomalies and problems with extensibility.
• Supports the golden rule: Each fact should be stored in the database only once.
• Does not provide the solution to all design problems but provides a solid foundation.
Normal Forms
• 1st Normal Form
• 2nd Normal Form
• 3rd Normal Form
• BCNF
• 4th Normal Form
• 5th Normal Form
• Domain-Key Normal Form
1st Normal Form
First Normal Form is violated if:
• The relation has no identifiable primary key.
• An attempt has been made to store a multi-valued fact in a tuple.
1st NF - Example
Evaluate the design solutions on the next four slides for:
• Query-ability
• Join-ability
• Constrain-ability
• Extensibility (of Language Domain)
• Extensibility (of Schema)
1NF Example – Schema 1 (correct)
Programs Table
EMPID  LANGUAGE
23     COBOL
23     JAVA
23     SQL
31     SQL
32     COBOL
32     JAVA
32     SQL
32     VB
36     JAVA
36     SQL
36     VB
37     COBOL
37     SQL

Employees Table
EMPID  LNAME     FNAME   DEPT  PHONE     SALARY  SEX
23     Jones     Mark    ITR   555-1087  45000   M
25     Smith     Sara    FINC  555-2222  55000   F
26     Billings  David   ACTG  555-4356  42000   M
31     Dance     Ivanna  ACTG  444-4887  60000   F
32     Jones     Mary    ITR   555-8745  70000   F
35     Barker    Bob     ACTG  555-6565  44000   M
36     Woods     Robin   ITR   555-9812  90000   M
37     Jones     Mary    FINC  555-1234  56000   F

Languages Table
NAME   FULLNAME
COBOL  COmmon Business Oriented Language
SQL    Structured Query Language
JAVA   JAVA
VB     Visual Basic
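To make the query-ability and join-ability criteria concrete, here is a minimal sketch of Schema 1 using Python's built-in sqlite3 module. The lower-case table and column names are assumptions chosen for the illustration, and the Employees table is abbreviated to three columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (empid INTEGER PRIMARY KEY, lname TEXT, fname TEXT);
CREATE TABLE languages (name TEXT PRIMARY KEY, fullname TEXT);
CREATE TABLE programs (
    empid    INTEGER REFERENCES employees(empid),
    language TEXT    REFERENCES languages(name),
    PRIMARY KEY (empid, language)  -- one row per (employee, language) fact
);
""")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    (23, "Jones", "Mark"), (25, "Smith", "Sara"), (26, "Billings", "David"),
    (31, "Dance", "Ivanna"), (32, "Jones", "Mary"), (35, "Barker", "Bob"),
    (36, "Woods", "Robin"), (37, "Jones", "Mary"),
])
conn.executemany("INSERT INTO languages VALUES (?, ?)", [
    ("COBOL", "COmmon Business Oriented Language"),
    ("SQL", "Structured Query Language"),
    ("JAVA", "JAVA"),
    ("VB", "Visual Basic"),
])
conn.executemany("INSERT INTO programs VALUES (?, ?)", [
    (23, "COBOL"), (23, "JAVA"), (23, "SQL"), (31, "SQL"),
    (32, "COBOL"), (32, "JAVA"), (32, "SQL"), (32, "VB"),
    (36, "JAVA"), (36, "SQL"), (36, "VB"), (37, "COBOL"), (37, "SQL"),
])

# Who programs in COBOL? A plain equality test plus one join answers it.
cobol_programmers = conn.execute("""
    SELECT e.empid, e.lname, e.fname
    FROM employees e JOIN programs p ON p.empid = e.empid
    WHERE p.language = 'COBOL'
    ORDER BY e.empid
""").fetchall()
print(cobol_programmers)
```

With this layout a new language is one more row in the Languages table, and a new skill is one more row in the Programs table; no column has to change.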
1NF Example – Schema 2 (incorrect)
Employees Table
EMPID  LNAME     FNAME   DEPT  PHONE     SALARY  SEX  LANGUAGES
23     Jones     Mark    ITR   555-1087  45000   M    COBOL, JAVA, SQL
25     Smith     Sara    FINC  555-2222  55000   F
26     Billings  David   ACTG  555-4356  42000   M
31     Dance     Ivanna  ACTG  444-4887  60000   F    SQL
32     Jones     Mary    ITR   555-8745  70000   F    JAVA, SQL, VB, COBOL
35     Barker    Bob     ACTG  555-6565  44000   M
36     Woods     Robin   ITR   555-9812  90000   M    VB, SQL, JAVA
37     Jones     Mary    FINC  555-1234  56000   F    COBOL, SQL

Languages Table
NAME   FULLNAME
COBOL  COmmon Business Oriented Language
SQL    Structured Query Language
JAVA   JAVA
VB     Visual Basic
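One way to see why Schema 2 scores poorly on query-ability is to try to answer "who knows language X?" against the comma-separated field. The sketch below is illustrative only: the helper names are made up for this example, and the language "C" is a hypothetical search term that does not appear in the slides.

```python
# Schema 2 keeps all of an employee's languages in a single string.
languages_by_empid = {
    23: "COBOL, JAVA, SQL",
    31: "SQL",
    32: "JAVA, SQL, VB, COBOL",
    36: "VB, SQL, JAVA",
    37: "COBOL, SQL",
}

def knows_naive(langs, language):
    # Substring test, comparable to SQL's LIKE '%...%'.
    return language in langs

def knows_parsed(langs, language):
    # Correct answer, but only after splitting the multi-valued field apart.
    return language in [item.strip() for item in langs.split(",")]

print(knows_naive(languages_by_empid[23], "C"))    # True: "C" matches inside "COBOL"
print(knows_parsed(languages_by_empid[23], "C"))   # False
print(knows_parsed(languages_by_empid[32], "VB"))  # True
```

The parsing step is work the database cannot index, join on, or constrain with a foreign key to the Languages table.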
1NF Example – Schema 3 (incorrect)
Employees Table
EMPID  LNAME     FNAME   DEPT  PHONE     SALARY  SEX  LANG1  LANG2  LANG3  LANG4
23     Jones     Mark    ITR   555-1087  45000   M    COBOL  SQL    JAVA
25     Smith     Sara    FINC  555-2222  55000   F
26     Billings  David   ACTG  555-4356  42000   M
31     Dance     Ivanna  ACTG  444-4887  60000   F    SQL
32     Jones     Mary    ITR   555-8745  70000   F    SQL    JAVA   VB     COBOL
35     Barker    Bob     ACTG  555-6565  44000   M
36     Woods     Robin   ITR   555-9812  90000   M    VB     SQL    JAVA
37     Jones     Mary    FINC  555-1234  56000   F    COBOL  SQL

Languages Table
NAME   FULLNAME
COBOL  COmmon Business Oriented Language
SQL    Structured Query Language
JAVA   JAVA
VB     Visual Basic
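The repeating-column design of Schema 3 can be sketched the same way. The example below assumes lower-case column names and one particular assignment of languages to the LANG1 to LANG4 slots; the slot order is a guess, and only the set of languages per employee comes from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        empid INTEGER PRIMARY KEY,
        lang1 TEXT, lang2 TEXT, lang3 TEXT, lang4 TEXT
    )
""")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?, ?)", [
    (23, "COBOL", "SQL", "JAVA", None),
    (25, None, None, None, None),
    (31, "SQL", None, None, None),
    (32, "SQL", "JAVA", "VB", "COBOL"),
    (36, "VB", "SQL", "JAVA", None),
    (37, "COBOL", "SQL", None, None),
])

# Every slot has to be tested, because a language may sit in any of them.
vb_programmers = conn.execute("""
    SELECT empid FROM employees
    WHERE lang1 = 'VB' OR lang2 = 'VB' OR lang3 = 'VB' OR lang4 = 'VB'
    ORDER BY empid
""").fetchall()
print(vb_programmers)
```

Employee 32 already fills all four slots, so a fifth language for that employee would require a schema change, and a join to the Languages table needs one join condition per slot.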
1NF Example – Schema 4 (incorrect)
Employees Table
EMPID  LNAME     FNAME   DEPT  PHONE     SALARY  SEX  COBOL  JAVA  SQL  VB
23     Jones     Mark    ITR   555-1087  45000   M    T      T     T    F
25     Smith     Sara    FINC  555-2222  55000   F    F      F     F    F
26     Billings  David   ACTG  555-4356  42000   M    F      F     F    F
31     Dance     Ivanna  ACTG  444-4887  60000   F    F      F     T    F
32     Jones     Mary    ITR   555-8745  70000   F    T      T     T    T
35     Barker    Bob     ACTG  555-6565  44000   M    F      F     F    F
36     Woods     Robin   ITR   555-9812  90000   M    F      T     T    T
37     Jones     Mary    FINC  555-1234  56000   F    T      F     T    F

Languages Table
NAME   FULLNAME
COBOL  COmmon Business Oriented Language
SQL    Structured Query Language
JAVA   JAVA
VB     Visual Basic
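Schema 4's weakness is extensibility of the language domain: a new language is a new column. The following sketch illustrates this with sqlite3; the `knows_*` column names and the added language (Python) are hypothetical and not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        empid INTEGER PRIMARY KEY,
        knows_cobol INTEGER, knows_java INTEGER,
        knows_sql INTEGER, knows_vb INTEGER   -- 1 = T, 0 = F
    )
""")
conn.execute("INSERT INTO employees VALUES (23, 1, 1, 1, 0)")

# Supporting one more language means altering the table definition itself.
conn.execute("ALTER TABLE employees ADD COLUMN knows_python INTEGER DEFAULT 0")

columns = [row[1] for row in conn.execute("PRAGMA table_info(employees)")]
row_23 = conn.execute("SELECT * FROM employees WHERE empid = 23").fetchone()
print(columns)
print(row_23)
```

Under Schema 1 the same change would be a single INSERT into the Languages table, with no change to the table structure or to existing queries.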
2nd Normal Form
Second Normal Form is violated if:
• First Normal Form is violated
• There exists a non-key field which is functionally dependent on a partial key (only part of a composite key).
partial key → non-key
2NF Example – Raw Data
JE #1  02-JAN-2003
    100 Cash                   20,000
        310 Smith-Capital              20,000
    (owner investment)
JE #2  03-JAN-2003
    100 Cash                   30,000
        220 Notes Payable              30,000
    (borrowed money)
JE #3  03-JAN-2003
    120 Supplies                5,000
        100 Cash                        1,000
        220 Notes Payable               4,000
    (purchased supplies)
2NF Example – Violation
Transactions Table
JENO  LINENO  DESCRIPTION         ACCTNO  ACCTNAME       AMOUNT    DATE
1     1       Owner investment    100     Cash           20,000    02-JAN-2003
1     2       Owner investment    310     Smith-Capital  (20,000)  02-JAN-2003
2     1       Borrowed money      100     Cash           30,000    03-JAN-2003
2     2       Borrowed money      220     Notes Payable  (30,000)  03-JAN-2003
3     1       Purchased Supplies  120     Supplies       5,000     03-JAN-2003
3     2       Purchased Supplies  100     Cash           (1,000)   03-JAN-2003
3     3       Purchased Supplies  220     Notes Payable  (4,000)   03-JAN-2003
Is there a non-key field which is functionally dependent on a partial key?
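The question can be explored mechanically on the sample rows. The sketch below uses a hypothetical helper, `fd_holds`, that reports whether a functional dependency is contradicted by the data; credits are written as negative numbers. Sample data can only refute a dependency, never prove it, so a True result is a hint that still has to be confirmed against the business rules.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

COLUMNS = ("JENO", "LINENO", "DESCRIPTION", "ACCTNO", "ACCTNAME", "AMOUNT", "DATE")
DATA = [
    (1, 1, "Owner investment", 100, "Cash", 20000, "02-JAN-2003"),
    (1, 2, "Owner investment", 310, "Smith-Capital", -20000, "02-JAN-2003"),
    (2, 1, "Borrowed money", 100, "Cash", 30000, "03-JAN-2003"),
    (2, 2, "Borrowed money", 220, "Notes Payable", -30000, "03-JAN-2003"),
    (3, 1, "Purchased Supplies", 120, "Supplies", 5000, "03-JAN-2003"),
    (3, 2, "Purchased Supplies", 100, "Cash", -1000, "03-JAN-2003"),
    (3, 3, "Purchased Supplies", 220, "Notes Payable", -4000, "03-JAN-2003"),
]
transactions = [dict(zip(COLUMNS, row)) for row in DATA]

print(fd_holds(transactions, ["JENO"], ["DESCRIPTION", "DATE"]))  # True: partial key -> non-key
print(fd_holds(transactions, ["JENO"], ["AMOUNT"]))               # False
print(fd_holds(transactions, ["JENO", "LINENO"], ["AMOUNT"]))     # True: depends on the full key
```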
2NF Example – Violation: FDs that indicate violation of 2NF
• The primary key of the Transactions table is (JENO, LINENO).
• JENO → DESCRIPTION
• JENO → DATE
DESCRIPTION and DATE depend on JENO alone, which is only part of the key.
2NF Example – Corrected
Transactions Table
JENO  LINENO  ACCTNO  ACCTNAME       AMOUNT
1     1       100     Cash           20,000
1     2       310     Smith-Capital  (20,000)
2     1       100     Cash           30,000
2     2       220     Notes Payable  (30,000)
3     1       120     Supplies       5,000
3     2       100     Cash           (1,000)
3     3       220     Notes Payable  (4,000)

Journal_Entry Table
JENO  DESCRIPTION         DATE
1     Owner investment    02-JAN-2003
2     Borrowed money      03-JAN-2003
3     Purchased Supplies  03-JAN-2003
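A quick way to check that this decomposition loses nothing is to project the original rows into the two tables and join them back on JENO. The sketch below does this with plain Python sets; the variable names are illustrative, and credits are written as negative numbers.

```python
# Rows of the unnormalized table:
# (JENO, LINENO, DESCRIPTION, ACCTNO, ACCTNAME, AMOUNT, DATE)
original = [
    (1, 1, "Owner investment", 100, "Cash", 20000, "02-JAN-2003"),
    (1, 2, "Owner investment", 310, "Smith-Capital", -20000, "02-JAN-2003"),
    (2, 1, "Borrowed money", 100, "Cash", 30000, "03-JAN-2003"),
    (2, 2, "Borrowed money", 220, "Notes Payable", -30000, "03-JAN-2003"),
    (3, 1, "Purchased Supplies", 120, "Supplies", 5000, "03-JAN-2003"),
    (3, 2, "Purchased Supplies", 100, "Cash", -1000, "03-JAN-2003"),
    (3, 3, "Purchased Supplies", 220, "Notes Payable", -4000, "03-JAN-2003"),
]

# Project onto the two corrected tables; sets drop the repeated facts.
journal_entry = {(jeno, desc, date) for jeno, _, desc, _, _, _, date in original}
transactions = {(jeno, line, acct, name, amt) for jeno, line, _, acct, name, amt, _ in original}

# Join the projections back together on JENO.
rejoined = {
    (jeno, line, desc, acct, name, amt, date)
    for jeno, line, acct, name, amt in transactions
    for je_jeno, desc, date in journal_entry
    if jeno == je_jeno
}

print(len(journal_entry), len(transactions))  # 3 7
print(rejoined == set(original))              # True
```

Each description and date is now stored once per journal entry instead of once per line.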
3rd Normal Form
Third Normal Form is violated if:
• Second Normal Form is violated
• There exists a non-key field which is functionally dependent on another non-key field.
non-key → non-key
Note: A candidate key is not a non-key field.
3NF Example – Violation
Transactions Table
JENO  LINENO  ACCTNO  ACCTNAME       AMOUNT
1     1       100     Cash           20,000
1     2       310     Smith-Capital  (20,000)
2     1       100     Cash           30,000
2     2       220     Notes Payable  (30,000)
3     1       120     Supplies       5,000
3     2       100     Cash           (1,000)
3     3       220     Notes Payable  (4,000)

Journal_Entry Table
JENO  DESCRIPTION         DATE
1     Owner investment    02-JAN-2003
2     Borrowed money      03-JAN-2003
3     Purchased Supplies  03-JAN-2003
Are there any non-key fields which functionally determine another non-key field?
Are there any redundant facts?
3NF Example – Violation: FD that indicates violation of 3NF
ACCTNO → ACCTNAME (in the Transactions table, one non-key field determines another)
Anomalies if not corrected:
• Update: if the name of account 100 changes, it must be changed in multiple places, risking inconsistency.
• Deletion: can't delete JE #3 and its transactions without losing information about account 120.
• Insertion: can't set up a new account, Jones-Capital, for a new partner unless we first have a transaction involving that account.
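The update and deletion anomalies can be reproduced on the sample rows. In this sketch the new account name "Cash in Bank" is a hypothetical value invented for the illustration, and the rows are abbreviated to (JENO, LINENO, ACCTNO, ACCTNAME).

```python
# Transactions table before the 3NF correction.
transactions = [
    [1, 1, 100, "Cash"], [1, 2, 310, "Smith-Capital"],
    [2, 1, 100, "Cash"], [2, 2, 220, "Notes Payable"],
    [3, 1, 120, "Supplies"], [3, 2, 100, "Cash"], [3, 3, 220, "Notes Payable"],
]

# Update anomaly: the rename reaches only the first row for account 100.
for row in transactions:
    if row[2] == 100:
        row[3] = "Cash in Bank"
        break
names_for_100 = {row[3] for row in transactions if row[2] == 100}
print(sorted(names_for_100))  # ['Cash', 'Cash in Bank']: two names for one account

# Deletion anomaly: removing JE #3 removes the only record of account 120.
remaining = [row for row in transactions if row[0] != 3]
accounts_left = {row[2] for row in remaining}
print(120 in accounts_left)   # False
```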
3NF Example – Corrected
Transactions Table
JENO  LINENO  ACCTNO  AMOUNT
1     1       100     20,000
1     2       310     (20,000)
2     1       100     30,000
2     2       220     (30,000)
3     1       120     5,000
3     2       100     (1,000)
3     3       220     (4,000)

Journal_Entry Table
JENO  DESCRIPTION         DATE
1     Owner investment    02-JAN-2003
2     Borrowed money      03-JAN-2003
3     Purchased Supplies  03-JAN-2003

Accounts Table
ACCTNO  ACCTNAME
100     Cash
310     Smith-Capital
220     Notes Payable
120     Supplies
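As a rough illustration of how the corrected design avoids the three anomalies, the three tables can be declared in SQLite with foreign keys. The lower-case identifiers, the column name `entry_date`, the account number 320 for Jones-Capital, and the new name "Cash in Bank" are all assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE accounts (acctno INTEGER PRIMARY KEY, acctname TEXT NOT NULL);
CREATE TABLE journal_entry (jeno INTEGER PRIMARY KEY, description TEXT, entry_date TEXT);
CREATE TABLE transactions (
    jeno   INTEGER REFERENCES journal_entry(jeno),
    lineno INTEGER,
    acctno INTEGER REFERENCES accounts(acctno),
    amount INTEGER,
    PRIMARY KEY (jeno, lineno)
);
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [
    (100, "Cash"), (310, "Smith-Capital"), (220, "Notes Payable"), (120, "Supplies"),
])
conn.executemany("INSERT INTO journal_entry VALUES (?, ?, ?)", [
    (1, "Owner investment", "02-JAN-2003"),
    (2, "Borrowed money", "03-JAN-2003"),
    (3, "Purchased Supplies", "03-JAN-2003"),
])
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", [
    (1, 1, 100, 20000), (1, 2, 310, -20000),
    (2, 1, 100, 30000), (2, 2, 220, -30000),
    (3, 1, 120, 5000), (3, 2, 100, -1000), (3, 3, 220, -4000),
])

# Insertion: a new account needs no transaction (320 is a made-up number).
conn.execute("INSERT INTO accounts VALUES (320, 'Jones-Capital')")

# Update: the account name lives in exactly one row.
conn.execute("UPDATE accounts SET acctname = 'Cash in Bank' WHERE acctno = 100")
names_for_100 = conn.execute("""
    SELECT DISTINCT a.acctname
    FROM transactions t JOIN accounts a ON a.acctno = t.acctno
    WHERE t.acctno = 100
""").fetchall()

# Deletion: removing JE #3 leaves account 120 in place.
conn.execute("DELETE FROM transactions WHERE jeno = 3")
conn.execute("DELETE FROM journal_entry WHERE jeno = 3")
supplies = conn.execute("SELECT acctname FROM accounts WHERE acctno = 120").fetchone()

print(names_for_100, supplies)
```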
3NF Example – Corrected: Final Dependencies
• Transactions: (JENO, LINENO) → ACCTNO, AMOUNT
• Journal_Entry: JENO → DESCRIPTION, DATE
• Accounts: ACCTNO → ACCTNAME
All non-key fields are FD on the PK and only the PK.
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form is violated if:
• Third Normal Form is violated
• There exists a partial key which is functionally dependent on a non-key field.
non-key → partial key
BCNF Example – Semantics
• A student can have more than one major
• A student has a different advisor for each major.
• Each advisor advises for only one major.
BCNF Example – Violation
Student_Majors Table
SID  MAJOR             ADVISOR
1    PHYSICS           EINSTEIN
1    BIOLOGY           LIVINGSTON
2    PHYSICS           BOHR
2    COMPUTER SCIENCE  CODD
3    PHYSICS           EINSTEIN
4    BIOLOGY           LIVINGSTON
4    ACCOUNTING        PACIOLI
5    PHYSICS           EINSTEIN
6    PHYSICS           BOHR
6    BIOLOGY           DARWIN
7    COMPUTER SCIENCE  CODD
7    BIOLOGY           DARWIN
Does this relation violate third normal form?
Are there any redundant facts?
BCNF Example – Violation: FD that violates BCNF
ADVISOR → MAJOR (each advisor advises for only one major, and MAJOR is part of the key (SID, MAJOR))
It is important that you convince yourself that MAJOR does not functionally determine ADVISOR: PHYSICS, for example, appears with both EINSTEIN and BOHR.
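One way to convince yourself is to group the sample rows in both directions. The sketch below is only a check against the sample data, with illustrative variable names; the semantics slide remains the real source of the dependency.

```python
from collections import defaultdict

# (SID, MAJOR, ADVISOR) rows of the Student_Majors table.
student_majors = [
    (1, "PHYSICS", "EINSTEIN"), (1, "BIOLOGY", "LIVINGSTON"),
    (2, "PHYSICS", "BOHR"), (2, "COMPUTER SCIENCE", "CODD"),
    (3, "PHYSICS", "EINSTEIN"), (4, "BIOLOGY", "LIVINGSTON"),
    (4, "ACCOUNTING", "PACIOLI"), (5, "PHYSICS", "EINSTEIN"),
    (6, "PHYSICS", "BOHR"), (6, "BIOLOGY", "DARWIN"),
    (7, "COMPUTER SCIENCE", "CODD"), (7, "BIOLOGY", "DARWIN"),
]

advisors_by_major = defaultdict(set)
majors_by_advisor = defaultdict(set)
for sid, major, advisor in student_majors:
    advisors_by_major[major].add(advisor)
    majors_by_advisor[advisor].add(major)

# ADVISOR -> MAJOR: every advisor maps to exactly one major.
print(all(len(majors) == 1 for majors in majors_by_advisor.values()))  # True

# MAJOR does not determine ADVISOR: some majors have several advisors.
print(sorted(advisors_by_major["PHYSICS"]))  # ['BOHR', 'EINSTEIN']
```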
BCNF Example – Corrected
Student_Advisors Table
SID  ADVISOR
1    EINSTEIN
1    LIVINGSTON
2    BOHR
2    CODD
3    EINSTEIN
4    LIVINGSTON
4    PACIOLI
5    EINSTEIN
6    BOHR
6    DARWIN
7    CODD
7    DARWIN

Advisors Table
MAJOR             ADVISOR
PHYSICS           EINSTEIN
BIOLOGY           LIVINGSTON
PHYSICS           BOHR
COMPUTER SCIENCE  CODD
ACCOUNTING        PACIOLI
BIOLOGY           DARWIN
Note that if the original key in schema 1 had, counter-intuitively, been defined as SID & ADVISOR, this would have been a 2NF violation: the non-key field MAJOR would depend on ADVISOR, a partial key.
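The corrected tables can be checked by loading them into SQLite and joining on ADVISOR, which should give back every original (SID, MAJOR, ADVISOR) row. This is a sketch with assumed lower-case identifiers, not a prescribed implementation.

```python
import sqlite3

student_majors = [
    (1, "PHYSICS", "EINSTEIN"), (1, "BIOLOGY", "LIVINGSTON"),
    (2, "PHYSICS", "BOHR"), (2, "COMPUTER SCIENCE", "CODD"),
    (3, "PHYSICS", "EINSTEIN"), (4, "BIOLOGY", "LIVINGSTON"),
    (4, "ACCOUNTING", "PACIOLI"), (5, "PHYSICS", "EINSTEIN"),
    (6, "PHYSICS", "BOHR"), (6, "BIOLOGY", "DARWIN"),
    (7, "COMPUTER SCIENCE", "CODD"), (7, "BIOLOGY", "DARWIN"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE advisors (advisor TEXT PRIMARY KEY, major TEXT NOT NULL);
CREATE TABLE student_advisors (
    sid     INTEGER,
    advisor TEXT REFERENCES advisors(advisor),
    PRIMARY KEY (sid, advisor)
);
""")
# OR IGNORE skips advisors already stored, so each advisor/major fact appears once.
conn.executemany("INSERT OR IGNORE INTO advisors VALUES (?, ?)",
                 [(advisor, major) for _, major, advisor in student_majors])
conn.executemany("INSERT INTO student_advisors VALUES (?, ?)",
                 [(sid, advisor) for sid, _, advisor in student_majors])

rejoined = conn.execute("""
    SELECT sa.sid, a.major, sa.advisor
    FROM student_advisors sa JOIN advisors a ON a.advisor = sa.advisor
""").fetchall()
advisor_count = conn.execute("SELECT COUNT(*) FROM advisors").fetchone()[0]

print(advisor_count)                         # 6
print(set(rejoined) == set(student_majors))  # True
```

Because ADVISOR is the primary key of the Advisors table, the rule that each advisor advises for only one major is now enforced by the schema itself.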
4th Normal Form
4th Normal Form is violated if:
• Boyce-Codd Normal Form is violated
• There exists a partial key which has multiple independent multi-valued dependencies to other partial keys.
partial-key1 →→ partial-key2
partial-key1 →→ partial-key3
4NF Example – Violation
Instruments_Languages
Name  Language  Instrument
Fred  French    Piano
Fred  Italian   Flute
Fred  Spanish   Flute
Jane  French    Piano
Jane  French    Oboe
Sam   French    Piano
Sam   Spanish   Oboe
Sam   Spanish   Flute
Does this relation violate 1st, 2nd, 3rd, or BCNF?
Are there any redundant facts?
4NF Example – Correction
LanguagesSpoken
Name  Language
Fred  French
Fred  Italian
Fred  Spanish
Jane  French
Sam   French
Sam   Spanish

InstrumentsPlayed
Name  Instrument
Fred  Piano
Fred  Flute
Jane  Piano
Jane  Oboe
Sam   Piano
Sam   Oboe
Sam   Flute
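The two corrected tables are the projections of Instruments_Languages onto (Name, Language) and (Name, Instrument). A minimal sketch of that projection, using Python sets to drop the repeated facts (variable names are illustrative):

```python
# (Name, Language, Instrument) rows of the Instruments_Languages table.
instruments_languages = [
    ("Fred", "French", "Piano"), ("Fred", "Italian", "Flute"),
    ("Fred", "Spanish", "Flute"), ("Jane", "French", "Piano"),
    ("Jane", "French", "Oboe"), ("Sam", "French", "Piano"),
    ("Sam", "Spanish", "Oboe"), ("Sam", "Spanish", "Flute"),
]

# Project onto each independent fact type and remove duplicates.
languages_spoken = sorted({(name, lang) for name, lang, _ in instruments_languages})
instruments_played = sorted({(name, inst) for name, _, inst in instruments_languages})

# "Jane speaks French" was stored twice in the combined table, once here.
jane_french_before = sum(1 for name, lang, _ in instruments_languages
                         if (name, lang) == ("Jane", "French"))
jane_french_after = languages_spoken.count(("Jane", "French"))

print(len(languages_spoken), len(instruments_played))  # 6 7
print(jane_french_before, jane_french_after)           # 2 1
```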