How to Fake a Database Design
How do I spell “normalization”?
OSCON 2014
Curtis "Ovid" Poe
http://allaroundtheworld.fr/
Copyright 2014, http://www.allaroundtheworld.fr/
April 7, 2023 Copyright 2014, http://www.allaroundtheworld.fr/
Good Database Schemas
• Generally normalized
• Denormalized only as necessary
• No duplicate data
Typical Developer Schemas
• A steaming pile of ones and zeros
• … with a “family friendly” background
Source: http://commons.wikimedia.org/wiki/File:Spaghetti-prepared.jpg
Database Normalization
• Remove redundancy
• Create logical relations
• Decompose data into atomic elements
Only Covering 3NF
1. Remove repeating groups of data
2. Remove partial key dependencies
3. Remove data unrelated to key
How to Feel Stupid
“It is shown that if a relation schema is in third normal form and every key is simple, then it is in projection-join normal form (sometimes called fifth normal form), the ultimate normal form with respect to projections and joins.”
“Simple Conditions for Guaranteeing Higher Normal Forms in Relational Databases”, C. J. Date and Ronald Fagin
http://commons.wikimedia.org/wiki/File:%22I_should_have_gone_to_the_pro_station%22_-_NARA_-_514564.tif
‘Nuff of that – Let’s Get Started
I’m going to discuss “how”, not “why”, because I only have 50 minutes.
Faking a Database Design
• Forget everything you know about Excel
• Focus on nouns (sort of)
• Duplicate data is a design flaw
Real-World Problem
• Client wanted a rewrite of a recipes site
• They sent us their Access (!) database
• Main objects:
  – customers
  – recipes
  – orders
Our “DBA” Said This Was OK
Our “DBA” also lost his job shortly thereafter
Back to the plot …
• Customers
• Orders
• Recipes
Nouns == Tables(*)
Rule #1
1. Nouns == tables
What’s with the customer_id?
It’s a foreign key
One-to-many relationship
Our DDL (Data Definition Language)
CREATE TABLE orders (
    order_id    SERIAL PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    order_date  TIMESTAMP WITH TIME ZONE NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
Rule #2
1. Nouns == tables
2. Another table’s ID must have a FK constraint
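What the FK constraint buys you can be sketched as follows. This is only an illustration: it assumes Python's built-in sqlite3 module as a stand-in for the PostgreSQL shown on the slides, and the customers columns and sample rows are invented. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores FK constraints unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL          -- hypothetical column
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        order_date  TEXT NOT NULL,
        FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
    );
""")
conn.execute("INSERT INTO customers (customer_id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders (customer_id, order_date) VALUES (1, '2014-07-22')")

# An order for a customer that does not exist is refused by the database.
try:
    conn.execute("INSERT INTO orders (customer_id, order_date) VALUES (99, '2014-07-22')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT count(*) FROM orders").fetchone()[0]
print(orphan_rejected, order_count)
```

Without the constraint, the second insert would succeed and leave an order pointing at nobody.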
Oh dog, no!
But “What if”?
1. fettuccinne
2. fettuchini
3. fettucini
4. fettucinne
5. fetuchine
6. fetuchinney
7. fetuchinni
8. fetucine
9. fetucini
10. fetucinni
https://www.flickr.com/photos/ykjc9/3485366680/sizes/l
Searching
SELECT recipe_id, name
  FROM recipes
 WHERE ingredient1 IN ('fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                       'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni')
    OR ingredient2 IN ('fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                       'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni')
    OR ingredient3 IN ('fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                       'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni')
    OR ingredient4 IN ('fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                       'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni')
    OR ingredient5 IN ('fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                       'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni')
    OR ingredient6 IN ('fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                       'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni')
    OR ingredient7 IN ('fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                       'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni')
    OR ingredient8 IN ('fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                       'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni');
It’s “fettuccine”, in case you were wondering
Searching
SELECT recipe_id, name
  FROM recipes
 WHERE ingredient1 = 'fettuccine'
    OR ingredient2 = 'fettuccine'
    OR ingredient3 = 'fettuccine'
    OR ingredient4 = 'fettuccine'
    OR ingredient5 = 'fettuccine'
    OR ingredient6 = 'fettuccine'
    OR ingredient7 = 'fettuccine'
    OR ingredient8 = 'fettuccine';
Ingredients Table
Rule #3
1. Nouns == tables
2. Another table’s ID must have a FK constraint
3. Lists of things get their own table
Lookup Table
Many-to-many relationship
Searching
SELECT r.recipe_id, r.name
  FROM recipes r
  JOIN recipes_ingredients ri ON ri.recipe_id    = r.recipe_id
  JOIN ingredients i          ON i.ingredient_id = ri.ingredient_id
 WHERE i.name = 'fettuccine';
Our DDL (Data Definition Language)
CREATE TABLE recipes_ingredients (
    recipe_ingredient_id SERIAL PRIMARY KEY,
    recipe_id            INTEGER NOT NULL,
    ingredient_id        INTEGER NOT NULL,
    UNIQUE (recipe_id, ingredient_id),
    FOREIGN KEY (recipe_id)     REFERENCES recipes(recipe_id),
    FOREIGN KEY (ingredient_id) REFERENCES ingredients(ingredient_id)
);
Our DDL (Data Definition Language)
CREATE TABLE recipes_ingredients (
    recipe_id     INTEGER NOT NULL,
    ingredient_id INTEGER NOT NULL,
    PRIMARY KEY (recipe_id, ingredient_id),
    FOREIGN KEY (recipe_id)     REFERENCES recipes(recipe_id),
    FOREIGN KEY (ingredient_id) REFERENCES ingredients(ingredient_id)
);
Rule #4
1. Nouns == tables
2. Another table’s ID must have a FK constraint
3. Lists of things get their own table
4. Many-to-many == lookup table (with FKs)
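Rules #3 and #4 can be tried end to end with a small sketch. It assumes Python's built-in sqlite3 module rather than PostgreSQL, and the recipe and ingredient rows are made up; the recipes_using helper is hypothetical and exists only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE recipes (
        recipe_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE ingredients (
        ingredient_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL UNIQUE   -- one row, one spelling
    );
    CREATE TABLE recipes_ingredients (
        recipe_id     INTEGER NOT NULL,
        ingredient_id INTEGER NOT NULL,
        PRIMARY KEY (recipe_id, ingredient_id),
        FOREIGN KEY (recipe_id)     REFERENCES recipes(recipe_id),
        FOREIGN KEY (ingredient_id) REFERENCES ingredients(ingredient_id)
    );
    INSERT INTO recipes     VALUES (1, 'Fettuccine Alfredo'), (2, 'Garlic Bread');
    INSERT INTO ingredients VALUES (1, 'fettuccine'), (2, 'butter'), (3, 'garlic');
    INSERT INTO recipes_ingredients VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

def recipes_using(ingredient_name):
    """Return the names of all recipes that contain the given ingredient."""
    rows = conn.execute("""
        SELECT r.name
          FROM recipes r
          JOIN recipes_ingredients ri ON ri.recipe_id    = r.recipe_id
          JOIN ingredients i          ON i.ingredient_id = ri.ingredient_id
         WHERE i.name = ?
         ORDER BY r.name
    """, (ingredient_name,)).fetchall()
    return [name for (name,) in rows]

print(recipes_using("fettuccine"))
print(recipes_using("butter"))
```

The search no longer cares how many ingredients a recipe has, and each ingredient is spelled in exactly one place.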
So How Do We Order Recipes?
Orders With Recipes
How Many of Which Ingredient?
Our simple “customers”, “orders”, and “recipes” database has grown to seven tables.
And it will keep growing.
So Far
• Every noun has its own table (*)
• Lookup tables join related tables
• And generally have some kind of unique constraint
• Other tables’ IDs have foreign key constraints
Database Tips
• We’ve covered the main rules
• They only cover structure
• Now to dive deeper
Equality ≠ Identity
• No duplication == not duplicating identity
• Are identical twins the same person?
• Are two guys named “John” the same guy?
• This is important and easy to get wrong
• For example …
How do you get the total of an order?
• Assume each recipe has a price
• Store total in the order? (hint: no)
• Store price on the recipe? (hint: yes)
• Is that enough?
Orders Total
Calculating the Order Total?
SELECT o.order_id, sum(r.price)
  FROM orders o
  JOIN orders_recipes orr ON orr.order_id = o.order_id
  JOIN recipes r          ON r.recipe_id  = orr.recipe_id
 GROUP BY o.order_id;
What if the price changes?
Orders Total
Calculating the Order Total
SELECT o.order_id, sum(orr.price)
  FROM orders o
  JOIN orders_recipes orr ON orr.order_id = o.order_id
 GROUP BY o.order_id;
Equality is not Identity
• Order item price isn’t item price
• What if the item price changes?
• What if you give a discount on the order item?
• A subtle, but common bug
Rule #5
1. Nouns == tables
2. Another table’s ID must have a FK constraint
3. Lists of things get their own table
4. Many-to-many == lookup table (with FKs)
5. Watch for equal values that aren’t identical
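A small sketch of why Rule #5 matters, assuming Python's built-in sqlite3 module in place of PostgreSQL. Prices are stored as integer cents and the rows are invented; the two helper functions are hypothetical names used only here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipes (
        recipe_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        price     INTEGER NOT NULL       -- current price, in cents
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY
    );
    CREATE TABLE orders_recipes (
        order_id  INTEGER NOT NULL REFERENCES orders(order_id),
        recipe_id INTEGER NOT NULL REFERENCES recipes(recipe_id),
        price     INTEGER NOT NULL,      -- price charged on this order, in cents
        PRIMARY KEY (order_id, recipe_id)
    );
    INSERT INTO recipes VALUES (1, 'Fettuccine Alfredo', 500), (2, 'Garlic Bread', 300);
    INSERT INTO orders  VALUES (1);
    -- Copy the price at the moment the order is placed.
    INSERT INTO orders_recipes SELECT 1, recipe_id, price FROM recipes;
""")

def total_from_recipes(order_id):
    """Order total computed from the recipes' current prices."""
    return conn.execute("""
        SELECT sum(r.price)
          FROM orders_recipes orr
          JOIN recipes r ON r.recipe_id = orr.recipe_id
         WHERE orr.order_id = ?""", (order_id,)).fetchone()[0]

def total_from_order_lines(order_id):
    """Order total computed from the prices recorded on the order."""
    return conn.execute(
        "SELECT sum(price) FROM orders_recipes WHERE order_id = ?", (order_id,)
    ).fetchone()[0]

# The recipe gets more expensive after the order was placed.
conn.execute("UPDATE recipes SET price = 650 WHERE recipe_id = 1")
live_total = total_from_recipes(1)
charged_total = total_from_order_lines(1)
print(live_total, charged_total)
```

The two prices were equal on the day of the order, but only the one stored with the order still says what the customer paid.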
Naming
• Names are important
• Identical columns should have identical names
• Names should hint at use
Bad Naming
SELECT name, 'too cold' FROM areas WHERE temperature < 32;
ID Names
orders.order_id
versus
orders.id
ID Names
SELECT o.id, sum(r.price)
  FROM orders o
  JOIN orders_recipes orr ON orr.order_id = o.id
  JOIN recipes r          ON r.id = o.id    -- bug: joins a recipe ID to an order ID
 GROUP BY o.id;
Conceptually Similar to …
SELECT name FROM customer WHERE id > weight;
ID Names
SELECT thread.*
  FROM email thread
  JOIN email selected      ON selected.id    = thread.id
  JOIN character recipient ON recipient.id   = thread.recipient_id
  JOIN station_area sa     ON sa.id          = recipient.id
  JOIN station st          ON st.id          = sa.id
  JOIN star origin         ON origin.id      = thread.id
  JOIN star destination    ON destination.id = st.id
  LEFT JOIN route
    ON ( route.from_id = origin.id
         AND route.to_id = destination.id )
 WHERE selected.id = ?
   AND ( thread.sender_id = ?
         OR ( thread.recipient_id = ?
              AND ( origin.id = destination.id
                    OR ( route.distance IS NOT NULL
                         AND now() >= thread.datesent
                             + ( route.distance * interval '30 seconds' ) ))))
 ORDER BY datesent ASC, thread.parent_id ASC NULLS FIRST
Rule #6
1. Nouns == tables
2. Another table’s ID must have a FK constraint
3. Lists of things get their own table
4. Many-to-many == lookup table (with FKs)
5. Watch for equal values that aren’t identical
6. Name columns as descriptively as possible
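To see how quietly a bare id column can go wrong, here is a sketch assuming Python's built-in sqlite3 module and invented rows (prices in cents). The schema deliberately uses id everywhere, as in the "ID Names" queries, and the database accepts the mistaken join without complaint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders  (id INTEGER PRIMARY KEY);
    CREATE TABLE recipes (id INTEGER PRIMARY KEY, price INTEGER NOT NULL);
    CREATE TABLE orders_recipes (
        order_id  INTEGER NOT NULL REFERENCES orders(id),
        recipe_id INTEGER NOT NULL REFERENCES recipes(id),
        PRIMARY KEY (order_id, recipe_id)
    );
    INSERT INTO orders  VALUES (1);
    INSERT INTO recipes VALUES (1, 500), (2, 300);
    INSERT INTO orders_recipes VALUES (1, 1), (1, 2);  -- order 1 holds both recipes
""")

# Mistake: r.id = o.id compares a recipe ID with an order ID.
# It runs fine and returns a plausible-looking number.
buggy = conn.execute("""
    SELECT o.id, sum(r.price)
      FROM orders o
      JOIN orders_recipes orr ON orr.order_id = o.id
      JOIN recipes r          ON r.id = o.id
     GROUP BY o.id
""").fetchall()

# Intended join: the recipe is the one named on the order line.
correct = conn.execute("""
    SELECT o.id, sum(r.price)
      FROM orders o
      JOIN orders_recipes orr ON orr.order_id = o.id
      JOIN recipes r          ON r.id = orr.recipe_id
     GROUP BY o.id
""").fetchall()

print(buggy, correct)
```

With columns named order_id and recipe_id, the same mistake would read r.recipe_id = o.order_id, which at least looks wrong on the page.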
Summary
• Nouns == tables (*)
• FK constraints
• Proper naming is important
• Your DBAs will thank you
• Your apps will be more robust
?
http://www.slideshare.net/ovid/
Bonus Slides!
Super-duper important stuff I wasn’t sure I had time to cover because it’s
going to make your head hurt.
Avoid NULL Values
• Every column should have a type
• NULLs, by definition, are unknown values
• Thus, their type is unknown
• But … every column should have a type?
Our employees Table
CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    name        CHARACTER VARYING(255) NOT NULL,
    salary      MONEY NULL
);
Giving Bonuses
• $1,000 bonus to all employees
• … if they make less than $40,000/year
Get Employees For Bonus
SELECT employee_id, name FROM employees WHERE salary < 40000;
Bad SQL
• Won’t return anyone with a NULL salary
• Why is the salary NULL?
  – What if it’s confidential?
  – What if they’re a contractor and in that table?
  – What if they’re an unpaid slave intern?
  – What if it’s unknown when the data was entered?
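The missing employee is easy to reproduce. This sketch assumes Python's built-in sqlite3 module, uses INTEGER where the slide uses MONEY, and the three employees are invented; names is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        salary      INTEGER NULL
    );
    INSERT INTO employees (name, salary) VALUES
        ('Alice', 35000),
        ('Bob',   90000),
        ('Carol', NULL);
""")

def names(sql):
    """Run a query that selects one column and return it as a list."""
    return [name for (name,) in conn.execute(sql)]

bonus    = names("SELECT name FROM employees WHERE salary < 40000 ORDER BY name")
no_bonus = names("SELECT name FROM employees WHERE NOT (salary < 40000) ORDER BY name")
unknown  = names("SELECT name FROM employees WHERE salary IS NULL ORDER BY name")
print(bonus, no_bonus, unknown)
```

Carol appears in neither the bonus list nor its negation; only an explicit IS NULL test finds her, and it still cannot say why her salary is missing.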
NULLs tell you nothing
suppliers table
supplier_id   city
s1            'London'

parts table
part_id       city
p1            NULL

Example via “Database In Depth” by C.J. Date
NULLs tell you nothing
parts table
part_id       city
p1            NULL

Example via “Database In Depth” by C.J. Date

SELECT part_id FROM parts;                    -- returns p1
SELECT part_id FROM parts WHERE city = city;  -- returns no rows: NULL = NULL is not true
NULLs tell you nothing
suppliers table
supplier_id   city
s1            'London'

parts table
part_id       city
p1            NULL

Example via “Database In Depth” by C.J. Date

SELECT s.supplier_id, p.part_id
  FROM suppliers s, parts p
 WHERE p.city <> s.city     -- can’t compare NULL
    OR p.city <> 'Paris';   -- can’t compare NULL
NULLs tell you lies
Example via “Database In Depth” by C.J. Date
SELECT s.supplier_id, p.part_id
  FROM suppliers s, parts p
 WHERE p.city <> s.city     -- can’t compare NULL
    OR p.city <> 'Paris';   -- can’t compare NULL

• We get no rows because we can’t compare a NULL city
• The unknown city is Paris or it isn’t
• If it’s Paris, the first condition is true
• If it’s not Paris, the second condition is true
• Thus, the WHERE clause must be true, but it’s not
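The suppliers and parts example can be run as written. This sketch assumes Python's built-in sqlite3 module instead of PostgreSQL; the column types are a guess, and the data is the single row per table shown on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE parts     (part_id     TEXT PRIMARY KEY, city TEXT);
    INSERT INTO suppliers VALUES ('s1', 'London');
    INSERT INTO parts     VALUES ('p1', NULL);
""")

# Every part, no filter.
all_parts = conn.execute("SELECT part_id FROM parts").fetchall()

# "Parts whose city equals itself": NULL = NULL evaluates to NULL, not true.
same_city = conn.execute("SELECT part_id FROM parts WHERE city = city").fetchall()

# Logically one of the two conditions has to hold, yet both evaluate to NULL.
pairs = conn.execute("""
    SELECT s.supplier_id, p.part_id
      FROM suppliers s, parts p
     WHERE p.city <> s.city
        OR p.city <> 'Paris'
""").fetchall()

print(all_parts, same_city, pairs)
```

The part exists, but it vanishes from any query whose WHERE clause touches its NULL city.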
Rule #7
1. Nouns == tables
2. Another table’s ID must have an FK constraint
3. Lists of things get their own table
4. Many-to-many == lookup table (with FKs)
5. Watch for equal values that aren’t identical
6. Name columns as descriptively as possible
7. Avoid NULL columns like the plague