Introduction to Database Management Systems (DBMS)

Database Management System (DBMS)
Definitions:
• Data: Known facts that can be recorded and that have implicit meaning
• Database: Collection of related data. Ex. the names, telephone numbers and addresses of all the people you know
• Database Management System: A computerized record-keeping system

DBMS (Contd.)
Goals of a Database Management System:
• To provide an efficient as well as convenient environment for accessing data in a database
• To enforce information security: database security, concurrency control, crash recovery
It is a general-purpose facility for:
• Defining a database
• Constructing a database
• Manipulating a database

Benefits of database approach
• Redundancy can be reduced
• Inconsistency can be avoided
• Data can be shared
• Standards can be enforced
• Security restrictions can be applied
• Integrity can be maintained
• Data independence can be provided

DBMS Functions
• Data Definition
• Data Manipulation
• Data Security and Integrity
• Data Recovery and Concurrency
• Data Dictionary
• Performance

Database System
[Figure: Components of a database system. Users and application programs/queries interact with the DBMS software (software to process queries/programs and software to access stored data), which works on the stored database definition (meta-data) and the stored database.]
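
Database System
[Figure: Query processing path. A user query (Q1) and an application program query (Q2) pass through the query processor, and the database scheme passes through the DDL compiler; the compiled query (Q2) and the database description go to the database manager, which uses the file manager to reach the physical database.]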

Categories of Data Models
• Conceptual
• Physical
• Representational
Data Model:
• A set of concepts used to describe the structure of a database
• By structure, we mean the data types, relationships, and constraints that should hold for the data

Database Architecture
• External level (individual user views)
• Conceptual level (community user view)
• Internal level (storage view)
• Database

An example of the three levels
External View 1:  SNo  FName  LName  Age  Salary
External View 2:  SNo  LName  BranchNo
Conceptual View:  SNo  FName  LName  Age  Salary  BranchNo
Internal View:
    struct STAFF {
        int staffNo;
        int branchNo;
        char fName[15];
        char lName[15];
        struct date dateOfBirth;
        float salary;
        struct STAFF *next;   /* pointer to next Staff record */
    };
    index staffNo; index branchNo;   /* define indexes for staff */

Schema
Schema: Description of data in terms of a data model
Three-level DB Architecture defines the following schemas:
• External Schema (or sub-schema): written using external DDL
• Conceptual Schema (or schema): written using conceptual DDL
• Internal Schema: written using internal DDL or storage structure definition

Data Independence
The ability to change the schema at one level of a database system without needing to change the schema at the next higher level.
• Logical data independence: the immunity of the external schemas to changes in the conceptual schema, e.g., adding a new record or field.
• Physical data independence: the immunity of the conceptual schema to changes in the internal schema, e.g., adding a new index should not void existing ones.
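
A minimal sketch of both kinds of data independence, using Python's built-in sqlite3 module. The table staff, the view staff_names, the index name and the sample row are hypothetical, chosen only for illustration. The view plays the role of an external schema: it should return the same rows after a field is added to the table (a conceptual change) and after an index is created (an internal change).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (sno INTEGER PRIMARY KEY, fname TEXT, lname TEXT, salary REAL)"
)
conn.execute("INSERT INTO staff VALUES (1, 'Asha', 'Rao', 30000.0)")  # hypothetical row

# External schema: a view exposing only part of the conceptual schema.
conn.execute("CREATE VIEW staff_names AS SELECT sno, lname FROM staff")
before = conn.execute("SELECT * FROM staff_names").fetchall()

# Conceptual-schema change: add a new field to the table.
conn.execute("ALTER TABLE staff ADD COLUMN branch_no INTEGER")
# Internal-schema change: add a new index.
conn.execute("CREATE INDEX idx_staff_lname ON staff (lname)")

after = conn.execute("SELECT * FROM staff_names").fetchall()
print(before == after)
```

The view's query did not have to be rewritten after either change, which is the property the slide describes.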

TYPES OF DATABASE MODELS
• HIERARCHICAL
• NETWORK
• RELATIONAL: TABLE, ROW, COLUMN, VALUE

DATABASE DESIGN PHASES
• DATA ANALYSIS: Entities - Attributes - Relationships - Integrity Rules
• LOGICAL DESIGN: Tables - Columns - Primary Keys - Foreign Keys
• PHYSICAL DESIGN: DDL for Tablespaces, Tables, Indexes

Introduction to Relational Databases: RDBMS

Definition: RDBMS
It is a system in which, at a minimum:
• The data is perceived by the user as tables (and nothing but tables); and
• The operators at the user's disposal (e.g., for data retrieval) are operators that generate new tables from old, and those include at least SELECT, PROJECT, and JOIN.

Features of an RDBMS
• The ability to create multiple relations (tables) and enter data into them
• An interactive query language
• Retrieval of information stored in more than one table
• Provides a Catalog or Dictionary, which itself consists of tables (called system tables)

Some Important Terms
• Relation: a table
• Tuple: a row in a table
• Attribute: a column in a table
• Degree: number of attributes
• Cardinality: number of tuples
• Primary Key: a unique identifier for each row of the table
• Domain: a pool of values from which specific attributes of specific relations draw their values

Properties of Relations (Tables)
• There are no duplicate rows (tuples)
• Tuples are unordered, top to bottom
• Attributes are unordered, left to right
• All attribute values are atomic (or scalar)
• Relational databases do not allow repeating groups

Keys
• Key
• Super Key
• Candidate Keys
• Primary Key
• Alternate Key
• Secondary Keys

Keys and Referential Integrity
Enrolled
sid     cid           grade
53666   carnatic101   C
53688   reggae203     B
53650   topology112   A
53666   history105    B
Student
sid     name    login        age   gpa
53666   Jones   Jones@cs     18    3.4
53688   Smith   Smith@eecs   18    3.2
53650   Smith   Smith@math   19    3.8
sid in Student is the primary key. sid in Enrolled is a foreign key referring to sid of the STUDENT relation.
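
A small sketch of how a DBMS can enforce this referential integrity, using Python's built-in sqlite3 module. The column types and the composite primary key (sid, cid) on Enrolled are assumptions; the slide only names the columns. The sid value 99999 is a made-up student who does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE Student (sid INTEGER PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)
conn.execute(
    """CREATE TABLE Enrolled (
           sid INTEGER, cid TEXT, grade TEXT,
           PRIMARY KEY (sid, cid),                      -- assumed key
           FOREIGN KEY (sid) REFERENCES Student (sid))"""
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?, ?)",
    [(53666, "Jones", "Jones@cs", 18, 3.4),
     (53688, "Smith", "Smith@eecs", 18, 3.2),
     (53650, "Smith", "Smith@math", 19, 3.8)],
)
conn.execute("INSERT INTO Enrolled VALUES (53666, 'carnatic101', 'C')")  # valid reference

try:
    # 99999 is not a sid in Student, so this row would dangle.
    conn.execute("INSERT INTO Enrolled VALUES (99999, 'history105', 'B')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```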

Relational Algebra

Relational Query Languages
• Query languages: Allow manipulation and retrieval of data from a database.
• Relational model supports simple, powerful QLs:
  • Strong formal foundation based on logic.
  • Allows for much optimization.
• Query Languages != programming languages!

Example Instances
R1
sid   bid   day
22    101   10/10/99
58    103   11/12/99
S1
sid   sname    rating   age
22    Deepa    7        45.0
31    Laxmi    8        55.5
58    Roopa    10       35.0
S2
sid   sname    rating   age
28    Yamuna   9        35.0
31    Laxmi    8        55.5
44    Geeta    5        35.0
58    Roopa    10       35.0

Relational Algebra
Basic operations:
• Selection (σ)
• Projection (π)
• Cross-product (×)
• Set-difference (−)
• Union (∪)

Projection
π sname, rating (S2)
sname    rating
Yamuna   9
Laxmi    8
Geeta    5
Roopa    10
π age (S2)
age
35.0
55.5

Selection
σ rating > 8 (S2)
sid   sname    rating   age
28    Yamuna   9        35.0
58    Roopa    10       35.0
π sname, rating (σ rating > 8 (S2))
sname    rating
Yamuna   9
Roopa    10
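
The selection and projection results can be reproduced with a short Python sketch. The helper names select and project are illustrative, not part of any library; rows are modelled as dictionaries, and project removes duplicates because a relation is a set of tuples.

```python
S2 = [
    {"sid": 28, "sname": "Yamuna", "rating": 9, "age": 35.0},
    {"sid": 31, "sname": "Laxmi", "rating": 8, "age": 55.5},
    {"sid": 44, "sname": "Geeta", "rating": 5, "age": 35.0},
    {"sid": 58, "sname": "Roopa", "rating": 10, "age": 35.0},
]

def select(rel, pred):
    """Selection: keep the rows for which the predicate is true."""
    return [row for row in rel if pred(row)]

def project(rel, attrs):
    """Projection: keep only the listed attributes and drop duplicate rows."""
    seen, out = set(), []
    for row in rel:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

high = select(S2, lambda r: r["rating"] > 8)
names = project(high, ["sname", "rating"])
ages = project(S2, ["age"])
print(names)
print(ages)
```

Projecting S2 on age yields two rows rather than four, since 35.0 occurs three times and duplicates are eliminated.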

Union, Intersection, Set Difference
S1 ∪ S2
sid   sname    rating   age
22    Deepa    7        45.0
31    Laxmi    8        55.5
58    Roopa    10       35.0
44    Geeta    5        35.0
28    Yamuna   9        35.0
S1 − S2
sid   sname    rating   age
22    Deepa    7        45.0
S1 ∩ S2
sid   sname    rating   age
31    Laxmi    8        55.5
58    Roopa    10       35.0
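
Because relations are sets of tuples, these three operations map directly onto Python's built-in set operators. A minimal sketch, with each row written as a (sid, sname, rating, age) tuple:

```python
S1 = {(22, "Deepa", 7, 45.0), (31, "Laxmi", 8, 55.5), (58, "Roopa", 10, 35.0)}
S2 = {(28, "Yamuna", 9, 35.0), (31, "Laxmi", 8, 55.5),
      (44, "Geeta", 5, 35.0), (58, "Roopa", 10, 35.0)}

union = S1 | S2          # rows in S1 or S2 (duplicates appear once)
intersection = S1 & S2   # rows in both S1 and S2
difference = S1 - S2     # rows in S1 that are not in S2

print(len(union), sorted(intersection), sorted(difference))
```

All three operations require the two relations to be union-compatible, that is, to have the same number of attributes with matching domains.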

Cross-Product
S1 × R1
(sid)   sname   rating   age    (sid)   bid   day
22      Deepa   7        45.0   22      101   10/10/99
22      Deepa   7        45.0   58      103   11/12/99
31      Laxmi   8        55.5   22      101   10/10/99
31      Laxmi   8        55.5   58      103   11/12/99
58      Roopa   10       35.0   22      101   10/10/99
58      Roopa   10       35.0   58      103   11/12/99

Joins
Condition Join: a cross-product followed by a selection on the join condition.
(sid)   sname   rating   age    (sid)   bid   day
22      Deepa   7        45.0   22      101   10/10/99
31      Laxmi   8        55.5   58      103   11/12/99

Equi-Join
(sid)   sname   rating   age    bid   day
22      Deepa   7        45.0   101   10/10/99
58      Roopa   10       35.0   103   11/12/99
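
The cross-product and the joins can be sketched with itertools.product. The helper theta_join is an illustrative name, and the condition S1.sid < R1.sid is an assumed example of a join condition, not one taken from the slides. The equi-join on sid keeps a single copy of the shared sid column.

```python
from itertools import product

S1 = [(22, "Deepa", 7, 45.0), (31, "Laxmi", 8, 55.5), (58, "Roopa", 10, 35.0)]
R1 = [(22, 101, "10/10/99"), (58, 103, "11/12/99")]

# Cross-product: every S1 row paired with every R1 row.
cross = [s + r for s, r in product(S1, R1)]

def theta_join(left, right, cond):
    """Condition join: cross-product followed by a selection on cond."""
    return [l + r for l, r in product(left, right) if cond(l, r)]

# Assumed example condition: S1.sid < R1.sid (sid is column 0 in both).
less_than = theta_join(S1, R1, lambda s, r: s[0] < r[0])

# Equi-join on sid, dropping the duplicate sid column from R1.
equi = [s + r[1:] for s, r in product(S1, R1) if s[0] == r[0]]

print(len(cross), len(less_than), equi)
```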

Division
A
sno   pno
s1    p1
s1    p2
s1    p3
s1    p4
s2    p1
s2    p2
s3    p2
s4    p2
s4    p4
B1 (pno): p2
B2 (pno): p2, p4
B3 (pno): p1, p2, p4
A/B1 (sno): s1, s2, s3, s4
A/B2 (sno): s1, s4
A/B3 (sno): s1
• Not supported as a primitive operator, but useful for expressing queries like:
• Find sailors who have reserved all boats.
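
Division can be sketched in a few lines of Python: A/B contains every sno that is paired in A with all pno values in B. The function name divide is illustrative.

```python
A = {("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
     ("s2", "p1"), ("s2", "p2"),
     ("s3", "p2"),
     ("s4", "p2"), ("s4", "p4")}

B1 = {"p2"}
B2 = {"p2", "p4"}
B3 = {"p1", "p2", "p4"}

def divide(a, b):
    """Return every sno that appears in a together with each pno in b."""
    candidates = {sno for sno, _ in a}
    return {sno for sno in candidates if all((sno, pno) in a for pno in b)}

print(sorted(divide(A, B1)), sorted(divide(A, B2)), sorted(divide(A, B3)))
```

The query "find sailors who have reserved all boats" has the same shape: divide the (sailor, boat) reservation pairs by the set of all boats.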

Introduction to Query Optimization

Processing a High-level Query
Typical steps when processing a high-level query:
Query in a high-level language
  -> SCANNING, PARSING AND VALIDATING -> Intermediate form of query
  -> QUERY OPTIMIZER -> Execution plan
  -> QUERY CODE GENERATOR -> Code to execute the query
  -> RUNTIME DATABASE PROCESSOR -> Result of query

Two Main Techniques for Query Optimization
• Heuristic Rules: A heuristic is a rule that works well in most cases, but not always. General idea:
  • Many different relational algebra expressions (and thus query trees) are equivalent.
  • Transform the initial query tree of a query into an equivalent final query tree that is efficient to execute.
• Cost-based query optimization:
  • Estimate the cost for each execution plan, and choose the one with the lowest cost.
  • Can we get the best execution plan?

Motivating Example
    select *
    from R1, R2, R3
    where R1.r2no = R2.r2no
      and R2.r3no = R3.r3no
      and R1.a = 5000
Plan: NLJ( NLJ( SS(R2), SS(R3) ), SS(R1, "a=5000") )

Alternative Plans 1 (No Indexes)
    select *
    from R1, R2, R3
    where R1.r2no = R2.r2no
      and R2.r3no = R3.r3no
      and R1.a = 5000
Plan: NLJ( NLJ( SS(R1, "a=5000"), SS(R2) ), SS(R3) )

Alternative Plans 2 (With Indexes)
    select *
    from R1, R2, R3
    where R1.r2no = R2.r2no
      and R2.r3no = R3.r3no
      and R1.a = 5000
Plan: NLJ( NLJ( IS(R1, "a=5000"), SS(R2) ), SS(R3) )
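
A rough sketch of why the choice of plan matters. It assumes NLJ stands for a nested-loop join and SS for a sequential scan, and it uses the number of tuple comparisons as a crude stand-in for cost. The relation contents and sizes are made up for illustration. Plan A joins R2 and R3 first and applies the filtered R1 last; plan B starts from the filtered R1.

```python
def nested_loop_join(outer, inner, cond, counter):
    """Join two lists of dict rows; counter[0] counts tuple comparisons."""
    out = []
    for o in outer:
        for i in inner:
            counter[0] += 1
            if cond(o, i):
                out.append({**o, **i})
    return out

# Hypothetical data: 100 R1 rows (two with a = 5000), 20 R2 rows, 10 R3 rows.
R1 = [{"r1no": k, "r2no": k % 20, "a": 5000 if k % 50 == 0 else k} for k in range(100)]
R2 = [{"r2no": j, "r3no": j % 10} for j in range(20)]
R3 = [{"r3no": m, "c": m * m} for m in range(10)]

def on_r2no(x, y):
    return x["r2no"] == y["r2no"]

def on_r3no(x, y):
    return x["r3no"] == y["r3no"]

selected_r1 = [r for r in R1 if r["a"] == 5000]  # scan of R1 with the filter a = 5000

# Plan A: join R2 with R3 first, then join the result with the filtered R1.
cost_a = [0]
r23 = nested_loop_join(R2, R3, on_r3no, cost_a)
result_a = nested_loop_join(r23, selected_r1, on_r2no, cost_a)

# Plan B: join the filtered R1 with R2 first, then with R3.
cost_b = [0]
r12 = nested_loop_join(selected_r1, R2, on_r2no, cost_b)
result_b = nested_loop_join(r12, R3, on_r3no, cost_b)

def by_r1no(row):
    return row["r1no"]

same = sorted(result_a, key=by_r1no) == sorted(result_b, key=by_r1no)
print(cost_a[0], cost_b[0], same)
```

Both plans return the same rows, but starting from the small filtered input needs far fewer comparisons. A real optimizer would estimate costs from statistics such as relation sizes and index availability rather than by running the plans.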

Conceptual Design Using the Entity-Relationship Model

Overview of Database Design
• Conceptual design: (ER Model is used at this stage.)
• Schema Refinement: (Normalization)
• Physical Database Design and Tuning

E R Modeling
• Conceptual Schema Design
• Relational Calculus: formal language for relational databases.
[Figure: Relational Calculus divides into Predicate Calculus (tuple based, SQL) and Domain Calculus (Query By Example).]

Design Phases...
• Requirements Collection & Analysis
  • Data Requirements
  • Functional Requirements: user-defined operations, data flow diagrams, sequence diagrams, scenarios
• Conceptual Design: entity types, constraints, relationships. No implementation details. Ensures the design meets the requirements.
• Logical Design: data model mapping. The type of database is identified.
• Physical Design: internal storage structures / access paths / file organizations

E-R Modeling
• Entity: anything that exists and is distinguishable
• Entity Set: a group of similar entities
• Attribute: properties that describe an entity
• Relationship: an association between entities

Notations
ENTITY TYPE ( REGULAR )
WEAK ENTITY TYPE
RELATIONSHIP TYPE
WEAK RELATIONSHIP TYPE

    CREATE TABLE Employees (
        ssn  CHAR(11),
        name CHAR(20),
        lot  INTEGER,
        PRIMARY KEY (ssn)
    )
Employee (entity set); each row is one entity, each column is an attribute.
SSN           NAME        LOT
123-22-3666   Attishoo    48
231-31-5368   Smiley      22
131-24-3650   Smethurst   35

Types of Relationships
• 1:1  student "is issued" ID card
• 1:M  students "enrol in" course
• M:M  students "take" tests
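
ER Model
[Figure: ER diagram. Employee (ssn, name, lot) and Department (did, dname, budget) are connected by the relationship Works_in (since). Employee also takes part in the relationship Reports_To, in the roles supervisor and subordinate.]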

ER Model (Contd.)
    CREATE TABLE Works_In (
        ssn   CHAR(11),
        did   INTEGER,
        since DATE,
        PRIMARY KEY (ssn, did),
        FOREIGN KEY (ssn) REFERENCES Employees,
        FOREIGN KEY (did) REFERENCES Departments
    )
Works_In
SSN           DID   SINCE
123-22-3666   51    1/1/91
123-22-3666   56    3/3/93
231-31-5368   51    2/2/92
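
A runnable sketch of the Works_In table using Python's built-in sqlite3 module. The Departments table definition, the department names and budgets, and the date 5/5/95 are assumptions added so that the foreign keys have something to refer to. The last insert shows the effect of the composite primary key (ssn, did): the same employee cannot be recorded twice for the same department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn))")
# Assumed definition: the slides show only the attribute names did, dname, budget.
conn.execute("CREATE TABLE Departments (did INTEGER, dname CHAR(20), budget REAL, PRIMARY KEY (did))")
conn.execute(
    """CREATE TABLE Works_In (
           ssn CHAR(11), did INTEGER, since DATE,
           PRIMARY KEY (ssn, did),
           FOREIGN KEY (ssn) REFERENCES Employees,
           FOREIGN KEY (did) REFERENCES Departments)"""
)
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)",
                 [("123-22-3666", "Attishoo", 48), ("231-31-5368", "Smiley", 22)])
conn.executemany("INSERT INTO Departments VALUES (?, ?, ?)",
                 [(51, "Dept51", 1000.0), (56, "Dept56", 2000.0)])  # hypothetical values
conn.executemany("INSERT INTO Works_In VALUES (?, ?, ?)",
                 [("123-22-3666", 51, "1/1/91"),
                  ("123-22-3666", 56, "3/3/93"),
                  ("231-31-5368", 51, "2/2/92")])

try:
    # Same (ssn, did) pair as the first row, with a different start date.
    conn.execute("INSERT INTO Works_In VALUES ('123-22-3666', 51, '5/5/95')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```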

Key Constraints
[Figure: ER diagram. Employee (ssn, name, lot) and Department (did, dname, budget) are connected by the relationship Manages (since).]

Key Constraints for Ternary Relationships
[Figure: ER diagram. The relationship Works_in (since) connects Employee (ssn, name, lot), Department (did, dname, budget) and Location (address, capacity).]

Participation Constraints
[Figure: ER diagram. Employee (ssn, name, lot) and Department (did, dname, budget) are connected by two relationships, Manages (since) and Works_in (since).]

Weak Entities
[Figure: ER diagram. Employee (ssn, name, lot) is connected through the relationship Policy (cost) to the weak entity Dependent (pname, age).]

ISA ('is a') Hierarchies
[Figure: ER diagram. Employee (ssn, name, lot) is specialized through ISA into Hourly_Emp (Hrs_worked, Hrly_wages) and Contract_Emp (contractid).]
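
Aggregation
[Figure: ER diagram. The relationship Sponsors connects Project (pid, pbudget, started on) and Department (did, dname, budget). Employee (ssn, name, lot) is connected by the relationship Monitors (until) to the Sponsors relationship treated as a whole.]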

Entity vs. Attribute
Works_In does not allow an employee to work in a department for two or more periods (why?)
[Figure: ER diagram. Employee (ssn, name, lot) and Department (did, dname, budget) are connected by Works_in, which carries the attributes from and to.]

Entity vs. Attribute (Contd.)
[Figure: ER diagram. Works_in now connects Employee (ssn, name, lot), Department (did, dname, budget) and a separate entity Duration (from, to).]
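
Entity vs. Relationship
[Figure: ER diagram. Employee (ssn, name, lot) and Department (did, dname, budget) are connected by Manages, which carries the attributes since and DB, where DB stands for Dbudget.]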
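
Entity vs. Relationship
[Figure: ER diagram. Employee (ssn, name, lot), Department (did, dname, budget) and the relationship Manages, with the labels since, DBudget and Appt num shown alongside a new entity Mgr_appt.]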

Binary vs. Ternary Relationships
[Figure: ER diagram. The ternary relationship Covers connects Employee (ssn, name, lot), Policy (policyid, cost) and Dependent (pname, age).]
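
Binary vs. Ternary Relationships
Better Design
[Figure: ER diagram. Two binary relationships replace the ternary one: Purchaser connects Employee (ssn, name, lot) and Policy (policyid, cost); Beneficiary connects Policy and Dependent (pname, age).]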

Constraints Beyond the ER Model
• Some constraints cannot be captured in ER diagrams:
• Functional dependencies
• Inclusion dependencies
• General constraints
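
E-R Diagram
[Figure: E-R diagram with the entities DEPARTMENT, EMPLOYEE, DEPENDENT, PROJECT, SUPPLIER and PART, and the relationships DEPT_EMP, EMP_DEP, PROJ_WORK, PROJ_MGR, SUPP_PART_PROJ, SUPP_PART and PART_STRUCTURE. Each connection is labelled 1 or M; the individual labels are not recoverable here.]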

Example to Start with ...
• An example database application called COMPANY, which serves to illustrate the ER Model concepts and their schema design.
• The following requirements were collected from the client.

Analysis...
Company:
Organized into departments. Each department has a name, a number and a manager who manages the department. The company keeps track of the date when that employee started managing the department. A department may have several locations.

Analysis...
Department:
A department controls a number of projects, each of which has a unique name, a number and a single location.
Employee:
Name, Age, Gender, BirthDate, SSN, Address, Salary. An employee is assigned to one department, and may work on several projects, which are not necessarily controlled by that department. The number of hours per week that an employee works on each project is also tracked.

Analysis...
Keep track of the dependents of each employee for insurance policies: for each dependent we keep the first name, gender, date of birth and relationship to the employee.

Now to our Company...
DEPARTMENT (Name, Number, {Locations}, Manager, Start Date)
PROJECT (Name, Number, Location, Controlling Department)
EMPLOYEE (Name (Fname, Lname), SSN, Gender, Address, Salary, Birthdate, Department, Supervisor, (Workson (Project, Hrs)))
DEPENDENT (Employee, Name, Gender, Birthdate, Relationship)

Example ...
Manage:
• Department and Employee
• Partial Participation
• Relation Attribute: StartDate.
Works For:
• Department and Employee
• Total Participation

Example...
Control:
• Department, Project
• Partial Participation from Department
• Total Participation from Project
• Control Department is a RKA.
Supervisor:
• Employee, Employee
• Partial and Recursive

Example ...
Works-On:
• Project, Employee
• Total Participation
• Hours Worked is a RKA.
Dependents Of:
• Employee, Dependent
• Dependent is a weak entity
• Dependent participation is total, Employee participation is partial.

One Possible Mapping of the Problem Statement
[Figure: ER diagram of the COMPANY example. Entities: Employee (Name (Fname, Lname), SSN, Sex, Address, Sal, Bdate), Department (Name, No, Loc), Project (Name, No, Loc) and Dependent (Name, Sex, Bdate, Relationship). Relationships: Works For, Manages (Sdate), Controls, WorksOn (Hours), Supervises and Depend On.]

Schema Refinement and Normalization

Normalization and Normal Forms
Normalization:
• Decomposing a larger, complex table into several smaller, simpler ones.
• Moving from a lower normal form to a higher normal form.
Normal Forms:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Higher Normal Forms (BCNF, 4NF, 5NF, ...)
In practice, 3NF is often good enough.

Why Normal Forms
• The first question to ask is whether any refinement is needed!
• If a relation is in a certain normal form (BCNF, 3NF etc.), it is known that certain kinds of problems are avoided or minimized. This can be used to help us decide whether decomposing the relation will help.

The Evils of Redundancy
• Redundancy is at the root of several problems associated with relational schemas:
  • Wastage of storage.
  • More seriously, data redundancy causes several anomalies: insert, update, delete.
• Main refinement technique: decomposition (replacing ABCD with, say, AB and BCD, or ACD and ABD).

Refining an ER Diagram - Before
[Figure: ER diagram. Employee (ssn, name, lot) and Department (did, dname, budget) are connected by Works_in (since).]

Refining an ER Diagram - After
[Figure: ER diagram. Employee (ssn, name) and Department (did, dname, budget) are connected by Works_in (since); the attribute lot is drawn apart from Employee's other attributes.]

First Normal Form
• A table is in 1NF if every row contains exactly one value for each attribute.
• Disallow multivalued attributes, composite attributes and their combinations.
• 1NF states that the domains of attributes must include only atomic (simple, indivisible) values, and that the value of any attribute in a tuple must be a single value from the domain of that attribute.
• By definition, any relational table must be in 1NF.

Functional Dependencies (FDs)
• Provide a formal mechanism to express constraints between attributes.
• Given a relation R, attribute Y of R is functionally dependent on attribute X of R if and only if each X-value in R has associated with it precisely one Y-value in R.
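
The definition translates into a simple check over a relation instance: scan the rows and confirm that no X-value is seen with two different Y-values. A minimal sketch, with an illustrative helper name holds and sample rows over attributes named S, N, L, R, W, H. Note that an instance can only refute a functional dependency; a dependency is a statement about all legal instances, so passing the check on one instance does not prove it.

```python
rows = [
    {"S": "123-22-3666", "N": "Attishoo", "L": 48, "R": 8, "W": 10, "H": 40},
    {"S": "231-31-5368", "N": "Smiley", "L": 22, "R": 8, "W": 10, "H": 30},
    {"S": "131-24-3650", "N": "Smethurst", "L": 35, "R": 5, "W": 7, "H": 30},
    {"S": "434-26-3751", "N": "Guldu", "L": 35, "R": 5, "W": 7, "H": 32},
    {"S": "612-67-4134", "N": "Madayan", "L": 35, "R": 8, "W": 10, "H": 40},
]

def holds(rel, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rel:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True

print(holds(rows, ["R"], ["W"]))   # each R value has one W value in this instance
print(holds(rows, ["L"], ["R"]))   # L = 35 appears with R = 5 and R = 8
print(holds(rows, ["S"], ["N", "L", "R", "W", "H"]))
```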

Full Dependency
• Concept of full functional dependency
• An FD X → Y is a full functional dependency if removal of any attribute A from X means that the dependency does not hold any more.

Partial Dependency
• An FD X → Y is a partial dependency if there is some attribute A ∈ X that can be removed from X and the dependency will still hold.

Example: Constraints on Entity Set
Original relation
S             N           L    R   W    H
123-22-3666   Attishoo    48   8   10   40
231-31-5368   Smiley      22   8   10   30
131-24-3650   Smethurst   35   5   7    30
434-26-3751   Guldu       35   5   7    32
612-67-4134   Madayan     35   8   10   40
After decomposition
R   W
8   10
5   7
S             N           L    R   H
123-22-3666   Attishoo    48   8   40
231-31-5368   Smiley      22   8   30
131-24-3650   Smethurst   35   5   30
434-26-3751   Guldu       35   5   32
612-67-4134   Madayan     35   8   40

Second Normal Form (2NF)
• A relation schema R is in 2NF if:
  • it is in 1NF, and
  • every non-prime attribute A in R is fully functionally dependent on the primary key of R.
• 2NF prohibits partial dependencies.

2NF: An Example
• Emp {Eno, Dept, ProjCode, Hours}
  • Primary key: {Eno, ProjCode}
  • {Eno} -> {Dept}, {Eno, ProjCode} -> {Hours}
• Test of 2NF
  • {Eno} -> {Dept}: partial dependency.
  • Emp is in 1NF, but not in 2NF.
• Decomposition:
  • Emp {Eno, Dept}, with key {Eno}
  • Proj {Eno, ProjCode, Hours}, with key {Eno, ProjCode}
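
A short sketch of this decomposition on a made-up instance of Emp (the row values are hypothetical). It projects Emp onto the two smaller schemas and then joins them back on Eno, showing that no information is lost and that Dept is now stored once per employee instead of once per project assignment.

```python
# Hypothetical instance of Emp {Eno, Dept, ProjCode, Hours}; Eno -> Dept holds.
emp = {
    (1, "Sales", "P1", 10),
    (1, "Sales", "P2", 5),
    (2, "HR", "P1", 8),
}

# Decomposition: Emp {Eno, Dept} and Proj {Eno, ProjCode, Hours}.
emp_dept = {(eno, dept) for eno, dept, _, _ in emp}
proj = {(eno, code, hours) for eno, _, code, hours in emp}

# Natural join of the two projections on Eno.
rejoined = {
    (eno, dept, code, hours)
    for eno, dept in emp_dept
    for eno2, code, hours in proj
    if eno == eno2
}

print(rejoined == emp, len(emp_dept))
```

Employee 1 appears in two Emp rows, so Sales was repeated there; after decomposition it appears once in emp_dept.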

Transitive Dependency
• An FD X → Y in a relation schema R is a transitive dependency if
  • there is a set of attributes Z that is not a subset of any key of R, and
  • both X → Z and Z → Y hold.

Third Normal Form
• A relation schema R is in 3NF if
  • it is in 2NF, and
  • no nonprime attribute of R is transitively dependent on the primary key.
• 3NF means that each non-key attribute value in any tuple is truly dependent on the primary key and not even partially on other attributes.
• 3NF prohibits transitive dependencies.

3NF: An Example
• Emp {Eno, Dept, Dept_Head}
  • Primary key: {Eno}
  • {Eno} -> {Dept}, {Dept} -> {Dept_Head}
• Test of 3NF
  • {Eno} -> {Dept} -> {Dept_Head}: transitive dependency.
  • Emp is in 2NF, but not in 3NF.
• Decomposition:
  • Emp {Eno, Dept}
  • Dept {Dept, Dept_Head}

Boyce-Codd Normal Form
The motivation for BCNF is that 3NF does not satisfactorily handle the case of a relation possessing two or more composite or overlapping candidate keys.

BCNF (Boyce-Codd Normal Form)
A relation is said to be in Boyce-Codd Normal Form (BCNF) if and only if every determinant is a candidate key.
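
One common way to test this condition is to compute attribute closures: a determinant X is a superkey of R exactly when the closure of X under the given FDs contains every attribute of R. The sketch below uses that superkey form of the test, which treats trivial FDs as harmless; the helper names closure and is_bcnf are illustrative. It is applied to Emp {Eno, Dept, Dept_Head} with the FDs Eno -> Dept and Dept -> Dept_Head, and to the two relations obtained by splitting it.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_bcnf(all_attrs, fds):
    """Every non-trivial FD must have a superkey as its determinant."""
    return all(
        set(rhs) <= set(lhs) or closure(lhs, fds) == set(all_attrs)
        for lhs, rhs in fds
    )

emp_attrs = {"Eno", "Dept", "Dept_Head"}
emp_fds = [({"Eno"}, {"Dept"}), ({"Dept"}, {"Dept_Head"})]

print(closure({"Eno"}, emp_fds) == emp_attrs)                  # Eno is a key
print(is_bcnf(emp_attrs, emp_fds))                              # Dept is not a key
print(is_bcnf({"Eno", "Dept"}, [({"Eno"}, {"Dept"})]))
print(is_bcnf({"Dept", "Dept_Head"}, [({"Dept"}, {"Dept_Head"})]))
```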

Decomposition of a Relation Scheme
Suppose that relation R contains attributes A1 ... An. A decomposition of R consists of replacing R by two or more relations such that:
• Each new relation scheme contains a subset of the attributes of R (and no attributes that do not appear in R), and
• Every attribute of R appears as an attribute of one of the new relations.

Transaction, Concurrency Control and Recovery

Transaction
• A sequence of many actions which are considered to be one atomic unit of work.
  • Read, write, commit, abort
• Governed by four ACID properties:
  • Atomicity, Consistency, Isolation, Durability
• Has a unique starting point, some actions and one end point

The ACID Properties
• Atomicity: All actions in the transaction happen, or none happen.
• Consistency: If each transaction is consistent, and the DB starts consistent, it ends up consistent.
• Isolation: Execution of one transaction is isolated from that of other transactions.
• Durability: If a transaction commits, its effects persist.

Atomicity
• All-or-nothing, no partial results. An event either happens and is committed, or fails and is rolled back.
  • e.g. in a money transfer, debit one account, credit the other. Either both debiting and crediting operations succeed, or neither of them does.
• Transaction failure is called Abort.
• Commit and abort are irrevocable actions. There is no undo for these actions.
• An Abort undoes operations that have already been executed.
  • For database operations, restore the data's previous value from before the transaction (roll it back); a Rollback command will undo all actions taken since the last commit for that user.
  • But some real-world operations are not undoable. Examples: transfer money, print ticket, fire missile.
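
The money-transfer case can be sketched with Python's built-in sqlite3 module. The account table, its CHECK constraint, the balances and the transfer function are all hypothetical. The function credits the destination first and debits the source second, so that a failed debit leaves a partial update which the rollback must undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one unit of work; undo both on failure."""
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # also undoes the credit that already succeeded
        return False

ok = transfer(conn, "A", "B", 30)        # A has enough money
failed = transfer(conn, "A", "B", 500)   # debit would make A negative
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

After the failed transfer, B's balance does not include the 500 that was briefly credited: the transaction's effects are all-or-nothing.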

Consistency
• Every transaction should maintain DB consistency
  • Referential integrity, e.g. each order references an existing customer number and existing part numbers
  • The books balance (debits = credits, assets = liabilities)
• Consistency preservation is a property of a transaction, not of the database mechanisms for controlling it (unlike the A, I, and D of ACID)
• If each transaction maintains consistency, then a serial execution of transactions does also
112
Isolation
Intuitively, the effect of a set of transactions should be the same as if they ran independently.
Formally, an interleaved execution of transactions is serializable if its effect is equivalent to a serial one.
Implies a user view where the system runs each user’s transaction stand-alone.
Of course, transactions in fact run with lots of concurrency, to use device parallelism; this will be covered later.
Transactions can use common data (shared data)
They can use the same data processing mechanisms (time sharing)
113
Durability
When a transaction commits, its results will survive failures (e.g. of the application, OS, DB system … even of the disk).
Makes it possible for a transaction to be a legal contract.
Implementation is usually via a log
DB system writes all transaction updates to a log file
to commit, it adds a record “commit(Ti)” to the log
when the commit record is on disk, the transaction is committed.
system waits for disk ack before acknowledging to user
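The commit rule can be sketched in a few lines of Python. The record format, the file name and the helper names are assumptions for illustration; a real DB system uses its own log format and I/O path.

```python
import os
import tempfile

def commit(log_path, txn_id, updates):
    """Append the updates and a commit record, then wait for the disk."""
    with open(log_path, "a", encoding="utf-8") as log:
        for item, value in updates:
            log.write(f"update({txn_id},{item},{value})\n")
        log.write(f"commit({txn_id})\n")
        log.flush()              # push Python's buffer to the OS
        os.fsync(log.fileno())   # ask the OS to push it to the device
    return True                  # only now acknowledge to the user

def committed_transactions(log_path):
    """Scan the log for commit records, as a restart after a crash would."""
    with open(log_path, encoding="utf-8") as log:
        return [line.strip()[len("commit("):-1]
                for line in log if line.startswith("commit(")]

log_path = os.path.join(tempfile.mkdtemp(), "txn.log")
acked = commit(log_path, "T1", [("A", 8)])
survivors = committed_transactions(log_path)
```

The acknowledgement is returned only after `os.fsync`, which mirrors "system waits for disk ack before acknowledging to user".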
114
Transaction processing
Can be automatic (controlled by the RDBMS) or programmatic (programmed using SQL or other supported programming languages, like PL/SQL)
115
Why Have Concurrent Processes?
Better transaction throughput
Improved response time
Done via better utilization of resources:
While one process is doing a disk read, another can be using the CPU or reading another disk.
116
Typical situations requiring concurrency control
Exclusive access to an external device or shared service (e.g., managing printer queues)
Coordination of applications which process data in parallel (e.g. parallel DB servers)
Disabling or enabling execution of the client programs at a specific moment (typically for database administration - e.g. database backups, enforcing resource occupation, etc.)
Detection of transaction ends when managing multiple sessions for connection to the database (client/server architectures, Web access)
117
Problems with Concurrency (in absence of locking)
Lost Update problem - an update is lost because a write from another overlapping transaction overwrites it
Temporary Update problem - a transaction reads a value written by an overlapping transaction that later rolls back, so it works from a change that was discarded
Incorrect Summary problem - a calculation over several values reads some of them before and some after they are written by another transaction
118
Lost Update Problem
Time  Transaction A     Transaction B     Value
T0    Start A                             6
T1    Read Value (6)    Start B           6
T2    Add 2 (6+2=8)     Read Value (6)    6
T3    Write Value (8)   Add 3 (6+3=9)     8
T4    End A             Write Value (9)   9
T5                      End B             9
What should the final Order Value be?
Which Update has been lost?
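The timeline on this slide can be replayed with a small simulation. The `run` function and the step encoding are illustrative assumptions: each transaction works on a private copy of the value between its read and its write.

```python
def run(schedule, value):
    """Execute (txn, op, arg) steps; each transaction keeps a private copy."""
    local = {}
    for txn, op, arg in schedule:
        if op == "read":
            local[txn] = value        # copy the shared value
        elif op == "add":
            local[txn] += arg         # change only the private copy
        elif op == "write":
            value = local[txn]        # overwrite the shared value
    return value

interleaved = [
    ("A", "read", None), ("A", "add", 2), ("B", "read", None),
    ("A", "write", None), ("B", "add", 3), ("B", "write", None),
]
serial = [
    ("A", "read", None), ("A", "add", 2), ("A", "write", None),
    ("B", "read", None), ("B", "add", 3), ("B", "write", None),
]
lost = run(interleaved, 6)     # 9: B overwrites A's result
expected = run(serial, 6)      # 11: both updates are applied
```

In the interleaved run B read the value before A wrote it, so B's write of 9 discards A's addition of 2; a serial run gives 11.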
119
Temporary Update Problem
Time  Transaction A        Transaction B      Value
T0    Start A                                 6
T1    Read Value (6)                          6
T2    Add 2 (8)                               6
T3    Write Value (8)      Start B            8
T4    Failure: Rollback!   Read Value (8)     8
T5    Write Value (6)      Add 3 (8+3=11)     6
T6    End A                Write Value (11)   11
T7                         End B              11
What should the final Order Value be?
Where is the temporary update?
120
Incorrect Summary Problem
Time  Transaction A         Transaction B        Values (1st, 2nd)
T0                                               6, 3
T1    Read 1st Value (6)                         6, 3
T2    Add 2 (6+2=8)                              6, 3
T3    Write 1st Value (8)                        8, 3
T4    Read 2nd Value (3)    Read 1st Value (8)   8, 3
T5    Add 2 (3+2=5)         Read 2nd Value (3)   8, 3
T6    Write 2nd Value (5)   Total Sum = 11       8, 5
What should the total Order Value be?
Which order was accumulated before update, and which after?
121
3.1 Database State and Changes
D1, D2 - Logically consistent states of the database data
T - Transaction for changing the database
t1, t2 - Absolute time before and after the transaction
State D1 (at t1) --T--> State D2 (at t2)
122
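3.2 Transaction State and Progress
States: active, partially committed, committed, aborted, terminated
BEGIN: enter the active state
READ, WRITE: remain in the active state
END: active to partially committed
COMMIT: partially committed to committed
ROLLBACK: active or partially committed to aborted
Committed and aborted transactions end in the terminated state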
A transaction reaches its commit point when all operations accessing the database are completed and the result has been recorded in the log. It then writes a [commit, <transaction-id>] entry to the log and terminates.
When a system failure occurs, search the log file for entries [start, <transaction-id>], and if there are no logged entries [commit, <transaction-id>], then undo all operations that have logged entries [write, <transaction-id>, X, old_value, new_value]
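The recovery rule just stated can be sketched as follows. The tuple encoding of log entries and the `recover` function are assumptions for illustration; only the entry kinds (start, write, commit) come from the slide.

```python
def recover(log, db):
    """Undo writes of transactions that started but never logged a commit."""
    started = {entry[1] for entry in log if entry[0] == "start"}
    committed = {entry[1] for entry in log if entry[0] == "commit"}
    losers = started - committed
    # Scan backwards so that the oldest old_value is the one restored last.
    for entry in reversed(log):
        if entry[0] == "write" and entry[1] in losers:
            _, _, item, old_value, _new_value = entry
            db[item] = old_value
    return db

log = [
    ("start", "T1"), ("write", "T1", "X", 6, 8), ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "X", 8, 11), ("write", "T2", "Y", 3, 5),
]
db_after_crash = {"X": 11, "Y": 5}
recovered = recover(log, dict(db_after_crash))
```

T1 has a commit entry, so its write of X survives; T2 has none, so X and Y are restored to the old values recorded in T2's write entries.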
123
Schedules
T1      T2
R(A)
W(A)
        R(B)
        W(B)
R(C)
W(C)
• Schedule: Actions of transactions as seen by the DBMS
124
Serializable Schedule
A schedule whose effect on the DB “state” is the same as that of some serial schedule
All serial schedules are serializable
But the reverse may not be true
125
Serializability Violations
T1: Transfer Rs.10,000 from A to B
T2: Add 6% interest to A & B
T1      T2
R(A)
W(A)
        R(A)
        W(A)
        R(B)
        W(B)
        commit
R(B)
W(B)
commit
Database is inconsistent!
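A worked example of why this interleaving is not serializable. The starting balances of Rs.50,000 in each account and the function names are assumptions chosen for illustration; the slide gives only the transfer amount and the interest rate.

```python
def interest(x):
    return x * 106 // 100            # add 6% using integer arithmetic

def serial_t1_then_t2(a, b):
    a, b = a - 10_000, b + 10_000    # T1: transfer from A to B
    return interest(a), interest(b)  # T2: interest on A and B

def serial_t2_then_t1(a, b):
    a, b = interest(a), interest(b)  # T2 first
    return a - 10_000, b + 10_000    # then T1

def interleaved(a, b):
    a = a - 10_000                   # T1: R(A) W(A)
    a, b = interest(a), interest(b)  # T2: R(A) W(A) R(B) W(B) commit
    b = b + 10_000                   # T1: R(B) W(B) commit
    return a, b

start = (50_000, 50_000)
totals = {
    "T1;T2": sum(serial_t1_then_t2(*start)),
    "T2;T1": sum(serial_t2_then_t1(*start)),
    "interleaved": sum(interleaved(*start)),
}
```

Both serial orders end with a total of 106,000, while the interleaved schedule ends with 105,400: the Rs.10,000 in transit was in neither account when T2 ran, so it earned no interest.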
126
Cascading Aborts
T1      T2
R(A)
W(A)
        R(A)
        W(A)
abort
T2 read the value of A written by T1, so when T1 aborts, T2 must be aborted as well.
127
Recoverable Schedules
Unrecoverable Schedule
T1      T2
R(A)
W(A)
        R(A)
        W(A)
        commit
abort
Recoverable Schedule
T1      T2
R(A)
W(A)
        R(A)
        W(A)
commit
        commit
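One way to check the two schedules mechanically: a schedule is recoverable if every transaction commits only after all transactions it read from have committed. The sketch assumes a schedule encoded as (transaction, action, item) tuples, and it simplifies by not restoring earlier writers after an abort.

```python
def is_recoverable(schedule):
    """schedule: list of (txn, action, item); action is R, W, commit or abort."""
    last_writer = {}     # item -> transaction that wrote it most recently
    reads_from = {}      # txn -> set of transactions whose writes it read
    committed = set()
    for txn, action, item in schedule:
        if action == "W":
            last_writer[item] = txn
        elif action == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif action == "commit":
            # Every transaction this one read from must already be committed.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True

unrecoverable = [
    ("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
    ("T2", "commit", None), ("T1", "abort", None),
]
recoverable = [
    ("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
    ("T1", "commit", None), ("T2", "commit", None),
]
results = (is_recoverable(unrecoverable), is_recoverable(recoverable))
```

In the first schedule T2 commits while T1, whose write it read, is still uncommitted, so T1's later abort cannot be handled without undoing a committed transaction.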
128
Locking
The concept of locking data items is one of the main techniques for controlling the concurrent execution of transactions.
A lock is a variable associated with a data item in the database.
Generally there is a lock for each data item in the database.
A lock describes the status of the data item with respect to possible operations that can be applied to that item
Locks are used for synchronising the access by concurrent transactions to the database items.
A transaction locks an object before using it
When an object is locked by another transaction, the requesting transaction must wait
129
Locking Granularity
A database item which can be locked could be
  a database record
  a field value of a database record
  the whole database
Trade-offs
  coarse granularity: the larger the data item size, the lower the degree of concurrency
  fine granularity: the smaller the data item size, the more locks to be managed and stored, and the more lock/unlock operations needed.
130
Locking: A Technique for Concurrency Control
Compatibility matrix for lock types X and S
        --      S       X
--      yes     yes     yes
S       yes     yes     no
X       yes     no      no
S: Shared lock
X: Exclusive lock
--: No lock
•Locks are automatically obtained by DBMS.
•Guarantees serializability when used with two-phase locking (2PL)!
131
Two-Phase Locking (2PL)
Strict 2PL:
– If T wants to read an object, it first obtains an S lock.
– If T wants to modify an object, it first obtains an X lock.
– Hold all locks until end of transaction.
– Guarantees serializability, and recoverable schedules, too! Also avoids WW problems!
2PL:
– Slight variant of strict 2PL
– Transactions can release locks before the end (commit or abort), but after releasing any lock a transaction can acquire no new locks
– Guarantees serializability
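The two rules can be expressed as checks over one transaction's sequence of lock and unlock actions. The encoding and the function names are assumptions for illustration; `end_index` stands for the number of actions issued before commit or abort.

```python
def follows_2pl(actions):
    """actions: list of ("lock" | "unlock", item) for one transaction."""
    shrinking = False
    for kind, _item in actions:
        if kind == "unlock":
            shrinking = True          # the growing phase is over
        elif kind == "lock" and shrinking:
            return False              # acquired a lock after releasing one
    return True

def follows_strict_2pl(actions, end_index):
    """Strict variant: no unlock among the first end_index actions."""
    return follows_2pl(actions) and all(
        kind != "unlock" for kind, _item in actions[:end_index]
    )

two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
checks = (
    follows_2pl(two_phase),             # True
    follows_2pl(not_two_phase),         # False
    follows_strict_2pl(two_phase, 2),   # True: both unlocks come at the end
    follows_strict_2pl(two_phase, 3),   # False: A is released before the end
)
```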
132
Handling a Lock Request
Lock Request (XID, OID, Mode)
Currently Locked?
  No: Grant Lock
  Yes: Empty Wait Queue?
    No: Put on Queue
    Yes: check Mode
      Mode==X: Put on Queue
      Mode==S: Currently X-locked?
        Yes: Put on Queue
        No: Grant Lock
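A minimal sketch of this decision procedure for a lock request (XID, OID, Mode). The `LockTable` class and its fields are hypothetical; lock release, lock upgrades and waking up queued requests are left out.

```python
from collections import deque

class LockTable:
    def __init__(self):
        self.holders = {}   # oid -> {xid: mode} for granted locks
        self.queues = {}    # oid -> deque of waiting (xid, mode) requests

    def request(self, xid, oid, mode):
        """Return "granted" or "queued" for a lock request."""
        holders = self.holders.setdefault(oid, {})
        queue = self.queues.setdefault(oid, deque())
        if not holders:                        # not currently locked
            holders[xid] = mode
            return "granted"
        if queue:                              # wait queue is not empty
            queue.append((xid, mode))
            return "queued"
        if mode == "X" or "X" in holders.values():
            queue.append((xid, mode))          # X conflicts with any lock
            return "queued"
        holders[xid] = mode                    # S request on an S-locked object
        return "granted"

table = LockTable()
outcomes = [
    table.request("T1", "A", "S"),   # A is unlocked
    table.request("T2", "A", "S"),   # S is compatible with S
    table.request("T3", "A", "X"),   # X must wait
    table.request("T4", "A", "S"),   # queue is not empty, so wait behind T3
    table.request("T5", "B", "X"),   # B is unlocked
    table.request("T6", "B", "S"),   # B is X-locked
]
```

Queuing T4 behind T3, even though S is compatible with the S locks currently held, keeps the waiting X request from being starved by later readers.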
133
134
Recovery
Occurs in case of transaction failures.
Database (DB) is restored to the most recent consistent state just before the time of failure.
To do this, the DB system needs information about changes applied by various transactions. This information is kept in the system log.
135
Recovery: Motivation
[Figure: timelines of transactions T1 to T5 relative to a crash]
•Atomicity: Undoing actions of transactions that do not commit
•Durability: Making sure all actions of committed transactions survive system crashes
•The Recovery Manager guarantees Atomicity & Durability.
136
Recovery Outline
Restore to most recent “consistent” state just before time of failure
  Use data in the log file
Catastrophic Failure
  Restore database from backup
  Replay transactions from log file
Database becomes inconsistent (non-catastrophic errors)
  Undo or Redo last transactions until consistent state is restored
137
Logging
Record REDO and UNDO information, for every update, in a log.
– Sequential writes to log (put it on a separate disk).
– Minimal info (diff) written to log, so multiple updates fit in a single log page.
138
Handling the Buffer Pool
• When is buffer written back to disk?
• Steal/No-steal
Can it be written before commit? (steal)
Or does it have to wait till after commit? (no-steal)
• Force/No-force
Is it written “immediately” after commit? (force)
Or can it remain in memory? (no-force)
            NoSteal     Steal
NoForce                 Desired
Force       Trivial
139
Write-Ahead Logging (WAL)
The Write-Ahead Logging Protocol:
Must force the log record for an update before the corresponding data page gets to disk.
Must write all log records for a transaction before commit.
What goes into log:
BFIM (before image) needed for UNDO type algorithms
AFIM (after image) needed for REDO type algorithms
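A toy model of the two WAL rules. The `WalBuffer` class is hypothetical: Python lists and dictionaries stand in for the log disk and the data disk, and log sequence numbers (LSNs) are simple counters.

```python
class WalBuffer:
    """Toy buffer manager that enforces the two write-ahead logging rules."""
    def __init__(self):
        self.log_tail = []    # log records still in memory
        self.log_disk = []    # log records already forced to disk
        self.pages = {}       # page -> value in the buffer pool
        self.page_lsn = {}    # page -> LSN of the last update to that page
        self.data_disk = {}   # page -> value on the data disk

    def update(self, txn, page, new_value):
        lsn = len(self.log_disk) + len(self.log_tail)
        bfim = self.pages.get(page, self.data_disk.get(page))
        # The record carries both the BFIM (undo) and the AFIM (redo).
        self.log_tail.append((lsn, txn, page, bfim, new_value))
        self.pages[page] = new_value
        self.page_lsn[page] = lsn

    def force_log(self, up_to_lsn):
        while self.log_tail and self.log_tail[0][0] <= up_to_lsn:
            self.log_disk.append(self.log_tail.pop(0))

    def write_page(self, page):
        self.force_log(self.page_lsn[page])   # rule 1: log record first
        self.data_disk[page] = self.pages[page]

    def commit(self, txn):
        lsn = len(self.log_disk) + len(self.log_tail)
        self.log_tail.append((lsn, txn, None, None, "commit"))
        self.force_log(lsn)                   # rule 2: all records before commit

buf = WalBuffer()
buf.update("T1", "P1", 8)
buf.update("T1", "P2", 5)
buf.write_page("P1")
after_write = (len(buf.log_disk), len(buf.log_tail))
buf.commit("T1")
after_commit = (len(buf.log_disk), len(buf.log_tail))
```

Writing page P1 forces only the log record for P1; the commit then forces every remaining record, including the commit record itself.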
140
Checkpoints in the System Log
Checkpoint record written in log when all updated DB buffers written out to disk
Any committed transaction occurring before checkpoint in log can be considered permanent (won’t have to be redone after crash)
Actions
  suspend execution of all transactions
  force-write all modified buffers to disk
  write checkpoint entry in log and force write log
  resume transactions
Fuzzy checkpointing
  resume transactions as soon as buffers written
141
142