Institute of Southern Punjab, Multan - WordPress.com · 2017-12-06 · Academic Honesty Your work...
Institute of Southern Punjab, Multan
Mr. Muhammad Nouman Farooq
BSC-H (Computer Science)
MS (Telecomm. and Networks)
Honors:
Magna Cum Laude Honors Degree
Gold Medalist!
Blog Url: noumanfarooqatisp.wordpress.com
E-Mail: [email protected]
2
Classification
Assignments & Quizzes 6% (6 Marks in Total)
Project & Presentation 9% (9 Marks)
Midterm Exams 25% (25 Marks)
Final Exam 50% (50 Marks)
Total 100 %
NOTE: The remaining 10% (10 Marks) will be given automatically by the Examination Department on the basis of attendance.
Grading Policy
Academic Honesty
Your work in this class must be your own
If students are found to have collaborated excessively or to
have cheated (e.g. by copying or sharing answers during an
examination), all involved will at a minimum receive grades of
“0” for the first violation
Further violations will result in failure in the course
3
Attendance will be taken within the first 15 minutes of class.
Students are allowed to enter the class within the first 30 minutes.
Zero tolerance policy on attendance and class discipline during lectures!
Assignments must be submitted on time; no late submissions.
If an assignment is copied, both parties will be given zero!
Projects, Presentation, Quizzes, Assignments, Class participation are very
important.
Don’t miss your Classes, Quizzes, Presentations, Assignments and
Projects!
Few Things to Remember!!
4
5
Advance Database Systems
Lecture# 1
Overview of RDBMS
Recommended Books/Manuals
Modern Database Management - 8th Edition by Jeffrey A. Hoffer, Prescott and McFadden
Fundamentals of Database Systems, By R. Elmasri and S. Navathe, 6th Edition, 2011
Database Systems-A Practical Approach to Design, Implementation and Management by
Thomas Connolly and Carolyn Begg, 4th Edition
Database Systems – Design, Implementation, and Management-Carlos, Steven and Peter
(9th Edition)
A Guide To MySQL 8th Edition by Philip J. Pratt & Mary Z. Last
A guide to Advance MySQL latest Edition by Muhammad Nouman Farooq
6
Lecture 1: Overview of RDBMS
Database Definitions and Advantages of RDBMS
Hierarchical Structure of DBMS
Introduction to File Processing Systems
Advantages of Database Approach
System Development Life Cycle (SDLC)
Database Development Process (DDP)
The Three-Level ANSI-SPARC Architecture
Relational Data Structure
Relational Keys
7
8
Jonathan Abrams: Friendster failed because its database system
was not optimized and efficient.
Mark Zuckerberg: Facebook did not fail because its database system
was optimized and efficient.
9
Database Definitions and Advantages
of RDBMS
10
Database Definitions
An organized collection of information in computerized format.
A collection of information organized and presented to serve a
specific purpose
11
Continued…
The term database is often incorrectly used as a synonym for
“database management system (DBMS)”.
12
Continued…
Each of the definitions given above is correct; they describe a
database from slightly different perspectives.
From the exam point of view, any one of them will do. However, within
this course we will refer to the first of the above
definitions most often, and the concepts discussed in the
definitions, such as logically related data and shared collection,
should be clear.
13
14
Data and Information
Data is a collection of raw facts gathered from a specific
environment for a specific purpose.
Data by itself does not show anything about its environment, so to
get the desired results from the data we transform it into
information by applying certain processing to it.
Once data has been processed by different methods, it is
converted into a meaningful form, and that form of the data is called
information.
15
Continued…
16
Continued…
If we consider the data in the previous figure without the titles or
labels associated with it (EmpName, age, salary), it is not
very useful.
However, once these labels are attached the data carries some meaning,
and this meaningfulness increases further when we associate other
labels with it, such as the company name and the department name.
So this is a very simple example of the processing that we can do on
data to turn it into information.
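The idea above can be sketched in a few lines of Python. This is only an illustration: the labels EmpName, age and salary come from the slide, while the sample values, the company name and the department name are made up for the example.

```python
# Raw data: values with no labels (hypothetical sample values).
raw_rows = [
    ("Ali", 30, 45000),
    ("Sara", 27, 52000),
]

# Labels taken from the slide; attaching them is the "processing" step.
labels = ("EmpName", "age", "salary")


def to_information(row, context):
    """Attach column labels and shared context to one raw row."""
    record = dict(zip(labels, row))
    record.update(context)
    return record


# Extra context (assumed names) makes the records more meaningful still.
context = {"company": "Example Co.", "department": "Accounts"}
records = [to_information(row, context) for row in raw_rows]

for record in records:
    print(record)
```

The tuple ("Ali", 30, 45000) says little on its own; the labelled record tells us who earns what, and where.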
17
18
Continued…
Database Systems
The term database system refers to the combination of a database and
the Database Management System (DBMS).
So, the database is a collection of meaningful data/information,
the DBMS is a tool to manage this data, and the two together are called
a database system.
19
Why Use Models?
Models can be useful when we want to examine or manage part of
the real world.
The costs of using a model are often considerably lower than the
costs of using or experimenting with the real world itself.
Examples:
Network Model Simulation for an Organization in Packet Tracer
World Map
ERD for Relational Database Systems
20
Hierarchical Structure of DBMS
Flat File Vs Relational Database
System
39
Flat File Vs Relational Database
System
A flat file database is a database designed around a single table.
The flat file design puts all database information in one table, or
list, with fields to represent all parameters.
A flat file may contain many fields, often with duplicate data, which
makes it prone to data corruption.
40
Continued…
If you decide to merge data between two flat files, you need to
copy and paste the relevant information from one file to the other.
There is no automation between flat files.
If you have two or more flat files that contain client addresses, for
example, and a client moves, you have to manually modify
the address in each file that contains that client’s
information.
41
Relational Database System
A relational database, on the other hand, incorporates multiple
tables with methods for the tables to work together.
Through these relationships, table data can be ordered, merged
and displayed in database forms.
These tables are to be normalized up to Third Normal Form (3NF).
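The client-address problem from the flat file slides can be sketched with two related tables. This is a minimal illustration, not the course's own code: it uses Python's built-in sqlite3 module in place of MySQL, and the table names, column names and values are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (
        clientId INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        address  TEXT NOT NULL
    );
    CREATE TABLE invoice (
        invoiceId INTEGER PRIMARY KEY,
        clientId  INTEGER NOT NULL REFERENCES client(clientId),
        amount    REAL NOT NULL
    );
    INSERT INTO client VALUES (1, 'Ahmed', 'Old Street, Multan');
    INSERT INTO invoice VALUES (10, 1, 500.0);
    INSERT INTO invoice VALUES (11, 1, 750.0);
""")

# The client moves: one UPDATE in one place, instead of editing every file.
conn.execute(
    "UPDATE client SET address = ? WHERE clientId = ?",
    ("New Road, Multan", 1),
)

# Every invoice sees the new address through the join.
rows = conn.execute("""
    SELECT i.invoiceId, c.name, c.address
    FROM invoice AS i JOIN client AS c ON c.clientId = i.clientId
    ORDER BY i.invoiceId
""").fetchall()
for row in rows:
    print(row)
```

Because the address is stored once in the client table, there is no second copy that could be forgotten and left out of date.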
42
Introduction to File Processing Systems
43
File Processing System
44
Continued…
47
Advantages of Database Approach
48
Database System Environment
It is helpful to repeat our database definition here:
“A database is a shared collection of logically related data,
designed to meet the information needs of multiple users in an
organization.”
A typical database system environment is shown in the figure on the
next slide.
49
Continued…
50
Continued…
The figure shows different subsystems or applications in an
educational institution, such as the library system, examination system,
and registration system. There is a separate application
program for every application or subsystem.
However, the data for all applications is stored in the same place,
in the database, and all application programs, the relevant data and
the users are managed by the DBMS.
51
1-Data Sharing
The data for different applications or subsystems is kept in the
same place. This brings the major benefit of data sharing: data
that is common to different applications need not be
stored repeatedly, as was the case in the file processing
environment.
For example, all three systems of the educational institution shown
in the figure need to store data about students.
Continued…
Data such as registration number, name, address and father's
name, which is common to the different applications, is
stored repeatedly in the file processing environment,
whereas in the database environment it is stored just once
and shared by all applications.
2-Data Independence
Data and programs are independent of each other, so a change in
one has little or no effect on the other.
3-Controlled Redundancy
This means that we do not need to duplicate data
unnecessarily; we do duplicate data in databases,
but this duplication is deliberate and controlled.
4-Data Integrity
A very important feature: it means the validity of the data being entered into
the database. Since the data is kept in a central place and
managed by the DBMS, the DBMS provides a very helpful way to check
that the data being entered into the database is actually valid.
We can also define it as the level of correctness of the data.
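A small sketch of data integrity being checked centrally by the DBMS. It uses Python's sqlite3 module rather than MySQL, and the student table, its columns and the age range are assumptions chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        regNo INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        age   INTEGER NOT NULL CHECK (age BETWEEN 15 AND 100)
    )
""")

conn.execute("INSERT INTO student VALUES (1, 'Bilal', 20)")  # valid row

try:
    conn.execute("INSERT INTO student VALUES (2, 'Hina', -5)")  # invalid age
    rejected = False
except sqlite3.IntegrityError as exc:
    rejected = True
    print("rejected by the DBMS:", exc)

count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print("rows stored:", count)
```

The rule is written once, in the table definition, so every application that inserts students is held to it.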
5-Better Data Security
All application programs access data through the DBMS, so the
DBMS can efficiently check which user is
performing which action and accessing which part of the
data.
So, a DBMS is the most effective way to control and maintain
the security of data stored in a database.
56
6-Faster Development of new Applications
The database environment allows faster application
development for many reasons.
One reason is that the database is designed with the
end result in mind, to meet the expectations of the client.
57
7-Better Concurrency Control
Concurrency means access to the database from a number of points simultaneously.
Concurrency control means managing access to the database in such a way that all the data accesses are completed correctly and transparently.
One example of controlled concurrency is the use of an ATM for withdrawal of money (cash).
All ATMs of a bank are interconnected worldwide to a central database system, so that a user can access his or her account from anywhere in the world and can get cash from any ATM terminal.
58
Continued…
A specific bank has thousands of ATM terminals across the world,
so thousands of users access and process
the bank’s database.
All this processing is managed concurrently by the database
system, and it is done so efficiently that no user
faces any noticeable delay in the processing of their requests.
59
8-Better Backup and Recovery Facility
Sometimes a database is in use, and very
important transactions have been made since the last backup, when
suddenly some disastrous situation makes the database
crash (improper shutdown, invalid disk access, etc.).
In such a situation the database management system should
be able to recover the database to a consistent state so that the
transactions made after the last backup are not lost.
60
System Development Life Cycle (SDLC)
Database Development Process (DDP)
64
Database Development Process
The database application development process includes the following stages or steps:
1. Preliminary Study
2. Requirement Analysis
3. Database Design
4. Physical Design
5. Implementation
6. Maintenance
65
Continued…
66
Preliminary Study
The design of a database is carried out in a number of steps; these steps
play an important role in the design process and need to be given
proper attention.
The first phase of the database development process is the Preliminary
Study, which is based on a proper study of the system.
It means that all parts of the system, or the section of the
subject organization for which we intend to develop the system,
must be studied.
67
Continued…
We should find out how the different sections of the
organization relate to and interact with each other, and understand how
information flows between them.
Moreover, it should also be made clear what processing is
performed at each stage of the system.
68
Requirement Analysis
Once we have investigated the different sections of the organization
and the way data flows between them, a detailed study of the
system is started to find out the requirements of each section.
In this second phase, the system and its functionality are studied in
detail; the decisions made at this stage determine the overall activity
of the organization.
The requirements of one section of the organization are fulfilled in such a
way that all the sections in the organization support each
other.
For example, the results produced by the processing
taking place in one section are used as input for another section.
Database Design
The third phase in the database development process is the database
design.
This is a technical phase of the process and needs considerable skill
from the Database Administrator. This is the phase where the conceptual
design of the database is created. It is also known as Conceptual Data
Modeling.
Entities are identified and given attributes, relationships are built,
and the different types of relationship are expressed by assigning
cardinalities.
Physical Design
In the fourth phase, the choice of DBMS is made on the basis of the
requirements and the environment in which the system will operate.
Choosing the specific DBMS on which to implement the database is very important
because it involves a major financial investment by the
organization, and it cannot be reverted if the selected DBMS is not
capable of providing the desired efficiency.
Also, meaningful data types, constraints/limitations, and
denormalization of data where the real-world
scenario/system requires it are finalized in this phase.
71
Implementation
This phase is specific to writing the application programs needed to
carry out different activities according to user requirements.
We write the application programs in this phase using MySQL, a relational
DBMS queried with SQL (Structured Query Language), installed as part of
XAMPP (a cross-platform bundle of Apache, MySQL/MariaDB, PHP and Perl).
If all the previous phases of the database development process have been
carried out correctly, the coding can be done without logical
errors.
72
Maintenance
Maintenance means fine-tuning the system and checking that the
designed application systems fulfil the purpose for which
they are meant.
This phase may also involve designing a new application to
enhance the system.
Or an already working application may need to be updated or
modified to remove errors or to add some functionality to the
system.
73
Continued…
All these stages are necessary, and each must be given proper
attention to get a good, properly working system design
and a better working environment.
74
The Three-Level ANSI-SPARC
Architecture
75
Three Level ANSI-SPARC Architecture
The levels form a three-level architecture comprising an External,
Conceptual, and an Internal level.
The way users observe the data is called the External Level (Front
End). The way the DBMS and the Operating System perceive and store
the data (Back End) is the Internal Level.
The Conceptual Level provides both the mapping and the desired
independence between the external and internal levels.
76
Continued…
77
Continued…
78
Continued…
79
The overall Description/Skeleton structure of the database is called the
database schema.
At the highest level, we have multiple external schemas (also called
subschemas) that correspond to different views of the data.
At the conceptual level, we have the conceptual schema, which describes
all the entities, attributes, and relationships together with integrity
constraints.
At the lowest level, we have the internal schema, which is a complete
description of the internal model, containing the definitions of stored
records, the methods of representation, the data fields, and the indexes
and storage structures used. There is only one conceptual schema and one
internal schema per database.
Continued…
80
Continued…
The objective of the three-level architecture is to separate each
user’s view of the database from the way the database is physically
represented. There are several reasons why this separation is
desirable:-
1. Each user should be able to access the same data, but have a
different customized view of the data.
2. Users should not have to deal directly with physical database
storage details.
81
Continued…
4. The internal structure of the database should be unaffected by
changes to the physical aspects of storage, such as the changeover
to a new storage device.
5. The Database Administrator (DBA) should be able to change the
conceptual and database storage structures without affecting the
users’ views.
82
Continued…
A major objective for the three-level architecture is to provide data
independence, which means that upper levels are unaffected by
changes to lower levels.
There are two kinds of data independence: -
1. Logical Data Independence.
2. Physical Data Independence.
83
Continued…
84
Continued…
1. Logical Data Independence: -
Changes to the conceptual schema, such as the
addition or removal of new entities, attributes, or relationships,
should be possible without having to change existing external schemas
or having to rewrite application programs. Clearly, the users for whom
the changes have been made need to be aware of them, but what is
important is that other users should not be.
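Logical data independence can be sketched with a view playing the role of an external schema. This is an illustration under assumptions: sqlite3 stands in for MySQL, and the student table, the library_student view and the added cgpa column are invented names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (regNo INTEGER PRIMARY KEY, name TEXT, address TEXT);
    INSERT INTO student VALUES (1, 'Ayesha', 'Multan');
    -- External schema: what the library application is allowed to see.
    CREATE VIEW library_student AS SELECT regNo, name FROM student;
""")


def library_report(connection):
    """Application code written against the view only."""
    return connection.execute(
        "SELECT regNo, name FROM library_student"
    ).fetchall()


before = library_report(conn)

# Conceptual schema change: a new attribute is added for another user group.
conn.execute("ALTER TABLE student ADD COLUMN cgpa REAL")

after = library_report(conn)
print(before, after)
```

The library program and its view are untouched by the new attribute, which is what the slide means by other users not needing to be aware of the change.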
85
Continued…
2. Physical Data Independence: -
Changes to the internal schema, such as using
different file storage structures, using different storage devices should
be possible without having to change the conceptual or external
schemas.
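Physical data independence can be sketched the same way. Here the internal-level change is the addition of an index; the query text used by the application stays exactly the same. sqlite3 is used in place of MySQL, and the table, column and index names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (regNo INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ayesha", "Multan"), (2, "Omar", "Lahore"), (3, "Zara", "Multan")],
)

# The application's query is fixed; it never mentions storage structures.
QUERY = "SELECT name FROM student WHERE city = ? ORDER BY name"
before = conn.execute(QUERY, ("Multan",)).fetchall()

# Internal schema change: add an index, a different access structure.
conn.execute("CREATE INDEX idx_student_city ON student (city)")

after = conn.execute(QUERY, ("Multan",)).fetchall()
print(before == after, after)
```

The DBMS may now answer the query through the index, but neither the conceptual schema nor the application had to change.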
86
Relational Data Structure
87
Relational Data Structure
Relational Keys
91
Relational Keys
Keys are used to create relationships among different database tables.
An entity type may have many instances, from a few to several thousand and even more.
When we want to pick one particular instance out of many, as we often do, a key is the solution.
For example, think of the whole population of Pakistan, with the data of all Pakistanis held in one place, say with NADRA. If at some time we need to identify a particular person out of all this data, how can we do that?
92
Continued…
While defining an entity, we generally also define the key of that
entity. How do we select the key? From the study of the real-world
system: often key attribute(s) already exist there; when they do
not, the designer has to define one.
A key can be simple, that is, consisting of a single attribute, or it
can be composite, consisting of two or more attributes.
93
94
Super Key
Candidate Key
A super key for which no proper subset is a super key is called a candidate
key; in other words, a minimal super key is a candidate key.
This means there are two conditions for a candidate key. First, it
identifies the entity instances uniquely, as is required of a
super key. Second, it should be minimal, that is, no proper subset
of the candidate key is a key.
95
Continued…
So, if we have a simple super key, that is, one that consists of a single
attribute, it is definitely a candidate key.
If we have a composite super key, and removing any attribute from it leaves
something that is no longer a super key, then that composite
super key is also a candidate key, since it is a minimal super key.
For example, one of the super keys that we identified for the entity
STUDENT is “regNo, name”. This super key is not a candidate key: if
we remove the name attribute from this combination, the regNo attribute
alone still identifies the entity instances uniquely, so the combination is not minimal.
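The two conditions can be checked mechanically on a sample of rows. The sketch below is only an aid to understanding: the STUDENT rows are made up (only regNo and name come from the slides), and a check on sample data can show that a set of attributes is not a key, but it cannot prove that it is one for every possible instance.

```python
from itertools import combinations

# Hypothetical STUDENT instance; only regNo and name come from the slides.
students = [
    {"regNo": 1, "name": "Ali", "city": "Multan"},
    {"regNo": 2, "name": "Ali", "city": "Lahore"},
    {"regNo": 3, "name": "Sana", "city": "Multan"},
]


def is_super_key(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def is_candidate_key(rows, attrs):
    """A super key none of whose proper subsets is a super key."""
    if not is_super_key(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_super_key(rows, subset):
                return False
    return True


print(is_super_key(students, ("regNo", "name")))
print(is_candidate_key(students, ("regNo", "name")))
print(is_candidate_key(students, ("regNo",)))
print(is_super_key(students, ("name",)))
```

On this sample, “regNo, name” is a super key but not a candidate key, because regNo alone is already unique; name alone is not unique at all.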
96
Primary Key
A candidate key chosen by the database designer to act as key is the
primary key.
An entity type may have more than one candidate key; in that case
the database designer has to designate one of them as the primary key,
since there is always only a single primary key in an entity type.
If there is just one candidate key then obviously it will be
declared as the primary key. The primary key can also be described as the
successful candidate key.
97
Continued…
The relation that holds between super and candidate keys also
holds between candidate and primary keys, that is, every primary
key (PK) is a candidate key and every candidate key is a super key.
A special value that may be associated with any attribute is NULL,
which means “not given” or “not defined”.
A major characteristic of the Primary Key is that it cannot have
the NULL value.
98
Unique Key
A candidate key that identifies a record uniquely but may store a
NULL value is called a Unique Key.
For example, the stdContactNumber attribute in the STUDENT table is a Unique
Key.
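A short sketch of a unique key that accepts NULL. It uses sqlite3 in place of MySQL; stdContactNumber and STUDENT come from the slide, while the other columns and the sample phone number are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        regNo            INTEGER PRIMARY KEY,
        name             TEXT NOT NULL,
        stdContactNumber TEXT UNIQUE   -- unique key: NULL is allowed
    )
""")

conn.execute("INSERT INTO student VALUES (1, 'Ali', '0300-1111111')")
conn.execute("INSERT INTO student VALUES (2, 'Sana', NULL)")  # no number yet
conn.execute("INSERT INTO student VALUES (3, 'Omar', NULL)")  # also accepted

try:
    # Same non-NULL number as student 1: violates uniqueness.
    conn.execute("INSERT INTO student VALUES (4, 'Zara', '0300-1111111')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(duplicate_rejected, count)
```

Missing contact numbers are stored as NULL without complaint, but a repeated real number is refused.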
99
Alternate Key
Candidate keys which are not chosen as the primary key are known
as alternate keys.
For example, we have two candidate keys of EMPLOYEE in figure 2,
regNo and nIdNumber, if we select regNo as PK then the nIdNumber
will be alternate key.
100
Foreign Key
Sometimes the information stored in a relation is linked to the
information stored in another relation.
If one of the relations is modified, the other must be checked, and
perhaps modified, to keep the data consistent.
101
Continued…
Suppose that in addition to Students, we have a second relation:
Enrolled (cId: string, sId: string, cGrade: Text)
The sId field of Enrolled is called a foreign key and refers to
Students.
The foreign key in the referencing relation (Enrolled, in our example)
must match the primary key of the referenced relation (Students),
i.e., it must have the same number of columns and compatible data
types, although the column names can be different.
102
Continued…
103
Continued…
As the figure shows, there may well be some students who are not
referenced from Enrolled
(e.g., the student with sId=50000)
However, every sId value that appears in the instance of the Enrolled
table appears in the primary key column of a row in the Students
table.
104
Continued…
If we try to insert the tuple (55555, Art104, A) into E1, the rule is
violated because there is no tuple in S1 with the id 55555; the
database system should reject such an insertion.
Similarly, if we delete the tuple (53666, Jones, jones@cs, 18, 3.4)
from S1, we violate the foreign key constraint because the tuple
(53666, History105, B) in E1 contains sid value 53666, the sid of the
deleted Students tuple.
The DBMS should disallow the deletion or, perhaps, also delete the
Enrolled tuple that refers to the deleted Students tuple.
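The insert and delete cases above can be tried out with a small sketch. It uses sqlite3 in place of MySQL, which enforces foreign keys only after `PRAGMA foreign_keys = ON`. The Students column names (name, login, age, gpa) and the details of student 50000 are assumptions; the tuples for 53666, History105 and Art104 come from the slides, written here in the column order of the Enrolled schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE Students (
        sId   TEXT PRIMARY KEY,
        name  TEXT,
        login TEXT,
        age   INTEGER,
        gpa   REAL
    );
    CREATE TABLE Enrolled (
        cId    TEXT,
        sId    TEXT REFERENCES Students(sId),
        cGrade TEXT
    );
    INSERT INTO Students VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4);
    INSERT INTO Students VALUES ('50000', 'Sample', 'sample@cs', 19, 3.3);
    INSERT INTO Enrolled VALUES ('History105', '53666', 'B');
""")


def attempt(sql, params=()):
    """Run one statement; report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# No student 55555 exists, so this enrolment must be rejected.
insert_ok = attempt(
    "INSERT INTO Enrolled VALUES (?, ?, ?)", ("Art104", "55555", "A")
)
# Student 53666 is still referenced from Enrolled, so deletion is rejected.
delete_ok = attempt("DELETE FROM Students WHERE sId = ?", ("53666",))
# Student 50000 is not referenced, so deletion is allowed.
unreferenced_ok = attempt("DELETE FROM Students WHERE sId = ?", ("50000",))
print(insert_ok, delete_ok, unreferenced_ok)
```

This shows the "disallow" option. The other option mentioned on the slide, deleting the Enrolled tuple as well, would correspond to declaring the foreign key with ON DELETE CASCADE.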
Continued…
106
Secondary Key
Many times we need to access certain instances of an entity type
using the values of one or more attributes other than the PK.
The difference between accessing instances by the value of a key and by
a non-key attribute is that a search on the value of the PK will always
return a single instance (if it exists), whereas uniqueness is not
guaranteed in the case of a non-key attribute.
107
Continued…
An attribute on which we need to access the instances of an
entity type, and which may not necessarily return a unique instance, is called
a secondary key.
For example, if we want to see how many of our students belong to
Multan, we will access those instances of the STUDENT
entity type that contain “Multan” in their address.
108
Continued…
In this case address will be called secondary key, since we are
accessing instances on the basis of its value, and there is no
compulsion that we will get a single instance.
Keep one thing in mind here, that a particular access on the value of
a secondary key MAY return a single instance, but that will be
considered as chance.
109
Continued…
It is not necessary for a secondary key
to return a unique instance.
But in the case of super, candidate, primary and alternate keys it is
required that they always return a unique instance for a
particular value.
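The difference between access by primary key and access by secondary key can be seen in a small sketch. sqlite3 stands in for MySQL, and the rows are invented; only the STUDENT entity, the address attribute and the value “Multan” come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (regNo INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        (1, "Ali", "Gulgasht, Multan"),
        (2, "Sana", "Model Town, Lahore"),
        (3, "Omar", "Cantt, Multan"),
    ],
)

# Access by primary key: at most one instance.
by_pk = conn.execute(
    "SELECT name FROM student WHERE regNo = ?", (2,)
).fetchall()

# Access by secondary key (address): any number of instances.
by_secondary = conn.execute(
    "SELECT name FROM student WHERE address LIKE ? ORDER BY regNo",
    ("%Multan%",),
).fetchall()

print(by_pk, by_secondary)
```

The search on regNo returns exactly one student, while the search on address returns every student whose address contains Multan.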
110
Surrogate Key:
A Surrogate Key is a column, usually an artificial identifier,
that is declared as the primary key in place of a composite
primary key whose many attributes jointly make a cumbersome
key (cumbersome meaning large and unwieldy).
An example of a cumbersome key and a surrogate key
is shown on the next slide.
Continued…
112
b) Mapping the ternary relationship PATIENT TREATMENT
Remember that the primary key MUST be unique.
This is why treatment date and time are included in the composite primary key.
But this makes a very cumbersome key…
It would be better to create a surrogate key like treatmentId
in the PATIENT_TREATMENT table.
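A sketch of the surrogate key idea, using sqlite3 in place of MySQL. PATIENT_TREATMENT and treatmentId come from the slide; the other column names (patientId, physicianId, treatmentCode, treatmentDate, treatmentTime) and the sample values are assumptions about what the cumbersome composite key contains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PATIENT_TREATMENT (
        treatmentId   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        patientId     INTEGER NOT NULL,
        physicianId   INTEGER NOT NULL,
        treatmentCode TEXT    NOT NULL,
        treatmentDate TEXT    NOT NULL,
        treatmentTime TEXT    NOT NULL,
        -- the former composite key stays unique, but is no longer the PK
        UNIQUE (patientId, physicianId, treatmentCode,
                treatmentDate, treatmentTime)
    )
""")

INSERT = """
    INSERT INTO PATIENT_TREATMENT
        (patientId, physicianId, treatmentCode, treatmentDate, treatmentTime)
    VALUES (?, ?, ?, ?, ?)
"""
first_id = conn.execute(
    INSERT, (1, 7, "XRAY", "2017-12-06", "09:30")
).lastrowid
second_id = conn.execute(
    INSERT, (1, 7, "XRAY", "2017-12-06", "14:00")
).lastrowid

try:
    # Exactly the same treatment again: still rejected.
    conn.execute(INSERT, (1, 7, "XRAY", "2017-12-06", "09:30"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# One short value now identifies a treatment.
row = conn.execute(
    "SELECT treatmentTime FROM PATIENT_TREATMENT WHERE treatmentId = ?",
    (second_id,),
).fetchone()
print(first_id, second_id, duplicate_rejected, row)
```

Other tables can now refer to a treatment by treatmentId alone instead of carrying all five attributes of the composite key.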
Quick Review of Relational Key Concepts
113
Every Candidate Key must be a Super Key. But not every Super Key is a Candidate Key.
Every Primary Key must be a Candidate Key. But not every Candidate Key is a Primary Key.
Every Alternate Key must be a Candidate Key. But not every Candidate Key is an Alternate Key.
Every Foreign Key must match the Primary Key of the relation it references.
A Secondary Key need not be a Candidate Key, Primary Key or Alternate Key.
A Cumbersome Key (a large composite Primary Key) can be replaced by a Surrogate Key.
Class Activity
114
Class Activity (Solution)
115
Recommended Readings
Chapter 1 from: -
Database Systems-A Practical Approach to Design,
Implementation and Management by Thomas Connolly and
Carolyn Begg, 4th Edition (Page No. 54-67)
Modern Database Management - 8th Edition by Jeffrey A.
Hoffer, Prescott and McFadden (Page No. 34-62)
116
Continued…
Chapter 2 from: -
Modern Database Management – 8th Edition by Jeffrey A. Hoffer,
Prescott and McFadden (Page No. 84-103)
Database Systems-A Practical Approach to Design, Implementation
and Management by Thomas Connolly and Carolyn Begg, 4th Edition
(Page No. 69-114)
Fundamentals of Database Systems – 6th Edition by Elmasri and
Navathe (Page No. 58-83)
117
Chapter 5 from:
Modern Database Management – 8th Edition by Jeffrey A. Hoffer,
Prescott and McFadden (Page No. 220-230)
Chapter 3 from:
Database Systems-A Practical Approach to Design,
Implementation and Management by Thomas Connolly and
Carolyn Begg, 4th Edition (Page No. 120-138)
118
Continued…
Summary of Lecture
119
Lecture 1➦
Database Definitions and Advantages of RDBMS
Hierarchical Structure of DBMS
Introduction to File Processing Systems
Advantages of Database Approach
Summary of Lecture (Continued..)
120
Lecture 1➦
System Development Life Cycle (SDLC)
Database Development Process (DDP)
The Three-Level ANSI-SPARC Architecture
Relational Data Structure
Relational Keys
END OF LECTURE 1
121