A concept of dbms

Written by Sourav Mishra

Description: I want to share this from my experience.

Transcript of A concept of dbms


Page 2: A concept of dbms

Unit 1

File management system. Hash addressing. Problem of sharing. Multiple indexing. Redundancy.

Page 3: A concept of dbms

File organization

What is a file management system? The technique used to represent and store records in a file is called file organization. There are three types:
• Sequential file organization.
• Indexed sequential file organization.
• Direct access.

Fundamental characteristics of a file management system:
• Creation: creating a file.
• Updating: insertion, deletion, and modification of records.
• Retrieval: accessing the file. It has two forms: inquiry and report generation.
• Maintenance: restructuring and reorganizing. Restructuring means making structural changes to a file. Reorganizing means changing a file from one file organization to another.

Page 4: A concept of dbms

Types of file organization

Sequential file organization: Records are arranged in either ascending or descending order of a key.
Advantage: the next record can be accessed quickly.
Disadvantage: to access one particular record, the file has to be searched record by record until the key value is found.

Indexed sequential file organization: Records can be accessed both individually and sequentially by the same key value.
Advantage: the index provides random access to records.
Disadvantage: reorganizing the records in the overflow area is expensive.

Direct file organization: The search key value is mapped directly to the storage location.
Advantage: an individual record in a direct (relative) file can be accessed directly.
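The difference between the three organizations can be sketched in a few lines of Python. This is only an illustration under assumptions: the records, the names, and the in-memory lists stand in for a real file, and the helper functions are hypothetical.

```python
from bisect import bisect_left

# Hypothetical records (roll_no, name), kept sorted by roll_no.
records = [(3, "Asha"), (7, "Ravi"), (12, "Meena"), (20, "Kiran")]

def sequential_lookup(key):
    """Sequential organization: scan from the start until the key is found."""
    for roll_no, name in records:
        if roll_no == key:
            return name
    return None

# Indexed sequential organization: a sorted index of keys, parallel to `records`.
index_keys = [roll_no for roll_no, _ in records]

def indexed_lookup(key):
    """Binary-search the index, then jump straight to the record position."""
    pos = bisect_left(index_keys, key)
    if pos < len(index_keys) and index_keys[pos] == key:
        return records[pos][1]
    return None

# Direct organization: the key maps straight to the record (a dict stands in
# for the key-to-address mapping).
direct = {roll_no: name for roll_no, name in records}

def direct_lookup(key):
    return direct.get(key)
```

All three functions return the same record for a given key; what changes is how many records have to be touched to reach it.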

Page 5: A concept of dbms

What is hashing and a hash function?

Hashing is a searching technique that finds the location of a desired record in the hash table using only the given key.

A hash table is a data structure in which a key value is stored at the location obtained by applying the hash function to it.

A hash table is divided into a number of buckets, and each bucket can store a number of records. Each bucket is divided into a number of slots, and each slot can store one record.

The basic idea of hashing is the transformation of a key into a location in a table. This work is done by the hash function. There are three types of hash function:
• Division remainder method.
• Mid-square method.
• Folding method.
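The three methods can be sketched as follows. This is a minimal illustration for integer keys; the function names, the number of middle digits, and the way the key is split into parts are assumptions, since textbooks differ in these details.

```python
def division_remainder(key: int, m: int) -> int:
    """Division remainder method: the address is the remainder of key / m."""
    return key % m

def mid_square(key: int, digits: int = 2) -> int:
    """Mid-square method: square the key and take `digits` middle digits."""
    squared = str(key * key)
    start = max((len(squared) - digits) // 2, 0)
    return int(squared[start:start + digits])

def folding(key: int, part_len: int, m: int) -> int:
    """Folding method: cut the key into parts, add the parts, reduce mod m."""
    text = str(key)
    parts = [int(text[i:i + part_len]) for i in range(0, len(text), part_len)]
    return sum(parts) % m
```

For example, `division_remainder(9932, 100)` gives 32, `mid_square(56)` takes the middle of 3136 and gives 13, and `folding(123456, 2, 100)` adds 12 + 34 + 56 = 102 and gives 2.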

Page 6: A concept of dbms

Use of Hashing and hash function

[Figure: keys S.M, G.M, R.K, and J.P are passed through a hash function and placed in numbered buckets; colliding records go to an overflow area.]

Page 7: A concept of dbms

Comparison of hash functions

The division remainder technique generally gives the best overall performance, so it is the preferred hash function. The mid-square method is suited only to files with a low loading factor, so its overall performance is poorer.

Redundancy: Data redundancy occurs in a database system when a field is repeated in two or more tables. Because of this, data may not be inserted or updated consistently. Reducing redundancy is therefore essential in every database design.

Data redundancy leads to data anomalies and corruption and should generally be avoided by design. Database normalization prevents redundancy and removes the resulting anomalies.

Page 8: A concept of dbms

Hash addressing

Hash addressing: In direct file organization the key value is mapped directly to the storage location.

Key value → Hash function → Address

Advantage: Hash addressing gives direct access through the hash function. How well it works depends on:
1. How evenly the hash function distributes the key values over the locations of the table.
2. The collision resolution technique used.

Disadvantage: The main disadvantage is collision. A collision occurs when two distinct key values are mapped to the same storage location. Collisions are resolved by linear probing or double hashing.

Page 9: A concept of dbms

Linear probing and double hashing

Approach to the problem of collision: When a hash function maps a large set of key values to a small set of addresses, collisions are certain to occur: more than one key value will be mapped to a single location. Collisions are resolved by linear probing or double hashing.

Linear probing: The process of finding a slot in the hash table is called probing. In this method a key whose home address is occupied is moved to the next empty location. It uses the following hash function:
h(k, i) = [h'(k) + i] mod m, where h'(k) = k mod m and i = 0, 1, 2, …, m-1.

Double hashing: Double hashing is a technique used in a hash table to resolve hash collisions. It uses the following hash function:
h(k, i) = [h1(k) + i·h2(k)] mod m, where h1(k) = k mod m and h2(k) = 1 + (k mod m'), with m' slightly smaller than m so that h2(k) is never zero.
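The two probe formulas can be tried out with a small open-addressing table. This is a sketch under assumptions: integer keys, a Python list as the table, and a step of 1 + (k mod m') for double hashing so that the step is never zero. The function names are illustrative.

```python
def probe_sequence(key, m, step):
    """Yield positions (h1(k) + i * step) mod m for i = 0 .. m-1."""
    home = key % m
    for i in range(m):
        yield (home + i * step) % m

def insert(table, key, m_prime=None):
    """Insert key and return its slot.

    With m_prime=None the step is 1 (linear probing); otherwise the step
    is h2(k) = 1 + (k mod m_prime) (double hashing).
    """
    m = len(table)
    step = 1 if m_prime is None else 1 + (key % m_prime)
    for pos in probe_sequence(key, m, step):
        if table[pos] is None:
            table[pos] = key
            return pos
    raise RuntimeError("no empty slot found")

# Keys 10, 17, and 24 all have home address 3 in a table of size 7.
linear = [None] * 7
linear_slots = [insert(linear, k) for k in (10, 17, 24)]

double = [None] * 7
double_slots = [insert(double, k, m_prime=5) for k in (10, 17)]
```

With linear probing the colliding keys fill neighbouring slots (3, 4, 5). With double hashing key 17 uses step 1 + (17 mod 5) = 3 and lands in slot 6, which spreads colliding keys further apart.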

Page 10: A concept of dbms

Problem of sharing files & multiple indexing

In a database, sharing data creates problems when more than one user accesses the same data. An authorized user can update data without the permission of the DBA, and another authorized user who accesses the same file at that time then has difficulty updating it.

Multiple indexing:

Page 11: A concept of dbms

Unit 2

Database and DBMS. Database schema and 3-schema architecture. Data abstraction and data independence. DBMS languages. Database users. Data model. Advantages and disadvantages of DBMS. E-R model. DBMS architecture and data dictionary.

Page 12: A concept of dbms

Database and DBMS

Concept of database: A database is a collection of related data. It is a logically coherent collection of data with inherent meaning, designed and populated with data for a specific purpose. A database may be generated and maintained manually, or it may be computerized.
Ex: A library card catalog is a database that may be created and maintained manually.
Uses: A database is used to store information that is useful to an organization.

DBMS: A database management system is a software system that allows access to the data contained in a database. It allows users to maintain, manage, and use the database. It facilitates the process of defining, constructing, and manipulating data.

Database schema: The description of a database is known as the database schema, which is specified during database design.

Database state or instance: The data in a database at a particular moment in time is called a database state.

Page 13: A concept of dbms

3-schema architecture

Internal schema: The internal schema describes the internal level of the database, that is, the physical storage of the database.
Conceptual schema: It describes the conceptual level, which includes the structure of the whole database for a community of users.
External schema: It describes the external level, such as a user view.

[Figure: User 1, User 2, and User 3 at the top, the conceptual schema below them, and the internal schema at the bottom.]

Page 14: A concept of dbms

Data abstraction and data independence

Data abstraction:
1. It is one fundamental characteristic of the database approach.
2. Data abstraction hides certain details of how data is stored, created, and maintained.
There are several levels of abstraction:
• Physical level: It defines how data is stored.
• Conceptual level: It describes what data is stored.
• View level: It represents different views of the data.

Data independence: Data independence is the capacity to change the schema at one level of a database system without having to change the schema at the next higher level. There are two levels of data independence.

Logical data independence: It is the capacity to change the conceptual schema without having to change the external schemas or application programs. We may change the conceptual schema to expand the database by adding a record type or field, or to reduce the database by removing a record type or field.

Physical data independence: It is the capacity to change the internal schema without having to change the conceptual schema.

Page 15: A concept of dbms

DBMS languages

A DBMS has different languages to describe and work with the database.
1. Extended host languages: The system provides extensions to a host programming language that enable the user to interact with the database.
2. Query languages: They provide more powerful facilities to interact with the database. They are divided into two types:
a) Data definition language (DDL).
b) Data manipulation language (DML).

DDL: The DBMS provides a language called the data definition language, which is used to define the conceptual schema and also gives details about the storage of data on the physical device.

DML: DML involves the following tasks:
1. Retrieving data from the database.
2. Inserting new data into the database.
3. Deleting and modifying existing data.
There are two types of DML.
A. High-level DML: A high-level DML such as SQL (Structured Query Language) can specify and retrieve many records in a single DML statement.
B. Low-level DML: A low-level DML specifies how to retrieve the data, typically one record at a time.
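The split between DDL and DML can be seen in a short session with SQLite through Python's standard `sqlite3` module. The `student` table and its rows are made up for illustration; they are not from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define the schema.
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, course TEXT)"
)

# DML: insert, modify, delete, and retrieve data.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "BCA"), (2, "Ravi", "BCA"), (3, "Meena", "MCA")],
)
conn.execute("UPDATE student SET course = 'MCA' WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")

# High-level DML: one statement retrieves every matching record.
rows = conn.execute(
    "SELECT roll_no, name FROM student WHERE course = 'MCA' ORDER BY roll_no"
).fetchall()
```

The single `SELECT` says what is wanted (all MCA students) and leaves the record-by-record navigation to the DBMS, which is what makes it a high-level DML.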

Page 16: A concept of dbms

Database users

1. Actors on the scene: Many persons are involved in the design, use, and maintenance of a database. The people whose jobs involve the day-to-day use of a large database are known as actors on the scene.

The actors are:

Database administrator (DBA): The user who controls the centralized database system is called the DBA.
A. In any organization where many persons use the same resources, there is a need for a chief administrator to oversee and manage these resources.
B. In a database environment, the primary resource is the database itself and the secondary resource is the DBMS and related software. All these resources are the responsibility of the DBA.

Database designer: It is the responsibility of the database designer to communicate with all prospective database users in order to understand their requirements and to come up with a design that meets these requirements.

End users: End users are the people whose jobs require access to the database for querying, updating, and generating reports. The four types of end user are casual end users, parametric end users, sophisticated end users, and stand-alone users.

Application programmers: They are the users responsible for writing application programs in a programming language (C, C++, Java, etc.).

Page 17: A concept of dbms

Database users & data model

2. Workers behind the scene: Some persons are associated with the design, development, and operation of the DBMS software and system environment. These persons are typically not interested in the database itself. They are known as workers behind the scene.

DBMS designers and implementers: They are the persons who design and implement the DBMS modules and interfaces as a software package.

Tool developers: Tools are software systems that facilitate database system design and use. The persons who design and implement tools are known as tool developers.

Operators and maintenance personnel: They are the system administration personnel responsible for the actual running and maintenance of the hardware and software environment for the database.

Data model: A data model is a collection of concepts that can be used to describe the structure of a database. There are four types of data model: file-based systems, traditional data models, semantic (high-level) data models, and low-level data models.

Page 18: A concept of dbms

Advantages and disadvantages of DBMS

Reduction of redundancy: The main advantage of a DBMS is avoiding duplication of data.

Shared data: A database allows the data under its control to be shared by any number of application programs or users.

Data independence: Data independence is an advantage in a database environment, since it allows changes at one level of the database without affecting other levels.

Security:

Page 19: A concept of dbms

E-R model (entity-relationship model)

The E-R model consists of the following components.

Entity: An entity is a class of persons, places, objects, or events that exist in the real world.

Attribute: Each entity can have a number of characteristics. The characteristics of an entity are called attributes. For ex: name, roll no.

Simple vs. composite: Attributes that can be divided into smaller, independent, meaningful attributes are called composite attributes. Ex: Address is a composite attribute. The age of a person is a simple attribute.

Single-valued vs. multivalued: Most attributes have a single value for a particular entity; such attributes are called single-valued attributes. Ex: the age of a person. An attribute that can hold several values is multivalued. Ex: the color of a dual-color car.

[Figure: composite attribute Address divided into Street and City.]

Page 20: A concept of dbms

E-R model (continued)

Stored vs. derived attribute: For a particular person, age can be determined from the current date and the value of DOB. So age is a derived attribute and DOB is a stored attribute.

Null attribute: An attribute that has a null value is called a null-valued attribute. For ex: the phone no. of a person may be unknown.

Relationship: Relationships of type 1:1, 1:n, and m:n exist among entities.

Key attribute: A key attribute is an attribute that uniquely identifies an entity in an entity set. Ex: emp-code can identify an entity in the entity set employee.

[Figure: example entity pairs for the relationship types: Department and HOD, Father and Children, Customer and Item.]
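The stored vs. derived distinction maps naturally onto a field and a method. The sketch below is illustrative only; the `Person` class and its field names are assumptions, not part of the slides.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Person:
    name: str
    dob: date  # stored attribute: kept in the database

    def age(self, today: date) -> int:
        """Derived attribute: computed from DOB and the given date."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)

asha = Person("Asha", date(2000, 6, 15))
```

Because age is never stored, it cannot go out of date: `asha.age(date(2024, 6, 14))` gives 23 and `asha.age(date(2024, 6, 15))` gives 24.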

Page 21: A concept of dbms

Symbols of the E-R model

Symbol                  Meaning
Rectangle               Entity
Ellipse                 Attribute
Double rectangle        Weak entity
Diamond                 Relationship
Underlined attribute    Key attribute

Page 22: A concept of dbms

DBMS architecture and data dictionary

DBMS architecture: Different abstraction levels: A database is described at three abstraction levels.
A. Internal schema (physical database).
B. Conceptual schema (conceptual database).
C. External schema (view).

Objectives:
A. Support of multiple user views.
B. Use of a schema to store the database description (metadata).

Data dictionary: The data dictionary is also known as the system catalog. It contains all the information about the database structure, that is, it describes all the primary structures of the database. This information is known as metadata.

Page 23: A concept of dbms

Unit 3

Hierarchical data model. DML. Network data model.

Page 24: A concept of dbms

Hierarchical data model

The hierarchical data model (HDM) uses the tree concept to represent data and the relationships among data. There is no original document that describes the HDM; it is derived from IMS (Information Management System). IMS is a hierarchical DBMS used in the banking sector and by private firms.

Relationship: The model has two main structuring concepts: the record and the parent-child relationship (PCR).

Record: A record is a collection of fields. A record type is a collection of similar records.

PCR: A PCR type is a 1:n relationship between two record types. The record type on the 1 side is the parent record type and the record type on the n side is the child record type. An occurrence of a PCR type consists of one record of the parent record type and a number of records of the child record type.

A hierarchical database schema is a collection of hierarchical schemas. A diagrammatic representation of a hierarchical schema is known as a hierarchical diagram. In a hierarchical diagram, when a single parent record type has more than one child record type, the links that represent the PCR types connect the parent to each child.

[Figure: hierarchical diagram with the record types Department, Employee, and Project.]

Page 25: A concept of dbms

Characteristics of HDM

1. Each hierarchical diagram can have only one root record type, and this record type does not have a parent record type.
2. One parent record type may have more than one child record type.
3. A record type that does not have any child record type is called a leaf.
4. Every record type except the root must be connected, as a child, to a PCR type.
5. When one parent record type has more than one child record type, the child record types must be ordered.

Page 26: A concept of dbms

Explanation of relationships

1:1 and 1:n: In a PCR one parent record corresponds to n child records, where n >= 0, so a 1:1 relationship can be represented with the general concept of the HDM. Similarly, a 1:n relationship can be represented.

M:N: When two record types have an M:N relationship with each other, the PCR type concept is not sufficient to represent it. The relationship can be represented by storing child records multiple times, but then the problem arises of storing the same record multiple times.

To solve this problem the HDM takes one of the parent record types as the real parent and treats the rest as virtual parents, which brings in the new concept of the virtual PCR.

[Figure: example relationships. 1:1: Department and Manager. 1:N: Department and Employee. M:N: Employee and Project.]

Page 27: A concept of dbms

Virtual PCR (VPCR)

A VPCR is conceptually the same as a PCR, but they differ in the way they are implemented. A PCR type is implemented by the hierarchical sequence, whereas a VPCR is implemented by a logical pointer to a key or a physical pointer to an address, from the child record type to the parent record type.

What is a pointer? A pointer can be represented in two ways:
1. Logical pointer.
2. Physical pointer.
A logical pointer points from a child record to the key of the parent record, and a physical pointer points from a child record to the address of the parent record.

Page 28: A concept of dbms

Integrity constraints

These are constraints on the database that the database must obey.

• No record except a root record can exist without being related to a parent record. This has three implications:
1. Whenever a parent record is deleted, its corresponding child records are also deleted.
2. Whenever a child record is inserted, it must be linked to its corresponding parent record.
3. Whenever a record related through a virtual PCR is deleted, its corresponding virtual parent record is not deleted.

• A record type that has more than one parent record type can have only one of them as its real parent; the others are virtual parents.

• In general a record type may have any number of virtual parents, but IMS restricts this to only one.
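The first two implications (a child must be linked to a parent on insertion, and deleting a parent deletes its descendants) can be sketched with a small in-memory tree. The `Record` class and the helper functions are hypothetical and do not reflect how IMS stores data.

```python
class Record:
    """One record occurrence in a hierarchy (minimal illustrative model)."""
    def __init__(self, record_type, key, parent=None):
        self.record_type = record_type
        self.key = key
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def insert_child(record_type, key, parent):
    """A non-root record may only be inserted under an existing parent."""
    if parent is None:
        raise ValueError("a child record must be linked to a parent record")
    return Record(record_type, key, parent)

def delete(record):
    """Delete a record and all its descendants; return the deleted keys."""
    deleted = []
    for child in list(record.children):  # copy: children unlink themselves
        deleted.extend(delete(child))
    if record.parent is not None:
        record.parent.children.remove(record)
    deleted.append(record.key)
    return deleted

dept = Record("Department", "D1")            # root record
emp = insert_child("Employee", "E1", dept)
proj = insert_child("Project", "P1", dept)
worker = insert_child("Worker", "W1", proj)

removed = delete(proj)                       # also removes the worker under it
```

After the deletion `removed` holds the keys of the project and its worker, and the department is left with only the employee as a child.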

Page 29: A concept of dbms

Explanation of root segment, dependent segment & links

Typically a hierarchical schema, drawn as a schema diagram, forms a tree data structure. For ex: the schema in the figure can be represented by a tree structure.

Each node represents a record type. A link is the representation of a PCR type in a hierarchical schema diagram.

In the hierarchical data model all record types except the root are dependent segments. In other words, in every PCR type the child record type depends on the parent record type. So the root is a fully independent segment and a leaf is a fully dependent segment.

[Figure: tree with root Dept; below it D loc, D Employee, D Manager, and Project; below Project, P Worker.]

Page 30: A concept of dbms

Network data model

It is also known as the DBTG (Data Base Task Group) model, as it was proposed by CODASYL. The network data model is based on record types and the set construct.

1. Record: A record is a collection of fields. A collection of similar records is called a record type.
2. Set construct: The set construct defines a 1:n relationship between two record types. The record type on the 1 side is known as the owner record type and the record type on the n side is known as the member record type.
3. Bachman diagram: In a Bachman diagram a set type has three parts:
a) Owner record type.
b) Member record type.
c) Name of the set type.

[Figure: Bachman diagram with owner record type Dept, member record type Student, and set type Dept-student.]

Page 31: A concept of dbms

Network data model constructs

Network data model constructs are of two types: structural and behavioral. Behavioral constructs fall into two main categories: insertion options and retention options. The insertion option deals with the rules applied to a member record type when a record is inserted. The retention option specifies how a record behaves when it is inserted, deleted, or updated.

Insertion option: A new record can be inserted in two ways: automatically or manually. With automatic insertion the new record is associated with a set instance as soon as it is inserted; this is maintained by the system.

Retention option: It is of three types.
1. Optional: A record may or may not be related to a set instance.
2. Mandatory: A record cannot exist without being related to an owner record in some set instance.
3. Fixed: Once a record is inserted it must stay with the same owner record; the connection is fixed.

Page 32: A concept of dbms

Unit 4

Keys and types. Integrity rules. Relational algebra. Tuple and domain calculus.

Page 33: A concept of dbms

Keys and types

Keys: A key is a data item that exclusively identifies a record. For ex: account-no, product-id, emp-no, and customer-no are used as key fields because they specifically identify a record stored in a database.

Super key: A super key is a set of one or more attributes whose combined values uniquely identify an entity in an entity set. For ex: in the entity set employee, the set of attributes (emp-name, address) can be considered a super key, provided no two employees share both values.

Primary key: The primary key uniquely identifies each record in a table and must never be the same for two records. Ex: emp-code can be the primary key for the entity emp.

Candidate key: A candidate key is a minimal set of attributes that identifies a record within an entity set.

Secondary key: After the primary key and candidate keys are chosen, the other attributes used to look up records are called secondary keys.

Foreign key: It is a set of fields in one relation that refers to fields in another relation.

Ex: Consider a student table with the following columns.

Roll no    Name    Stu code    Sex    Class

Here the primary key is roll no. The combinations (roll no, name), (roll no, sex), and (roll no, class) are super keys. Name, sex, and class are secondary keys. Stu code is a candidate key.

Page 34: A concept of dbms

Integrity rules

When many users enter data items into a database, it becomes important that the data items and the associations among them are not destroyed. Hence data insertion, updating, etc. have to be carried out in such a way that database integrity is maintained.

Integrity rule 1 (entity integrity): If an attribute of a table is a prime attribute, it cannot accept a null value. In other words, the primary key may not be null.

Integrity rule 2 (referential integrity):
1. Referential integrity ensures that a value that appears in one relation for a given set of attributes also appears for a certain set of attributes in another relation.
2. Integrity rule 2 is concerned with the concept of the foreign key.
3. When the primary key of a base table appears as a foreign key in a related table, every value of the foreign key must be the same as some value of the primary key in the base table.
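Both rules can be observed with SQLite through Python's `sqlite3` module. The `dept` and `emp` tables are made up for illustration. Two SQLite-specific details are assumptions of this sketch: foreign keys are enforced only after `PRAGMA foreign_keys = ON`, and the key column is declared `NOT NULL` explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE emp ("
    " emp_code TEXT PRIMARY KEY NOT NULL,"          # rule 1: key may not be null
    " name TEXT,"
    " dept_id INTEGER REFERENCES dept(dept_id))"    # rule 2: foreign key
)
conn.execute("INSERT INTO dept VALUES (10, 'Sales')")
conn.commit()

def try_insert(sql):
    """Run one INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

valid = try_insert("INSERT INTO emp VALUES ('E1', 'Asha', 10)")
null_key = try_insert("INSERT INTO emp VALUES (NULL, 'Ravi', 10)")  # violates rule 1
orphan = try_insert("INSERT INTO emp VALUES ('E2', 'Meena', 99)")   # violates rule 2
```

Only the first insert is accepted: the second has a null primary key, and the third refers to department 99, which does not exist in `dept`.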

Page 35: A concept of dbms

Relational algebra

It is a formal foundation of the relational model. It is used for implementing and optimizing queries in relational database management systems. Its operations are of two types: set-oriented operations and relation-oriented operations.

Set-oriented operations: There are four operations of this type: set union, set intersection, set difference, and Cartesian product.

Relation-oriented operations: Two operations of this type are select and project.

Select: The select operation extracts specific tuples from a relation. The lowercase Greek letter sigma (σ) is used to denote selection. In general, comparisons using relational operators are allowed in the selection condition.

Project: The project operation is a unary operation. It selects the named columns from the table and discards the other columns.
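Select and project can be sketched over a relation held as a list of dictionaries. The `employee` relation and the function names are illustrative assumptions. Since a relation is a set, the projection drops duplicate rows.

```python
def select(relation, predicate):
    """Select: keep the tuples (rows) that satisfy the condition."""
    return [row for row in relation if predicate(row)]

def project(relation, columns):
    """Project: keep only the named columns and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result

employee = [
    {"name": "Asha", "dept": "Sales", "salary": 62000},
    {"name": "Ravi", "dept": "Sales", "salary": 48000},
    {"name": "Meena", "dept": "HR", "salary": 55000},
]

high_paid = select(employee, lambda row: row["salary"] > 50000)
depts = project(employee, ["dept"])
```

Here `high_paid` contains the rows for Asha and Meena, and `depts` contains one row each for Sales and HR even though Sales appears twice in the relation.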

Page 36: A concept of dbms

Codd's rules

E. F. Codd, who introduced the relational model in 1970, later formulated twelve rules for an RDBMS (published in 1985). The twelve rules have the following main points:
1. Information representation.
2. Guaranteed access.
3. Systematic treatment of null values.
4. Database description rule.
5. Comprehensive data sub-language.
6. View updating.
7. High-level insert, update, and delete.
8. Physical data independence.
9. Logical data independence.
10. Integrity rule.
11. The distribution rule.
12. Non-subversion.

Page 37: A concept of dbms

Tuple and domain calculus

Relational calculus: Tuple calculus and domain calculus are collectively referred to as relational calculus.

Domain calculus: In domain calculus an expression is of the form {x : f(x)}, where f is a formula on x and x represents a set of domain variables.

Tuple calculus: In tuple relational calculus (TRC) a variable represents a tuple from a specified relation. An expression has the form {t : cond(t)}, where t is a tuple variable and cond(t) is a conditional expression involving t.

Ex: Find all employees whose salary is above 50,000.
Ans: {t : employee(t) and t.salary > 50,000}. Here employee(t) specifies that the range relation of the tuple variable t is employee.

Tuple: A tuple is a row of a table.
Domain: A domain is the set of permitted values of an attribute (column) of a table.
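A Python set comprehension reads almost the same as the tuple calculus expression in the example. The `Employee` type and the sample rows are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical employee relation: each element is one tuple (row).
Employee = namedtuple("Employee", ["name", "salary"])
employee = {
    Employee("Asha", 62000),
    Employee("Ravi", 48000),
    Employee("Meena", 55000),
}

# {t : employee(t) and t.salary > 50,000}
result = {t for t in employee if t.salary > 50000}
```

The `for t in employee` part plays the role of employee(t), the range of the tuple variable, and the `if` part is the condition cond(t).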

Page 38: A concept of dbms

Unit 5

Anomalies. Functional dependency. Closure and axiom rules. Normalization and types. BCNF and database security. Concurrency operation.

Page 39: A concept of dbms

Anomalies & F.D.

Anomalies: The aim of a database system is to reduce redundancy, meaning that information is stored only once. Storing information several times wastes storage space and increases the total size of the data stored. Updates to a database with such redundancy can leave it inconsistent.

Functional dependency: F.D.s are relationships among the sets of attributes within a relation. An F.D. between two sets of attributes A and B is denoted by A → B. There are different types of F.D.

Full functional dependency: When all non-key attributes depend on the whole key, and not on only a part of it, the dependency is called a full functional dependency. In the following example the non-key attributes (name, address, age, course) depend on the key attribute roll no.

Roll no    Name    Address    Age    Course

Page 40: A concept of dbms

Functional dependency (continued)

Partial dependency: A partial dependency exists when some non-key attributes depend on only a part of the key rather than on the whole key.

Transitive dependency: When one non-key attribute depends on another non-key attribute, it is called a transitive dependency.

Roll no    Name    Address    Age    Course    Date of join

S no    Origin    Destination    Distance

Page 41: A concept of dbms

Closure and axiom rules of F.D.

Multivalued dependency: Multivalued dependencies are a consequence of first normal form. F.D.s are also referred to as equality-generating dependencies, and multivalued dependencies are referred to as tuple-generating dependencies.

Closure of F.D.: The closure of a set F of functional dependencies is the set of all dependencies that includes F as well as all dependencies that can be inferred from F.

Six rules are known as the axiom rules.
1. Reflexive rule. If x
2. Augmenting rule.
3. Transitive rule.
4. Decomposition rule.
5. Union rule.
6. Pseudo-transitive rule.
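In practice the closure of F is rarely listed in full. A common shortcut is to compute the closure of a set of attributes, that is, every attribute that the set determines under F; the dependency X → Y can be inferred from F exactly when Y lies inside the closure of X. The sketch below assumes each dependency is given as a pair of sets; the attribute names (including `fee`) are made up for illustration.

```python
def attribute_closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies `fds`.

    `fds` is a list of (lhs, rhs) pairs of sets, each meaning lhs -> rhs.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the closure already contains lhs, it also determines rhs.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

fds = [
    ({"roll_no"}, {"name", "course"}),  # roll_no -> name, course
    ({"course"}, {"fee"}),              # course -> fee (hypothetical attribute)
]

from_roll_no = attribute_closure({"roll_no"}, fds)
from_course = attribute_closure({"course"}, fds)
```

Here `from_roll_no` contains all four attributes, so roll_no → fee can be inferred (by the transitive rule) even though it is not written in F, while `from_course` contains only course and fee.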

Page 42: A concept of dbms

Normalization and types

Normalization is the process of efficiently organizing data in a database. The first step of normalization is to convert the E-R model into tables, then to examine the tables for redundancy and, if necessary, change them to a non-redundant form.

The normal forms are used to ensure that various types of anomalies and inconsistencies are not introduced into the database. There are five normal forms.

1st normal form: A relation schema is in first normal form if the values in the domain of each attribute of the relation are atomic. It disallows a set of values, a tuple of values, or a combination of both as an attribute value.

2nd normal form: A relation schema is in second normal form if it is in 1st normal form and all non-prime attributes are fully functionally dependent on the relation key.

3rd normal form: To be in 3rd normal form, the relation must be in 2nd normal form and no transitive dependency may exist within the relation.

4th and 5th normal forms: These are based on the concepts of multivalued dependency and join dependency.

Page 43: A concept of dbms

BCNF & database security

BCNF:
1. When a relation has more than one candidate key, anomalies may result even though the relation is in 3NF.
2. It is based on the concept of the determinant.
3. If a table contains only one candidate key, 3NF and BCNF are equivalent. BCNF can be violated only if the table contains more than one candidate key.

Security: Database security is of two types.
1. System security: System security deals with providing security to the database at the system level. For ex: DBMS checks.
2. Database security: It protects the data at the individual level. For ex: a user with insufficient privileges cannot view a table.

Page 44: A concept of dbms

Concurrent operation

Locking and timestamping are the two main techniques for controlling concurrent operations.

Locking: A data item can be locked by a transaction in order to prevent the data item from being accessed and updated by any other transaction. Locks are of two types.

Exclusive lock: A transaction that wants to modify a data item, and not only read it, can place an exclusive lock on the data item. Hence it is also known as a write lock.

Shared lock: A transaction that only reads a data item and does not modify it can place a shared lock on the data item.

Timestamp ordering: In this method a serial order is created among the concurrent transactions by assigning a unique, increasing number (a timestamp) to each transaction.
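The compatibility of shared and exclusive locks, and the assignment of timestamps, can be sketched as follows. This is a simplified single-process illustration, not a real lock manager: requests are refused rather than queued, lock upgrades are ignored, and the class and function names are assumptions.

```python
from itertools import count

class LockTable:
    """Tracks which transactions hold a shared (S) or exclusive (X) lock."""
    def __init__(self):
        self.locks = {}  # item -> (mode, set of transaction ids)

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if it conflicts."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # any number of readers may share the item
            return True
        return False          # any combination involving X conflicts

    def release(self, txn, item):
        held = self.locks.get(item)
        if held is not None and txn in held[1]:
            held[1].discard(txn)
            if not held[1]:
                del self.locks[item]

# Timestamp ordering: each new transaction gets the next number in sequence.
_next_timestamp = count(1)

def begin_transaction():
    return next(_next_timestamp)
```

Two readers can hold a shared lock on the same item at once, but a writer asking for an exclusive lock is refused until both have released it. Each call to `begin_transaction()` returns a larger number than the one before, which gives the serial order used by timestamp ordering.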