Chapter 2


Page 1: Chapter 2


Chapter 2

Database System Architecture

Page 2: Chapter 2


2.1. Data Models, Schemas and Instances

• Data Model: A set of concepts to describe the structure of a database, and certain constraints that the database should obey.

• Database Schema: The description of a database. Includes descriptions of the database structure and the constraints that should hold on the database.

• Schema Diagram: A diagrammatic display of (some aspects of) a database schema.

• Schema Construct: A component of the schema or an object within the schema, e.g., STUDENT, COURSE.

• Database Instance: The actual data stored in a database at a particular moment in time. Also called database state (or occurrence).

Page 3: Chapter 2


Database Schema Vs. Database State

• Database State: Refers to the content of a database at a moment in time.

• Initial Database State: Refers to the database state when it is first loaded into the system.

• Valid State: A state that satisfies the structure and constraints of the database.

• Distinction:

– The database schema changes very infrequently. The database state changes every time the database is updated.

– Schema is also called intension, whereas state is called extension.
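The schema/state distinction can be sketched with Python's built-in sqlite3 module. This is only an illustration: the STUDENT table comes from the earlier slide, but its columns and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema (intension): defined once and rarely changed.
# The columns are illustrative assumptions.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def schema(conn):
    """Return the stored table definitions (the schema)."""
    return [row[0] for row in conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'")]


def state(conn):
    """Return the current content of the table (the state)."""
    return conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()


schema_before = schema(conn)
state_before = state(conn)          # no rows yet

# State (extension): changes with every update.
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Abebe')")
conn.execute("INSERT INTO student (id, name) VALUES (2, 'Sara')")

schema_after = schema(conn)
state_after = state(conn)
```

After the two inserts the state differs from the earlier one, while the schema text returned by `schema()` is unchanged.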

Page 4: Chapter 2


2.2. Overview of Data Models

• The main purpose of a data model is to represent data in an understandable way.

Page 5: Chapter 2


Categories of data models

• Conceptual (high-level, semantic) data models: Provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.)

• Physical (low-level, internal) data models: Provide concepts that describe details of how data is stored in the computer.

• Implementation (representational) data models: Provide concepts that fall between the above two, balancing user views with some computer storage details.

Page 6: Chapter 2


Categories of data models …

• Representational data models represent data by using record structures and are hence called record-based data models.

• These include: hierarchical data model, network data model and relational data model

– Hierarchical Model

• The simplest data model

• Record type is referred to as node or segment

• The top node is the root node

• Nodes are arranged in a hierarchical structure as a sort of upside-down tree

• A parent node can have more than one child node

Page 7: Chapter 2

Categories of data models …

• A child node can only have one parent node

• The relationship between parent and child is one-to-many

• A relationship is established by creating a physical link between stored records (each is stored with a predefined access path to other records)

• To add a new record type or relationship, the database must be redefined and then stored in a new form.
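The structural rules of the hierarchical model (one root, many children per parent, exactly one parent per child) can be sketched in a few lines of Python. The class name `Segment` and the COMPANY/DEPARTMENT/EMPLOYEE/PROJECT record types are hypothetical, chosen only for illustration; a real hierarchical DBMS links stored records physically rather than through in-memory objects.

```python
class Segment:
    """A record type (node or segment) in a hierarchy; illustrative only."""

    def __init__(self, name):
        self.name = name
        self.parent = None    # a child has at most one parent
        self.children = []    # a parent may have many children

    def add_child(self, child):
        """Link child under this node, enforcing the one-parent rule."""
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child

    def path_from_root(self):
        """Follow parent links upward; this is the predefined access path."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return path[::-1]


root = Segment("COMPANY")                      # the root node
dept = root.add_child(Segment("DEPARTMENT"))
emp = dept.add_child(Segment("EMPLOYEE"))
proj = dept.add_child(Segment("PROJECT"))

employee_path = emp.path_from_root()
```

Trying to attach `emp` under a second parent raises `ValueError`, which mirrors the rule that a child node can only have one parent.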


Page 8: Chapter 2

Categories of data models …

• ADVANTAGES of Hierarchical Data Model:

– Hierarchical Model is simple to construct and operate on

– Corresponds to a number of natural hierarchically organized domains - e.g., assemblies in manufacturing, personnel organization in companies

– Language is simple; uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT etc.

• DISADVANTAGES of Hierarchical Data Model:

– Navigational and procedural nature of processing

– Database is visualized as a linear arrangement of records

– Little scope for "query optimization"


Page 9: Chapter 2

Categories of data models …

• Network Model

– Allows record types to have more than one parent, unlike the hierarchical model

– A network data model sees records as set members

– Each set has an owner and one or more members

– Does not allow many-to-many relationships between entities directly

– Like the hierarchical model, the network model is a collection of physically linked records

– Allows member records to have more than one owner
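The owner/member idea can be sketched as follows. This is a loose illustration, not a CODASYL implementation: the classes `Record` and `SetOccurrence`, the set names WORKS_IN and ASSIGNED_TO, and the sample records are all assumptions. It shows a member record that belongs to two different sets and therefore has two owners.

```python
class Record:
    """A stored record; 'owners' maps a set name to the owning record."""

    def __init__(self, name):
        self.name = name
        self.owners = {}


class SetOccurrence:
    """One owner record linked to its member records."""

    def __init__(self, set_name, owner):
        self.set_name = set_name
        self.owner = owner
        self.members = []

    def connect(self, member):
        """Link a member; within one set a member has a single owner."""
        if self.set_name in member.owners:
            raise ValueError(f"{member.name} already has an owner in {self.set_name}")
        member.owners[self.set_name] = self.owner
        self.members.append(member)


sales = Record("Sales department")
alpha = Record("Project Alpha")
abebe = Record("Abebe")
sara = Record("Sara")

works_in = SetOccurrence("WORKS_IN", owner=sales)
assigned_to = SetOccurrence("ASSIGNED_TO", owner=alpha)

works_in.connect(abebe)
works_in.connect(sara)
assigned_to.connect(abebe)   # same member, second set, second owner

owner_names = sorted(owner.name for owner in abebe.owners.values())
```

Here `abebe` is owned by both the department and the project, something the strict one-parent rule of the hierarchical model would not permit.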

Page 10: Chapter 2

Categories of data models …

– ADVANTAGES of Network Data Model:

• Network Model is able to model complex relationships and represents semantics of add/delete on the relationships.

• Can handle most situations for modeling using record types and relationship types.

• Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET etc. Programmers can do optimal navigation through the database.

– DISADVANTAGES of Network Data Model:

• Navigational and procedural nature of processing

• Database contains a complex array of pointers that thread through a set of records.

• Little scope for automated "query optimization"

Page 11: Chapter 2

Categories of data models …

• Relational Data Model

– Developed by Dr. Edgar Frank Codd in 1970 (famous paper, 'A Relational Model of Data for Large Shared Data Banks')

– Terminology originates from the branch of mathematics called set theory and relations

– Can define more flexible and complex relationships

– Viewed as a collection of tables called “Relations”, equivalent to a collection of record types

– Relation: two-dimensional table

– Stores information or data in the form of tables: rows and columns

– A row of the table is called a tuple, equivalent to a record

– A column of a table is called an attribute, equivalent to a field

– Data value is the value of the attribute

– Records are related by the data stored jointly in the fields of records in two tables or files. The related tables contain information that creates the relation

Page 12: Chapter 2

Categories of data models …

• The tables seem to be independent but are related somehow.

• No physical consideration of the storage is required by the user

• Many tables are merged together to come up with a new virtual view of the relationship

Alternative terminologies (each row lists equivalent terms)

Relation     Table     File
Tuple        Row       Record
Attribute    Column    Field

Page 13: Chapter 2

Categories of data models …

• The rows represent records (collections of information about separate items)

• The columns represent fields (particular attributes of a record)

• Conducts searches by using data in specified columns of one table to find additional data in another table

• In conducting searches, a relational database matches information from a field in one table with information in a corresponding field of another table to produce a third table that combines requested data from both tables
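The matching of a field in one table against the corresponding field in another is what SQL calls a join. A minimal sketch with Python's sqlite3 module follows; the `student` and `enrollment` tables, their columns, and the rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE enrollment (student_id INTEGER, course TEXT);
INSERT INTO student VALUES (1, 'Abebe'), (2, 'Sara');
INSERT INTO enrollment VALUES (1, 'Databases'), (2, 'Networks'), (1, 'Algorithms');
""")

# Match student.student_id with enrollment.student_id to produce
# a third (result) table that combines columns from both tables.
combined = conn.execute("""
    SELECT s.name, e.course
    FROM student AS s
    JOIN enrollment AS e ON s.student_id = e.student_id
    ORDER BY s.name, e.course
""").fetchall()
```

The result table exists only as the answer to the query; neither stored table is modified.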

Page 14: Chapter 2


2.3. Architecture and Data Independence

• The three-schema architecture defines DBMS schemas at three levels:

• Internal schema at the internal level to describe physical storage structures and access paths. Typically uses a physical data model.

• Conceptual schema at the conceptual level to describe the structure and constraints for the whole database for a community of users. Uses a conceptual or an implementation data model.

• External schemas at the external level to describe the various user views. Usually uses the same data model as the conceptual level.

Page 15: Chapter 2

Architecture and Data Independence …

Three-level ANSI-SPARC Architecture of a Database

Page 16: Chapter 2


Architecture and Data Independence …

Mappings among schema levels are needed to transform requests and data. Programs refer to an external schema, and are mapped by the DBMS to the internal schema for execution.

Page 17: Chapter 2


Architecture and Data Independence …

Data Independence

• Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs.

• Physical Data Independence: The capacity to change the internal schema without having to change the conceptual schema.

Page 18: Chapter 2


Architecture and Data Independence …

Data Independence

When a schema at a lower level is changed, only the mappings between this schema and higher-level schemas need to be changed in a DBMS that fully supports data independence. The higher-level schemas themselves are unchanged. Hence, the application programs need not be changed since they refer to the external schemas.
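A small sqlite3 sketch can make both kinds of independence concrete. It is only an approximation of the three-level architecture: a SQL view stands in for an external schema, an index stands in for an internal-level access path, and the table, view, and index names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
INSERT INTO student VALUES (1, 'Abebe', 'CS'), (2, 'Sara', 'Math');
CREATE VIEW student_names AS SELECT id, name FROM student;
""")


def application_query(conn):
    """Stands in for an application program; it refers only to the view."""
    return conn.execute("SELECT name FROM student_names ORDER BY id").fetchall()


before = application_query(conn)

# Physical data independence: add an access path at the internal level.
conn.execute("CREATE INDEX idx_student_name ON student (name)")
after_index = application_query(conn)

# Logical data independence: restructure the base table, then update only
# the mapping (the view definition). The application query is untouched.
conn.executescript("""
DROP VIEW student_names;
ALTER TABLE student RENAME TO learner;
CREATE VIEW student_names AS SELECT id, name FROM learner;
""")
after_restructure = application_query(conn)
```

In all three cases `application_query` runs unchanged and returns the same rows, because only the lower-level definitions and the mapping were altered.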

Page 19: Chapter 2

2.4. Database Languages and Interfaces

• A variety of users are supported by DBMS

• The DBMS must provide appropriate languages and interfaces for each category of users

Database Languages

– Data Definition Language (DDL)

• Allows DBA or user to describe and name entities, attributes and relationships required for the application.

• Specification notation for defining the database schema


Page 20: Chapter 2

Database Languages and Interfaces …

– Data Manipulation Language (DML)

• Provides basic data manipulation operations on data held in the database.

• Language for accessing and manipulating the data organized by the appropriate data model

• DML is also known as query language

• There are two main types of DMLs

– Procedural DML: user specifies what data is required and how to get the data.

– Non-Procedural DML: user specifies what data is required but not how it is to be retrieved
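The difference between DDL and DML, and between the two DML styles, can be sketched with sqlite3. The `course` table and its rows are assumptions. SQL itself is a non-procedural DML, so the "procedural" half below only imitates record-at-a-time processing with a Python loop; it is not a real procedural DML such as those of the hierarchical or network models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema.
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT, credits INTEGER)")

# DML: manipulate the data held in the database.
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [("CS101", "Intro to Programming", 3),
     ("CS202", "Databases", 4),
     ("MA101", "Calculus", 4)],
)

# Non-procedural: state WHAT is wanted; the DBMS decides how to retrieve it.
declarative = [row[0] for row in conn.execute(
    "SELECT title FROM course WHERE credits = 4 ORDER BY code")]

# Procedural style: spell out HOW, visiting one record at a time.
procedural = []
for code, title, credits in conn.execute(
        "SELECT code, title, credits FROM course ORDER BY code"):
    if credits == 4:
        procedural.append(title)
```

Both lists contain the same titles; only the amount of retrieval logic written by the user differs.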

 


Page 21: Chapter 2

Database Languages and Interfaces …

DBMS Interfaces

• User friendly interfaces provided by DBMSs include:

– Menu-Based Interfaces for Web Clients or Browsing

– Forms-Based Interfaces

– Graphical User Interfaces

– Natural Language Interfaces

– Speech Input and Output

– Interfaces for Parametric Users

– Interfaces for the DBA


Page 22: Chapter 2

• Properties of Relational Databases

– Each row of a table is uniquely identified by a PRIMARY KEY composed of one or more columns

– Each tuple in a relation must be unique

– ENTITY INTEGRITY RULE of the model states that no component of the primary key may contain a NULL value.

– A column or combination of columns that matches the primary key of another table is called a FOREIGN KEY used to cross-reference tables

– REFERENTIAL INTEGRITY RULE of the model states that, for every foreign key value in a table there must be a corresponding primary key value in another table in the database or it should be NULL
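The two integrity rules can be tried out with sqlite3. The `department` and `employee` tables are assumptions. Two SQLite-specific details are also assumed here: foreign key checking must be switched on per connection with `PRAGMA foreign_keys = ON`, and a non-integer primary key needs an explicit NOT NULL for SQLite to reject NULL key values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite leaves this off by default
conn.executescript("""
CREATE TABLE department (
    dept_code TEXT PRIMARY KEY NOT NULL,
    name      TEXT NOT NULL
);
CREATE TABLE employee (
    emp_code  TEXT PRIMARY KEY NOT NULL,
    name      TEXT NOT NULL,
    dept_code TEXT REFERENCES department (dept_code)   -- foreign key
);
INSERT INTO department VALUES ('D10', 'Sales');
""")


def try_insert(row):
    """Attempt to insert an employee row; report whether the DBMS accepted it."""
    try:
        with conn:   # commits on success, rolls back on error
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = {
    "valid foreign key":     try_insert(("E1", "Abebe", "D10")),
    "NULL foreign key":      try_insert(("E2", "Sara", None)),
    "dangling foreign key":  try_insert(("E3", "Kebede", "D99")),
    "duplicate primary key": try_insert(("E1", "Duplicate", "D10")),
    "NULL primary key":      try_insert((None, "NoKey", "D10")),
}
```

Under these assumptions the first two inserts are accepted (a NULL foreign key is allowed by the referential integrity rule), and the last three are rejected.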

Page 23: Chapter 2

– All tables are LOGICAL ENTITIES

– A table is either a BASE TABLE (Named Relation) or VIEW (Virtual Relation)

– Only Base Tables physically store data

– VIEWS are derived from BASE TABLES with SQL instructions like: [SELECT .. FROM .. WHERE .. ORDER BY]

– Order of rows and columns is immaterial

– Entries with repeating groups are said to be un-normalized

– Entries are single-valued

– Each column (field or attribute) has a distinct name

– All values in a column represent the same attribute and have the same data format
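That a view stores no data of its own, and is recomputed from its base table, can be checked with a short sqlite3 sketch. The `student` table, the `second_years` view, and the rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, year INTEGER);
INSERT INTO student VALUES (1, 'Abebe', 1), (2, 'Sara', 2);

-- The view is only a stored SELECT over the base table.
CREATE VIEW second_years AS
    SELECT id, name FROM student WHERE year = 2;
""")

view_before = conn.execute("SELECT name FROM second_years ORDER BY name").fetchall()

# Only the base table is updated; the view is never written to.
conn.execute("INSERT INTO student VALUES (3, 'Kebede', 2)")

view_after = conn.execute("SELECT name FROM second_years ORDER BY name").fetchall()
```

The new row appears in the view immediately, because the view's contents are derived from the base table each time it is queried.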

Page 24: Chapter 2


• The building blocks of the relational data model are:

– Entities: real world physical or logical object

– Attributes: properties used to describe each entity or real world object.

– Relationships: the associations between entities

– Constraints: rules that should be obeyed while manipulating the data.