Lec01

13/2/2012 DB2 Dr.Suresh Sankaranarayanan 1 DB2 Database Management Systems Introduction to Database Systems

Transcript of Lec01

  • 1. DB2 Database Management Systems: Introduction to Database Systems (13/2/2012, DB2, Dr. Suresh Sankaranarayanan)

2. Textbook Used

  • Database System Concepts by Abraham Silberschatz, Henry F. Korth, and S. Sudarshan, 5th Edition

3. Outline

  • Database Management Systems
  • Data Abstraction
  • Instances and Schemas
  • Data Models
  • Database Languages
  • Database Design
  • Database Architectures
  • History of Database systems

4. Database Management System?

  • A database management system (DBMS) is a collection of interrelated data and a set of programs to access those data.
  • Data: stored representations of meaningful objects and events
    • Structured: numbers, text, dates
    • Unstructured: images, video, documents
  • The collection of data is usually referred to as the database.
  • A DBMS provides an environment that is both convenient and efficient to use.

5. Database Applications

  • Databases are widely used. Some applications are:
    • Banking: all transactions
    • Airlines: reservations, schedules
    • Universities: registration, grades
    • Credit card transactions: purchases on credit cards and generation of monthly statements
    • Telecommunication: keeping records of calls made, generating monthly bills, maintaining balances on prepaid calling cards, storing information about communication networks
    • Finance: storing information about holdings, sales, and purchases of financial instruments such as stocks and bonds
    • Sales: customers, products, purchases
    • Online retailers: order tracking, customized recommendations
    • Manufacturing: production, inventory, orders, supply chain
    • Human resources: employee records, salaries, tax deductions

6. Disadvantages of File Processing

  • Data redundancy and inconsistency: for example, a customer's address and telephone number may be stored both in a file of savings account records and in a file of checking account records. This redundancy leads to higher storage and access costs. In addition, a changed customer address may be reflected in the savings account records but not elsewhere in the system, resulting in data inconsistency.
  • Difficulty in accessing data: suppose a bank officer wants to find the names of all customers who live within a particular zip code, say 78733. The existing software was not designed to do this job, so the officer is left with two choices: obtain the list of all customers and extract the needed information manually, or ask a programmer to write the necessary application program.
  • Data isolation: because data are scattered in various files, and the files may be in different formats, it is difficult to write new programs to retrieve the appropriate data.
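
For contrast with the zip-code example above, here is a minimal sketch of how the same request looks once the data sit in a database: it becomes one ad hoc query instead of a new application program. The sketch uses Python's built-in sqlite3 module; the table name `customer`, its columns, and the sample rows are assumptions made up for illustration.

```python
import sqlite3


def customers_in_zip(conn, zip_code):
    """Return the names of customers living in the given zip code."""
    rows = conn.execute(
        "SELECT name FROM customer WHERE zip = ? ORDER BY name", (zip_code,)
    )
    return [name for (name,) in rows]


# Hypothetical sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, zip TEXT)")
conn.executemany(
    "INSERT INTO customer (name, zip) VALUES (?, ?)",
    [("Asha", "78733"), ("Brian", "10001"), ("Chen", "78733")],
)

print(customers_in_zip(conn, "78733"))  # ['Asha', 'Chen']
```

A different request (another zip code, another column) only changes the query text, not the program structure.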

7. Disadvantages of File Processing

  • Concurrent access: suppose you and a friend jointly hold account A, which has a balance of $2000.00, and the two of you withdraw $500 and $200 from the account at the same time. If both programs read the old balance before either writes its result back, the account ends up with $1500 or $1800 instead of the correct $1300. To guard against this, some form of supervision must be maintained.
  • Security: in a banking system, payroll personnel need to see only the part of the database that has information about the bank's employees. Because application programs are added in an ad hoc manner, it is difficult to enforce such security constraints.
  • Integrity problems: data values stored in the database must satisfy certain types of consistency constraints. For example, a bank account balance should not fall below, say, $100.00. In a file-processing system such constraints are enforced by code in the application programs, so adding a new constraint means changing those programs.
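
The arithmetic of the concurrent-access example above can be reproduced with a small deterministic sketch. No real threads are used; the interleaving is simulated by letting every withdrawal read the balance before any of them writes. The function names are hypothetical and chosen only for this illustration.

```python
def unsupervised_withdrawals(balance, amounts):
    """Simulate withdrawals that all read the balance before any write."""
    # Every withdrawal takes its own snapshot of the same starting balance.
    snapshots = [balance for _ in amounts]
    for snapshot, amount in zip(snapshots, amounts):
        # Each write overwrites the previous one, so earlier updates are lost.
        balance = snapshot - amount
    return balance


def supervised_withdrawals(balance, amounts):
    """Apply withdrawals one at a time, each seeing the previous result."""
    for amount in amounts:
        balance = balance - amount
    return balance


print(unsupervised_withdrawals(2000, [500, 200]))  # 1800
print(unsupervised_withdrawals(2000, [200, 500]))  # 1500
print(supervised_withdrawals(2000, [500, 200]))    # 1300
```

Which wrong answer appears depends only on which write happens last, which is why the slide lists both 1500 and 1800.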

8. Duplicate Data

9. Database Management System

  • DBMS manages data resources like an operating system manages hardware resources
  • A software system that is used to create, maintain, and provide controlled access to user databases
  • Figure: Order Filing System, Invoicing System, and Payroll System access a central database through the DBMS. The central database contains employee, order, inventory, pricing, and customer data.

10. Advantages of the Database Approach

  • Program-data independence
  • Planned data redundancy
  • Improved data consistency
  • Improved data sharing
  • Increased application development productivity
  • Enforcement of standards
  • Improved data quality
  • Improved data accessibility and responsiveness
  • Reduced program maintenance
  • Improved decision support

11. Costs and Risks of the Database Approach

  • New, specialized personnel
  • Installation and management cost and complexity
  • Conversion costs
  • Need for explicit backup and recovery
  • Organizational conflict

12. Components of the Database Environment

13. Components of the Database Environment

  • CASE Tools: computer-aided software engineering
  • Repository: centralized storehouse of metadata
  • Database Management System (DBMS): software for managing the database
  • Database: storehouse of the data
  • Application Programs: software using the data
  • User Interface: text and graphical displays to users
  • Data/Database Administrators: personnel responsible for maintaining the database
  • System Developers: personnel responsible for designing databases and software
  • End Users: people who use the applications and databases

14. Data Abstraction

  • A database system is a collection of interrelated data and a set of programs that allow users to access and modify the data.
  • A major purpose of a database system is to provide users with an abstract view of the data.
  • The system hides certain details of how the data are stored and maintained.
  • Three levels of data abstraction:
  • Physical level:
    • Lowest level of abstraction; describes how the data are actually stored.
    • Describes complex low-level data structures in detail.

15. Data Abstraction

  • Logical level:
    • Next higher level of abstraction.
    • Describes what data are stored in the database and what relationships exist among those data.
  • View level:
    • Highest level of abstraction.
    • Describes only part of the entire database.
    • Application programs hide details of data types.
    • Views can also hide information (such as an employee's salary) for security purposes.
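
A minimal sketch of the view level, assuming a hypothetical `employee` table: the view `employee_directory` exposes names and departments but leaves out the salary column, so a user who is given only the view never sees salaries. Table, view, and column names are invented for the example, and sqlite3 is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Logical level: the full employee relation, including salary.
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL
    );
    INSERT INTO employee (name, dept, salary) VALUES
        ('Asha', 'Loans', 5200.0),
        ('Brian', 'Audit', 4800.0);

    -- View level: only part of the database, with salary hidden.
    CREATE VIEW employee_directory AS
        SELECT id, name, dept FROM employee;
    """
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY id")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)  # ['id', 'name', 'dept']
print(view_rows)     # [(1, 'Asha', 'Loans'), (2, 'Brian', 'Audit')]
```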

16. Data Abstraction

17. Instances and Schemas

  • Databases change over time as information is inserted and deleted.
  • Instance:
    • The collection of information stored in the database at a particular moment.
    • Analogous to the value of a variable in a program.
  • Schema:
    • The overall design of the database, that is, its logical structure.
    • Corresponds to the variable declarations in a program.
  • Database systems have several schemas, partitioned according to the level of abstraction:
    • a. Physical schema: database design at the physical level.
    • b. Logical schema: database design at the logical level.
    • c. Subschemas: several schemas at the view level that describe different views of the database.
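
The distinction between schema and instance can be observed directly in a small sketch: inserting a row changes the instance while the schema stays the same, much as assigning to a variable changes its value but not its declaration. The `account` table is a made-up example, and reading the stored table definition from `sqlite_master` is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number TEXT PRIMARY KEY, balance REAL)")


def schema(conn):
    """Return the stored table definitions (the schema)."""
    rows = conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    return [sql for (sql,) in rows]


def instance(conn):
    """Return the rows currently stored (the instance)."""
    return conn.execute(
        "SELECT number, balance FROM account ORDER BY number"
    ).fetchall()


schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO account VALUES ('A-101', 500.0)")
schema_after, instance_after = schema(conn), instance(conn)

print(schema_before == schema_after)  # True: the schema did not change
print(instance_before)                # []
print(instance_after)                 # [('A-101', 500.0)]
```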

18. Instances and Schemas

  • Data independence: the ability to modify a schema definition at one level without affecting the schema definition at the next higher level.
  • Physical data independence:
    • Ability to modify the physical schema without causing application programs to be rewritten.
    • Modifications at the physical level are occasionally necessary to improve performance.
  • Logical data independence:
    • Ability to modify the logical schema without causing application programs to be rewritten.
    • Modifications at the logical level are necessary whenever the logical structure of the database is altered.
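
Physical data independence can be illustrated with a short sketch: adding an index changes how the data are stored and searched, yet the application's query text and its result stay the same. The table, the index name `idx_customer_zip`, and the sample rows are assumptions for illustration; logical data independence is not covered by this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, zip TEXT)")
conn.executemany(
    "INSERT INTO customer (name, zip) VALUES (?, ?)",
    [("Asha", "78733"), ("Brian", "10001"), ("Chen", "78733")],
)

# The application's query: it never mentions how the data are stored.
QUERY = "SELECT name FROM customer WHERE zip = ? ORDER BY name"


def run_application_query(conn):
    return conn.execute(QUERY, ("78733",)).fetchall()


result_before = run_application_query(conn)

# A change at the physical level, typically made to improve performance.
conn.execute("CREATE INDEX idx_customer_zip ON customer (zip)")

result_after = run_application_query(conn)

print(result_before == result_after)  # True: the program needed no change
```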

19. Data Models

  • Data model: a collection of tools for describing data, data relationships, data semantics, and data constraints.
  • Data models can be classified into four categories:
  • Relational model: uses a collection of tables to represent both data and the relationships among those data.
  • Entity-relationship model: based on a collection of basic objects, called entities, and relationships among these objects.
  • Object-oriented data model: can be seen as extending the E-R model with notions of encapsulation, methods (functions), and object identity. The object-relational data model combines features of the object-oriented data model and the relational data model.
  • Semi-structured data model: permits the specification of data where individual data items of the same type may have different sets of attributes. XML is widely used to represent semi-structured data.
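
A small sketch of the semi-structured idea: two items of the same type, `customer`, carry different sets of attributes, which a relational table with fixed columns would not allow without empty values. The XML document and its element names are invented for the example; parsing uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: both items are customers, but their fields differ.
DOCUMENT = """
<customers>
  <customer><name>Asha</name><phone>555-0100</phone></customer>
  <customer><name>Brian</name><email>brian@example.com</email><zip>78733</zip></customer>
</customers>
""".strip()

root = ET.fromstring(DOCUMENT)

# Collect the set of attribute (child element) names for each customer.
attribute_sets = [
    sorted(child.tag for child in customer)
    for customer in root.findall("customer")
]

print(attribute_sets)  # [['name', 'phone'], ['email', 'name', 'zip']]
```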

20. Database Languages

  • A database system provides two languages: a Data Definition Language (DDL) and a Data Manipulation Language (DML).
  • DDL: specifies the database schema.
  • DML: expresses database queries and updates.
  • In practice, DDL and DML are not two separate languages.
  • They form parts of a single database language, such as SQL.
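
The two roles can be seen side by side in a short sketch that sends both kinds of SQL statement through Python's sqlite3 module. The `CREATE TABLE` statement is DDL; it also declares a constraint, here an assumed minimum balance of 100 chosen to echo the earlier integrity example. The `INSERT`, `UPDATE`, and `SELECT` statements are DML. Table and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema, including a consistency constraint.
conn.execute(
    "CREATE TABLE account ("
    " number TEXT PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 100))"
)

# DML: insert and update data.
conn.execute("INSERT INTO account VALUES ('A-101', 2000.0)")
conn.execute("UPDATE account SET balance = balance - 500 WHERE number = 'A-101'")


def try_withdraw(conn, number, amount):
    """Attempt a withdrawal; return False if the constraint rejects it."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE number = ?",
            (amount, number),
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_withdraw(conn, "A-101", 1450)  # would leave 50, below 100

# DML: query the data.
balance = conn.execute(
    "SELECT balance FROM account WHERE number = 'A-101'"
).fetchone()[0]

print(accepted)  # False
print(balance)   # 1500.0
```

Because the constraint is declared once in the schema, no application program has to repeat the check.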

21. Relational Databases

  • A relational database is based on the relational model and uses a collection of tables to represent both data and the relationships among those data.
  • Includes a DDL and a DML.
  • Employs the SQL language.
  • The relational data model is the most widely used data model, and the majority of database systems are based on it.
  • A sample relational database comprising three tables is shown on the next slide.

22. Relational Databases

23. Application Programs & SQL

  • SQL is not as powerful as a universal Turing machine.
  • Some computations cannot be expressed as an SQL query.
  • These computations must be written in a host language such as COBOL, C, C++, or Java, with embedded SQL queries that access the data in the database.
  • Application programs are programs that interact with the database in this fashion.
  • Application programs access the database in one of two ways:
    • Language extensions that allow embedded SQL
    • An application program interface (e.g., ODBC/JDBC) that allows SQL queries to be sent to a database
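
The second access path, an application program interface, can be sketched in Python, whose sqlite3 module plays a role comparable to ODBC or JDBC here: the host program sends an SQL query through the API and then continues the computation in the host language. The `purchase` table, the card number, and the statement layout are assumptions for illustration.

```python
import sqlite3


def monthly_statement(conn, card):
    """Fetch purchases through the API, then format them in the host language."""
    rows = conn.execute(
        "SELECT merchant, amount FROM purchase WHERE card = ? ORDER BY id",
        (card,),
    ).fetchall()
    lines = [f"{merchant}: {amount:.2f}" for merchant, amount in rows]
    total = sum(amount for _, amount in rows)
    lines.append(f"TOTAL: {total:.2f}")
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase ("
    " id INTEGER PRIMARY KEY, card TEXT, merchant TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO purchase (card, merchant, amount) VALUES (?, ?, ?)",
    [("C-1", "Books", 30.0), ("C-2", "Cafe", 4.25), ("C-1", "Fuel", 45.5)],
)

print(monthly_statement(conn, "C-1"))
```

Passing the card number as a query parameter, rather than pasting it into the SQL string, keeps the query text fixed and the data separate.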

24. Database Design

  • The process of designing the general structure of the database:
  • Logical design: deciding on the database schema. Database design requires that we find a good collection of relation schemas.
    • Business decision: what attributes should we record in the database?
    • Computer science decision: what relation schemas should we have, and how should the attributes be distributed among the various relation schemas?
  • Physical design: deciding on the physical layout of the database
25. Database Architecture

  • The architecture of a database system is greatly influenced by the underlying computer system on which the database is running:
  • Centralized
  • Client-server
  • Parallel (multi-processor)
  • Distributed

26. Two- and Three-Tier Client/Server

  • In a two-tier architecture, the application resides at the client machine, where it invokes database system functionality at the server through query language statements. ODBC and JDBC are used for interaction between client and server.
  • In a three-tier architecture, the client machine acts merely as a front end and contains no direct database calls. Instead, the client communicates with an application server, which in turn communicates with the database system to access data.
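
The three-tier split can be sketched in a single process, with each tier reduced to one small piece of code: the client function contains no SQL, the application server holds the only database calls, and sqlite3 stands in for the database system. The class and function names are hypothetical, and a real deployment would place the tiers on separate machines connected by a network.

```python
import sqlite3


class ApplicationServer:
    """Middle tier: the only place where database calls are made."""

    def __init__(self, conn):
        self._conn = conn

    def balance_of(self, number):
        row = self._conn.execute(
            "SELECT balance FROM account WHERE number = ?", (number,)
        ).fetchone()
        if row is None:
            raise KeyError(number)
        return row[0]


def client_display(server, number):
    """Client tier: a front end that talks only to the application server."""
    return f"Account {number}: {server.balance_of(number):.2f}"


# Database tier: an in-memory database with one sample account.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number TEXT PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES ('A-101', 2000.0)")

server = ApplicationServer(conn)
print(client_display(server, "A-101"))  # Account A-101: 2000.00
```

In the two-tier variant, the query inside `balance_of` would instead sit in the client program itself.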

27. Database Users

  • Users are differentiated by the way they expect to interact with the system.
  • Application programmers interact with the system through DML calls.
  • Sophisticated users form requests in a database query language.
  • Specialized users write specialized database applications that do not fit into the traditional data processing framework.
  • Naïve users invoke one of the permanent application programs that have been written previously.
    • Examples: people accessing a database over the web, bank tellers, clerical staff

28. Database Administrator

  • Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise's information resources and needs.
  • Database administrator's duties include:
    • Schema definition
    • Storage structure and access method definition
    • Schema and physical organization modification
    • Granting user authority to access the database
    • Specifying integrity constraints
    • Acting as liaison with users
    • Monitoring performance and responding to changes in requirements

29. System Structure

30. History of Database Systems

  • 1950s and early 1960s:
    • Data processing using magnetic tapes for storage
      • Tapes provide only sequential access
    • Punched cards for input
  • Late 1960s and 1970s:
    • Hard disks allow direct access to data
    • Network and hierarchical data models in widespread use
    • Ted Codd defines the relational data model
      • Would win the ACM Turing Award for this work
      • IBM Research begins System R prototype
      • UC Berkeley begins Ingres prototype
    • High-performance (for the era) transaction processing

31. History of Database Systems

  • 1980s:
    • Research relational prototypes evolve into commercial systems
      • SQL becomes industrial standard
    • Parallel and distributed database systems
    • Object-oriented database systems
  • 1990s:
    • Large decision support and data-mining applications
    • Large multi-terabyte data warehouses
    • Emergence of Web commerce
  • 2000s:
    • XML and XQuery standards
    • Automated database administration
