Database & dbms


DATABASE AND DATABASE MANAGEMENT SYSTEMS

The database is now the underlying framework of the information system and has fundamentally changed the way many companies and individuals work. In today's technology-intensive economy, most organizations around the world, whether for-profit, not-for-profit, educational or governmental, could not stay competitive or achieve their goals without database management systems. Databases touch all aspects of our lives (in fact, the database is now such an integral part of day-to-day life that we are often not aware we are using one). Users of a database store crucial information, from customer names and supplier prices to sales history and procurement records, update that information and make it readily available to whoever needs it. People who work with databases are responsible for many of the benefits that computers have offered all kinds of organizations.

These are some examples of database applications:

• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions

FILE-BASED APPROACH

In the early days, database applications were built on top of file systems. This was the traditional way of using computers to support information systems: data was stored in files and processed by dedicated programs. A file is a named collection of data about similar entities (objects, facts, events).

Drawbacks of using file systems to store data:

• Data redundancy and inconsistency: multiple file formats, duplication of information in different files
• Difficulty in accessing data: a new program must be written to carry out each new task
• Data isolation: multiple files and formats


• Integrity problems: integrity constraints become part of the program code, so it is hard to add new constraints or change existing ones
• Atomicity of updates: failures may leave the database in an inconsistent state with partial updates carried out (e.g. a transfer of funds from one account to another should either complete or not happen at all)
• Concurrent access by multiple users: concurrent access is needed for performance, but uncontrolled concurrent access can lead to inconsistencies (e.g. two people reading a balance and updating it at the same time)

• Security problems

These disadvantages were especially pronounced in first- and second-generation application systems. In third-generation systems, a number of powerful support packages and tools were introduced to minimize some of the disadvantages, including data dictionaries and high-level programming languages. However, even with these facilities, the fundamental deficiencies of file processing systems remain: redundant data, low sharing of data, lack of standards and control, and low productivity. To overcome these disadvantages, a new approach emerged in the 1970s: the database approach, discussed in the following section.

THE DATABASE APPROACH

The database approach represents a different concept in information resource management. Data are viewed as an important, shared resource that must be managed like any other asset, such as people, materials, equipment and money. The database concept is rooted in an attitude of sharing common data resources, releasing control of those data resources to a common responsible authority, and cooperating in the maintenance of those shared data resources.

The database

A database consists of a shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization. The database system provides the organization with centralized control of its data. Such a situation contrasts sharply with that found in an enterprise without a database system, where typically each application has its own private files, so that the data is widely dispersed and might thus be difficult to control in any systematic way.


The database is a single, possibly large repository of data, which can be used simultaneously by many departments and users. All data that is required by these users is integrated with the minimum amount of duplication. And, importantly, the database is normally not owned by any one department or user but is a shared corporate resource.

Technically, the database approach differs from the file approach in the following features:

• Permanent links are established between files to represent real-life interactions. From this, the technical definition is derived: a database is a named collection of interrelated files, shared by multiple users.

• Data descriptions (the logical records of each file) are stored on magnetic media (not in the programs, but in the data files or close to them). As well as holding the organization's operational data, the database also holds a description of this data. For this reason, a database is also defined as a self-describing collection of integrated records. The description of the data (the meta-data) is known as the system catalog or data dictionary. It is the self-describing nature of a database that provides what is known as data independence. This means that programs consist only of algorithms that are no longer data dependent (the same program can be used with several data descriptions), and they share a common data description.
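The self-describing nature of a database can be illustrated with a small sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in DBMS; the customer table and the describe_table helper are hypothetical names chosen for illustration. The program does not hard-code the record layout; it reads it from the catalog.

```python
import sqlite3

def describe_table(conn, table):
    """Return (column name, declared type) pairs read from the DBMS catalog."""
    # PRAGMA table_info exposes SQLite's stored description of a table.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE customer (number INTEGER PRIMARY KEY, name TEXT, credlim INTEGER)")

# The catalog table sqlite_master lists every object defined in the database.
tables = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
columns = describe_table(conn, "customer")
print(tables)
print(columns)
```

Because the description lives with the data, the same describe_table function works unchanged for any table, which is the sense in which a program stops depending on one particular data description.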

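[Figure: in the file approach, each program contains both the data description and the algorithm, and works on separate data; in the database approach, the program contains only the algorithm, and the data description is stored together with the data.]

Fig. 2.1 File and database approaches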

The concept of centralized control implies there will be some identifiable person who has this central responsibility for the data. That person is the data administrator (DA). It is the data administrator's job to decide what data should be stored in the database in the first place, and to establish policies for maintaining and dealing with that data once it has been stored. The technical person responsible for implementing the data administrator's decisions is the database administrator (DBA).

Database management system (DBMS)

The DBMS is a software system that enables users to define, create and maintain the database and also provides controlled access to the database. So, the DBMS is the software that interacts with the users, application programs and the database.

A database management system (DBMS) is the software used to specify the logical organization for a database and access it.

A DBMS provides an environment that is both convenient and efficient to use.

Application programs

Users interact with the database through a number of application programs that are used to create and maintain the database and to generate information. These programs can be conventional batch applications or, more typically nowadays, they will be online applications. The application programs may be written in some programming language or in some higher-level fourth-generation language.

An application program is a computer program that interacts with the database by issuing an appropriate request to the DBMS.

Views

A DBMS provides a facility known as a view mechanism, which allows each user to have his or her own customized view of the database, where a view is some subset of the database.

A view is a virtual table that does not necessarily exist in the database but is generated by the DBMS from the underlying base tables whenever it's accessed.

A view is usually defined as a query that operates on the base tables to produce another virtual table. As well as reducing complexity by letting users see the data in the way they want to see it, views have several other benefits:

- Views provide a level of security. Views can be set up to exclude data that some users should not see.

- Views provide a mechanism to customize the appearance of the database.

- A view can present a consistent, unchanging picture of the structure of the database, even if the underlying database is changed.
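As a minimal sketch of the view mechanism, again assuming SQLite as the DBMS, the example below defines a view that hides a salary column. The employee table, the employee_public view and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employee VALUES (1, 'Ana', 'Sales', 3000), (2, 'Dan', 'HR', 2800);
    -- The view is stored as a query, not as a copy of the data.
    CREATE VIEW employee_public AS SELECT id, name, dept FROM employee;
""")

def public_rows(connection):
    """Read the database the way a restricted user would: through the view."""
    return connection.execute("SELECT * FROM employee_public ORDER BY id").fetchall()

# Column names visible through the view; salary is not among them.
view_columns = [d[0] for d in conn.execute("SELECT * FROM employee_public").description]

before = public_rows(conn)
conn.execute("UPDATE employee SET dept = 'Marketing' WHERE id = 2")
after = public_rows(conn)  # regenerated from the base table on access
```

The second read reflects the update to the base table because the view is evaluated each time it is accessed, while the salary column never appears in the result.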

Components of the DBMS environment

We can identify five major components in the DBMS environment:

1. Hardware - the computer systems that the DBMS and the application programs run on. This can range from a single PC, to a single mainframe, to a network of computers.

2. Software - the DBMS software and the application programs, together with the operating system, including network software if the DBMS is being used over a network.

3. Data and data descriptions (meta-data) - the data acts like a bridge between the hardware and software components and the human components.

4. Procedures - the instructions and rules that govern the design and use of the database.

5. People - in the database environment, human jobs are more specific: some users deal only with retrieving data (end users), some with developing new information systems (application programmers) and some must manage the complex database environment (data administrator and database administrator). There are also other specialists, such as database designers (software professionals who specify information content and create database systems), Web-application developers (who create Web pages and devise means for processing information content through the Web) and Web-site designers.

[Figure labels: End Users, Programmers, Database Administrator, DBMS, FMS, Operating System, BIOS, Database, Data dictionary and directory]

Fig 2.2 Software environment in database approach

All data handling (storing, updating or retrieving) is done only by the DBMS. Some DBMSs use the File Management System (FMS) to store, update and retrieve data in files visible from the operating system (dBASE, Fox, Paradox). Modern DBMSs like Microsoft Access and Oracle don't use the File Management System; they have their own routines for storing data in tables enclosed in a container seen as a single file by the operating system. Sometimes the DBMS replaces the FMS entirely, and sometimes the DBMS is embedded in the operating system.

DBMS architectures

Before the advent of the Web, a DBMS would generally be divided into two parts:

• a client program that handles the main business and data processing logic and interfaces with the user;
• a server program (sometimes called the DBMS engine) that manages and controls access to the database.

This is known as a two-tier client-server architecture.


In the mid-1990s, as applications became more complex and could potentially be deployed to hundreds or thousands of end-users, the client side of this architecture gave rise to two problems:

• A "fat" client, requiring considerable resources on the client's computer to run effectively (disk space, RAM and CPU power).
• A significant client-side administration overhead.

By 1995, a new variation of the traditional two-tier client-server model appeared to solve these problems, called three-tier client-server architecture. This new architecture proposed three layers, each potentially running on a different platform:

• The user interface layer, which runs on the end-user's computer (the client).
• The business logic and data processing layer - a middle layer which runs on a server and is often called the application server. One application server is designed to serve multiple clients.
• A DBMS, which stores the data required by the middle layer. This tier may run on a separate server called the database server.

The three-tier design has many advantages over the traditional two-tier design, such as:

• A "thin" client, which requires less expensive hardware.
• Simplified application maintenance, as a result of centralizing the business logic for many end-users into a single application server.
• Added modularity, which makes it easier to modify or replace one tier without affecting the other tiers.
• Easier load balancing, as a result of separating the core business logic from the database functions. A Transaction Processing Monitor (TPM), a program that controls data transfer between clients and servers in order to provide a consistent environment for Online Transaction Processing, can be used to reduce the number of connections to the database server.
• It maps quite naturally to the Web environment, with a Web browser acting as the "thin" client, and a Web server acting as the application server.
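The separation of the three tiers can be sketched in a few lines. This is only an illustration under simplifying assumptions: all three layers run in one Python process, SQLite stands in for the database server, and the class names, the product table and the discount rule are hypothetical.

```python
import sqlite3

class DatabaseTier:
    """Data tier: the only layer that talks to the DBMS."""
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE product (name TEXT PRIMARY KEY, price REAL)")
        self.conn.execute("INSERT INTO product VALUES ('pen', 2.0)")

    def price_of(self, name):
        row = self.conn.execute(
            "SELECT price FROM product WHERE name = ?", (name,)
        ).fetchone()
        return None if row is None else row[0]

class ApplicationServer:
    """Middle tier: business logic shared by every client."""
    def __init__(self, database):
        self.database = database

    def quote(self, name, quantity):
        price = self.database.price_of(name)
        if price is None:
            raise ValueError(f"unknown product: {name}")
        total = price * quantity
        # Hypothetical business rule: 10% discount on orders of 100 units or more.
        return total * 0.9 if quantity >= 100 else total

def thin_client(server, name, quantity):
    """Presentation tier: formats the result and holds no business logic."""
    return f"{quantity} x {name} = {server.quote(name, quantity):.2f}"
```

Changing the discount rule touches only ApplicationServer, and replacing SQLite with another DBMS touches only DatabaseTier, which is the modularity benefit listed above.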


Functions of a DBMS

A good DBMS should furnish a number of capabilities. The list of features that a DBMS should furnish includes the following:

1. Data storage, retrieval and update: the ability to store, retrieve, and update the data that is in the database - the fundamental function of a DBMS. Unless a DBMS provides this facility, further discussion of what a DBMS can do is irrelevant. In storing, updating, and retrieving data, it should not be incumbent upon the user to be aware of the system's internal structures or the procedures used to manipulate these structures. This manipulation is strictly the responsibility of the DBMS.

2. Meta-data storage, retrieval and update: a user-accessible catalog in which descriptions of data items are stored. A key feature of a DBMS is the provision of an integrated system catalog to hold data about the structure of the database, users, applications, and so on. The catalog is expected to be accessible to users as well as to the DBMS. Typically, the system catalog stores:

- names, types and sizes of data items,
- integrity constraints on the data,
- names of authorized users who have access to data.

3. Transaction support. A transaction can be defined as an action, or series of actions, carried out by a single user or application program, which accesses or changes the contents of the database. For example, a simple transaction would be to add a new customer to the database or to update the price of one product; a more complex one would be to delete a sales agent and reassign that agent's customers to other sales agents. If the transaction fails during execution, the database could be left in an inconsistent state: some changes will have been made and others not. To overcome this, a DBMS should provide a mechanism that will ensure either that all the updates corresponding to a given transaction are made or that none of them are made.

We can use the famous "ACID test" when deciding whether or not a database management system is adequate for handling transactions. An adequate system has the following properties:


• Atomicity: results of a transaction's execution are either all committed or all rolled back. All changes take effect, or none do.

• Consistency: the database is transformed from one valid state to another valid state. This defines a transaction as legal only if it obeys user-defined integrity constraints. Illegal transactions aren't allowed and, if an integrity constraint can't be satisfied, then the transaction is rolled back.

• Isolation: the results of a transaction are invisible to other transactions until the transaction is complete.

• Durability: once committed (completed), the results of a transaction are permanent and survive future system and media failures.
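Atomicity and consistency can be seen in a short sketch of the funds-transfer case. It assumes SQLite as the DBMS; the account table, the non-negative balance rule and the transfer function are hypothetical. The credit is deliberately applied first so that a failed debit has something to roll back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity constraint: a balance may never become negative.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(connection, source, target, amount):
    """Move funds as one transaction: both updates are kept or neither is."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with connection:
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target))
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(connection):
    """Return the current balances as {account id: balance}."""
    return dict(connection.execute("SELECT id, balance FROM account"))
```

A transfer of 30 from account 1 to account 2 succeeds. A transfer of 500 fails on the debit, and the credit that was already applied is rolled back with it, so the database returns to its previous valid state.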

4. Concurrency control services (support for shared update): a mechanism to ensure accuracy when several users are updating the database at the same time. Concurrent access is relatively easy if all users are only reading data, as there is no way they can interfere with one another. When two or more users are accessing the database simultaneously and at least one of them is updating data, there may be interference that can result in inconsistencies. One approach that ensures correct results is locking; as long as a portion of the database is locked by one user, other users cannot gain access to it.
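Locking can be observed with two connections to the same database file. The sketch below assumes SQLite, which locks the whole database for writing rather than a portion of it; the stock table, the file name and the try_update helper are hypothetical.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shared.db")

def open_user(db_path):
    # isolation_level=None leaves transaction control to explicit BEGIN/COMMIT;
    # timeout=0 makes a blocked user fail at once instead of waiting.
    return sqlite3.connect(db_path, timeout=0, isolation_level=None)

user_a, user_b = open_user(path), open_user(path)
user_a.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER)")
user_a.execute("INSERT INTO stock VALUES ('pen', 10)")

def try_update(connection, delta):
    """Attempt to take the write lock and change the quantity."""
    try:
        connection.execute("BEGIN IMMEDIATE")  # request the write lock
    except sqlite3.OperationalError:
        return False  # another user holds the lock
    connection.execute("UPDATE stock SET qty = qty + ? WHERE item = 'pen'", (delta,))
    return True

a_started = try_update(user_a, -3)      # user A now holds the lock
b_blocked = not try_update(user_b, -5)  # user B is refused while A is updating
user_a.execute("COMMIT")                # releases the lock
b_retried = try_update(user_b, -5)      # now user B can proceed
user_b.execute("COMMIT")
final_qty = user_a.execute("SELECT qty FROM stock").fetchone()[0]
```

Because user B's update runs only after user A has committed, both changes are applied in turn and neither is lost.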

5. Recovery services: a mechanism for recovering the database in the event that the database is damaged in any way. This may be the result of a system crash, media failure, a hardware or software error causing the DBMS to stop, or it may be the result of the user detecting an error during the transaction and aborting the transaction before it completes. In all cases, the DBMS must provide a mechanism to recover the database to a consistent state. The simplest approach to recovery involves periodically making a copy of the database (called a backup or a save). If a problem occurs, the database is recovered by copying this backup copy over it. In effect, the damage is undone by returning the database to the state it was in when the last backup was made.
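The backup-and-restore approach can be sketched with the backup method of Python's sqlite3 connections (available from Python 3.7). In-memory databases and the customer table are assumptions made to keep the example self-contained; a real backup would go to separate storage.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE customer (number INTEGER PRIMARY KEY, name TEXT)")
live.execute("INSERT INTO customer VALUES (1, 'Ana')")
live.commit()

# Periodic save: copy the whole database into a separate backup database.
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

# Work done after the backup, followed by simulated damage.
live.execute("INSERT INTO customer VALUES (2, 'Dan')")
live.execute("DELETE FROM customer WHERE number = 1")
live.commit()
damaged = live.execute("SELECT number, name FROM customer").fetchall()

# Recovery: copy the backup over the live database.
backup_copy.backup(live)
recovered = live.execute("SELECT number, name FROM customer").fetchall()
```

The restored database contains exactly what it held at backup time, so the customer added afterwards is gone as well. This is the limitation of the simplest approach: work done since the last backup is lost unless a log is also kept.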

6. Security services: a mechanism to ensure that only authorized users can access the database. A DBMS must furnish a mechanism that restricts access to the database to authorized users. The term security refers to the protection of the database against unauthorized (or even illegal) access, either intentional or accidental.

7. Integrity services: mechanisms to ensure that certain rules are followed with regard to data in the database and any changes that are made in the data. Data integrity refers to the correctness and consistency of stored data. It can be considered as another type of database protection. While it's related to security, it has wider implications; integrity is concerned with the quality of data itself. Integrity is usually expressed in terms of constraints, which are consistency rules that the database is not permitted to violate. The types of constraints that may be present fall into the following four categories:

- Data type. The data entered for any column should be consistent with the data type for that column. For a numeric column, only numbers should be allowed to be entered. If the column is a date, only a legitimate date (in the form MMDDYY or MM/DD/YY) should be permitted.

- Legal values. It may be that for certain columns, not every possible value that is of the right type is legitimate. For example, even though CREDLIM is a numeric column, only the values 400, 500, 700, 800, and 1,000 may be valid.

- Format. It may be that certain columns have a very special format that must be followed.

- Key constraints. There are two types of key constraints: primary key constraints and foreign key constraints. Primary key constraints enforce the uniqueness of the primary key. For example, forbidding the addition of a customer whose number matched the number of a customer already in the database would be a primary key constraint. Foreign key constraints enforce the fact that a value for a foreign key must match the value of the primary key for some row in another table. Forbidding the addition of a customer whose sales agent was not already in the database would be an example of a foreign key constraint.
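These categories can be declared to the DBMS itself. The sketch below assumes SQLite; the CREDLIM values come from the text, while the table layout, the phone column and its format are hypothetical. SQLite is permissive about column types, so here the legal-values check is also what rejects a non-numeric credit limit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE agent (number INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE customer (
        number  INTEGER PRIMARY KEY,                       -- primary key constraint
        name    TEXT NOT NULL,
        credlim INTEGER
                CHECK (credlim IN (400, 500, 700, 800, 1000)),  -- legal values
        phone   TEXT
                CHECK (phone GLOB '[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]'),  -- format
        agent   INTEGER NOT NULL REFERENCES agent(number)  -- foreign key constraint
    );
    INSERT INTO agent VALUES (1, 'Maria');
    INSERT INTO customer VALUES (10, 'Ana', 500, '555-0101', 1);
""")

def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO customer VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "duplicate number": try_insert((10, "Dan", 400, "555-0102", 1)),
    "illegal credit limit": try_insert((11, "Dan", 450, "555-0102", 1)),
    "non-numeric credit limit": try_insert((11, "Dan", "abc", "555-0102", 1)),
    "bad phone format": try_insert((11, "Dan", 400, "5550102", 1)),
    "unknown agent": try_insert((11, "Dan", 400, "555-0102", 9)),
    "valid row": try_insert((11, "Dan", 400, "555-0102", 1)),
}
```

Only the last row is accepted; each of the others violates one of the declared rules and is rejected by the DBMS without any checking code in the application.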

An integrity constraint can be treated in one of four ways:

a) The constraint can be ignored, in which case no attempt is made to enforce the constraint.

b) The burden of enforcing the constraint can be placed on the users of the system. This means that users must be careful that any changes they make in the database do not violate the constraint.

c) The burden can be placed on programmers. Logic to enforce the constraint is then built into programs. Users must update the database only by means of these programs and not through any of the built-in entry facilities provided by the DBMS, since these would allow violation of the constraint. The programs are designed to reject any attempt on the part of the user to update the database in such a way that the constraint is violated.

d) The burden can be placed on the DBMS. The constraint is specified to the DBMS, which then rejects any attempt to update the database in such a way that the constraint is violated.

The best approach is the last one. Unfortunately, most DBMSs don't have all the necessary capabilities to enforce the various types of integrity. Usually, the approach that is taken is a combination of (c) and (d) in the foregoing list. We let the DBMS enforce any of the constraints that it is capable of enforcing; application programs enforce the other constraints. We might also create a special program whose sole purpose would be to examine the data in the database to determine whether any constraints had been violated; this program would be run periodically. Corrective action could be taken to remedy any violations that were discovered by means of this program.

8. Support for data communication. Most users access the database from terminals. Sometimes, these terminals are connected directly to the computer hosting the DBMS. In other cases, the terminals are at remote locations and communicate with the computer hosting the DBMS over a network. In either case, the DBMS must be capable of integrating with networking/communication software.

9. Services to promote data independence: facilities to support the independence of programs from the structure of the database. One of the advantages of working with a DBMS is data independence; that is, the property that changes can be made in the layout of a database without application programs necessarily being affected. Data independence is normally achieved through a view mechanism; there are usually several types of changes that can be made to the physical characteristics of the database without affecting the views, such as using different file organizations or modifying indexes - this is called physical data independence. However, complete logical data independence is more difficult to achieve; the addition of a new file or field can usually be accommodated, but not their removal (in some systems, any type of change to a file structure is prohibited).
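Both kinds of change can be tried out in a small sketch, assuming SQLite as the DBMS. The customer table, the customer_names view and the report function (standing in for an application program) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (number INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO customer VALUES (1, 'Ana', 'Cluj'), (2, 'Dan', 'Iasi');
    CREATE VIEW customer_names AS SELECT number, name FROM customer;
""")

def report(connection):
    """Stands in for an application program that reads only through the view."""
    return connection.execute(
        "SELECT number, name FROM customer_names ORDER BY number"
    ).fetchall()

original = report(conn)

# Physical change: a new index alters how data is accessed, not what is returned.
conn.execute("CREATE INDEX idx_customer_name ON customer(name)")
after_index = report(conn)

# Logical change: adding a field leaves the view, and the program, untouched.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
after_new_field = report(conn)
```

The program returns the same rows after each change. Removing the name column, by contrast, would invalidate the view, which is why removals are harder to accommodate than additions.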

10. Utility services: DBMS-provided services that assist in the general maintenance of the database. Utility programs help the Database Administrator to manage the database effectively. Following is a list of such services that may be provided by a DBMS.

• Services that permit changes to be made in the database structure (adding new tables or columns, deleting existing tables or columns, changing the name or characteristics of a column, and so on).

• Services that permit the addition of new indexes and the deletion of indexes that are no longer wanted.

• Import and export facilities from other software products. For example, these services allow data to be transferred in a relatively easy fashion between the DBMS and a spreadsheet, word processing, or graphics program, or to load and unload data from or to flat files.

• Monitoring facilities, to monitor database usage and operation.

• Several of the services that form a part of the fourth-generation environment are also furnished by some of the better DBMSs. These include such things as easy-to-use edit and query capabilities, screen generators, report generators, and so on.

• Access to both procedural and nonprocedural languages.

• An easy-to-use graphical user interface that allows users to tap the power of the DBMS without having to resort to a complicated set of commands.

The actual level of functionality offered by a DBMS differs from product to product. For example, a DBMS for a PC may not support concurrent shared access, and it may only provide limited security, integrity and recovery control. Modern, large multi-user DBMS products offer all the above functions and much more.

DATABASE ADMINISTRATION AND SECURITY

Data administration and database administration

The Data Administrator (DA) and Database Administrator (DBA) are responsible for managing and controlling the activities associated with the corporate data and the corporate database, respectively. Depending on the size and complexity of the organization and database system, the DA and DBA roles can be the responsibility of one or more people.

Data administration - the management and control of the corporate data, including database planning, development and maintenance of standards, policies and procedures, and logical database design. The DA is responsible for the corporate data, which includes non-computerized data, and in practice is often concerned with managing the shared data of users or business application areas of an organization. The DA must ensure that the application of database technologies supports the corporate objectives.

Database administration - the management and control of the corporate database system, including physical database design and implementation, setting security and integrity controls, monitoring system performance, and reorganizing the database as necessary. The DBA is more technically oriented than the DA, requiring knowledge of specific DBMSs and the operating system environment. The primary responsibilities of the DBA are centered on developing and maintaining systems using the DBMS software to its full extent.

In some organizations, data administration is a distinct business area; in others, it may be combined with database administration.

Data administration                              Database administration
Involved in strategic IS planning                Evaluates new DBMSs
Determines long-term goals                       Executes plans to achieve goals
Determines standards, policies and procedures    Enforces standards, policies and procedures
Determines data requirements                     Implements data requirements
Develops logical database design                 Develops physical database design
Develops and maintains corporate data model      Implements physical database design
Coordinates database development                 Monitors and controls database use
Managerial orientation                           Technical orientation
DBMS independent                                 DBMS dependent

Database security

Database security comprises the mechanisms that protect the database against intentional or accidental threats. Database security encompasses hardware, software, people and data. This need for security is due to the increasing amounts of crucial corporate data being stored on computer and the acceptance that any loss or unavailability of this data could be potentially disastrous. A database nowadays represents an essential corporate resource that should be properly secured using appropriate controls. Database security is considered in relation to the following outcomes:

- theft and fraud,
- loss of confidentiality (secrecy),
- loss of privacy,
- loss of integrity,
- loss of availability.

An organization needs to identify the types of threats it may be subjected to (a threat is any situation or event, whether intentional or unintentional, that may adversely affect a system and consequently the organization) and initiate appropriate plans and countermeasures, considering also the costs of implementing them. The types of countermeasures to threats on database systems range from physical controls to administrative procedures. Despite the range of computer-based controls that are available, generally, the security of a DBMS is only as good as that of the operating system, owing to their close association. The most widely used computer-based security controls for a multi-user environment are:

1) Authorization (access control) - the granting of a right or privilege that enables a subject to have legitimate access to a database system or a database system's object. The process of authorization involves authentication (a mechanism that determines whether a user is who he or she claims to be) of a subject (a user) requesting access to an object (a database table, view, procedure or any other object that can be created within the database system). A system administrator is usually responsible for granting users access, by creating individual user accounts and passwords. Once a user is given permission to use a DBMS, various other privileges may also be automatically associated with that user. Privileges are granted to users to accomplish the tasks required for their jobs.

2) Views - virtual tables that do not necessarily exist in the database but can be produced upon request by a particular user, at the time of request. The view mechanism provides a powerful and flexible security mechanism by hiding parts of the database from certain users.


3) Backup and recovery - the process of periodically taking a copy of the database and log file (and possibly programs) onto offline storage media in order to assist the recovery of the database following failure.

4) Logging (journaling) - to keep track of database transactions, the DBMS maintains a special file called a log file (or journal) that contains information about all updates to the database. A DBMS should provide logging facilities, sometimes referred to as journaling, which keep track of the current state of transactions and database changes, to provide support for recovery procedures.

5) Integrity - integrity constraints contribute to maintaining a secure database system by preventing data from becoming invalid, and hence giving misleading or incorrect results.

6) Encryption - the encoding of the data by a special algorithm that renders the data unreadable by any program without the decryption key.

7) Redundant Array of Independent Disks (RAID) - the hardware that the DBMS is running on must be fault-tolerant, meaning that the DBMS should continue to operate even if one of the hardware components fails. RAID technology works by having a large disk array comprising an arrangement of several independent disks that are organized to improve reliability and at the same time to increase performance.
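The first control in this list, authorization, combines authentication of the subject with a check of the privileges granted on an object. The sketch below is a simplified illustration in plain Python, not how any particular DBMS implements it; the AccessControl class, its method names and the privilege model are hypothetical.

```python
import hashlib
import hmac
import os

def hash_password(password, salt):
    """Derive a password hash so that plain-text passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

class AccessControl:
    def __init__(self):
        self.accounts = {}     # user -> (salt, password hash)
        self.privileges = {}   # user -> set of (object, action) pairs

    def create_user(self, user, password):
        """The administrator creates an individual account and password."""
        salt = os.urandom(16)
        self.accounts[user] = (salt, hash_password(password, salt))
        self.privileges[user] = set()

    def grant(self, user, obj, action):
        """Grant one privilege, for example SELECT on a table or view."""
        self.privileges[user].add((obj, action))

    def authenticate(self, user, password):
        """Check that the user is who he or she claims to be."""
        if user not in self.accounts:
            return False
        salt, stored = self.accounts[user]
        return hmac.compare_digest(stored, hash_password(password, salt))

    def is_authorized(self, user, password, obj, action):
        """Allow access only to an authenticated user holding the privilege."""
        return self.authenticate(user, password) and (obj, action) in self.privileges[user]
```

A user granted only SELECT on a table can read it but is refused an UPDATE, which matches the idea of granting just the privileges required for a job.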

DATABASE APPROACH - ADVANTAGES AND DISADVANTAGES

The main benefits of the database approach are:

1. Control of data redundancy

The database approach eliminates redundancy where possible; previously separate (and redundant) data files are integrated into a single, logical structure. In addition, each data item occurrence is ideally recorded in only one place in the database. That doesn't mean that all redundancy can or should be eliminated. Sometimes there are valid reasons for storing multiple copies of the same data. However, the amount of redundancy inherent in the database is controlled.

2. Data consistency

By controlling (or eliminating) data redundancy, we greatly reduce the risk of inconsistencies occurring. If data is stored only once in the database, any update to its value has to be performed only once and the new value is immediately available to all users. When controlled redundancy is permitted in the database, the database system itself should enforce consistency by updating each occurrence of a data item when a change occurs. In that case the DBMS can guarantee that the database is never inconsistent as seen by the user, by ensuring that any change made to one copy is automatically applied to the others (a process known as “propagating updates”). However, few commercially available systems today are capable of automatically propagating updates in this manner; most current products do not support controlled redundancy at all, except in certain special situations.
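One way update propagation could be sketched is with a trigger, assuming a DBMS that supports them (SQLite is used here). The tables, the duplicated customer_name column and the trigger name are hypothetical, and this is an illustration of the idea rather than the built-in facility the text describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (number INTEGER PRIMARY KEY, name TEXT);
    -- Controlled redundancy: each order keeps a copy of the customer name.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_number INTEGER, customer_name TEXT);
    -- The DBMS applies every change of the name to all stored copies.
    CREATE TRIGGER propagate_name AFTER UPDATE OF name ON customer
    BEGIN
        UPDATE orders SET customer_name = NEW.name WHERE customer_number = NEW.number;
    END;
    INSERT INTO customer VALUES (1, 'Ana');
    INSERT INTO orders VALUES (100, 1, 'Ana'), (101, 1, 'Ana');
""")

# The application updates the name in one place only.
conn.execute("UPDATE customer SET name = 'Ana Pop' WHERE number = 1")
copies = [row[0] for row in conn.execute("SELECT customer_name FROM orders ORDER BY id")]
```

The application issues a single update, and the copies held in the orders table change with it, so users never see two different names for the same customer.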

3. Sharing of data

In a file-based approach, typically files are owned by the people or departments that use them. On the other hand, the database belongs to the entire organization and can be shared by all authorized users. Sharing means not only that existing applications can share the data in the database, but also that new applications can be developed to operate against that same stored data. In other words, it might be possible to satisfy the data requirements of new applications without having to create any additional stored data. The new applications can also rely on the functions provided by the DBMS, such as data definition and manipulation, concurrency and recovery control, rather than having to provide these functions themselves.

4. Improved data integrity

The problem of integrity is the problem of ensuring that the data in the database is accurate. Database integrity is usually expressed in terms of constraints, which are consistency rules that the database is not permitted to violate. Inconsistency between two entries that purport to represent the same “fact” is an example of lack of integrity; that particular problem can arise only if redundancy exists in the stored data. Even if there is no redundancy, however, the database might still contain incorrect information.

Centralized control of the database can help in avoiding such problems – insofar as they can be avoided – by permitting the data administrator to define (and the DBA to implement) integrity rules to be checked whenever any data update operation is attempted.
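Integrity rules of this kind can be sketched with Python's built-in sqlite3 module. The employee table, its columns and the specific limits are assumptions made up for this illustration; the point is only that the rules are declared once, in the database, and are checked on every update attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical integrity rules, declared as constraints on the table.
conn.execute("""
    CREATE TABLE employee (
        id             INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        salary         REAL NOT NULL CHECK (salary >= 0),
        hours_per_week INTEGER CHECK (hours_per_week BETWEEN 0 AND 60)
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Ana', 2500.0, 40)")

def try_update(sql):
    """Attempt an update; return True if accepted, False if a rule rejected it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

ok_valid = try_update("UPDATE employee SET hours_per_week = 35 WHERE id = 1")
ok_invalid = try_update("UPDATE employee SET salary = -100 WHERE id = 1")

# The rejected update left the stored value untouched.
stored_salary = conn.execute(
    "SELECT salary FROM employee WHERE id = 1").fetchone()[0]
print(ok_valid, ok_invalid, stored_salary)
```

Because the check lives in the database rather than in each program, every application that updates the table is subject to the same rule.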

It is worth pointing out that data integrity is even more important in a multi-user database than it is in a “private files” environment, precisely because the database is shared. For without appropriate controls it would be possible for one user to update the database incorrectly, thereby generating bad data and so “infecting” other innocent users of that data.

5. Standards can be enforced

Establishing the data administration function is an important part of the database approach. This organizational function has authority for defining and enforcing data standards. With central control of the database, the DBA can ensure that all applicable standards are observed in the representation of the data. Applicable standards might include any or all of the following: corporate, installation, departmental, industry, national and international standards. Standardizing data representation is particularly desirable as an aid to data interchange, or migration of data between systems. Likewise, data naming and documentation standards are very desirable as an aid to data sharing and understandability.

6. Improved security

The data administration function has complete jurisdiction over the database and is responsible for establishing controls for accessing, updating and protecting data. The DBA can ensure that the only means of access to the database is through the proper channels, and hence can define security rules to be checked whenever access is attempted to sensitive data. Different rules can be established for each type of access to each piece of information in the database. Without such rules the security of data might actually be more at risk than in a traditional (dispersed) filing system; the centralized nature of a database system in a sense requires that a good security system be in place also.

7. Conflicting requirements can be balanced

Knowing the overall requirements of the organization, as opposed to the requirements of individual users, the DBA can structure the system to provide an overall service that is “best for the organization”. For example, a representation can be chosen for the data in storage that gives fast access for the most important applications (possibly at the cost of poorer performance for certain other applications).

8. Increased productivity

A major advantage of the database approach is that the cost and time for developing new business applications are greatly reduced. Once the database has been designed and implemented, a programmer can code and debug a new application at least two to four times faster than with conventional data files; the reason for this improvement is that the programmer is no longer saddled with the burden of designing, building and maintaining master files.

9. The provision of data independence

Applications implemented on older systems tend to be data-dependent. What this means is that the way in which the data is organized in secondary storage, and the technique for accessing it, are both dictated by the requirements of the application under consideration, and moreover that knowledge of that data organization and that access technique is built into the application logic and code. It is impossible to change the storage structure (how the data is physically stored) or access technique (how it is accessed) without affecting the application, probably drastically.
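A data-dependent program can be sketched in a few lines of Python. The record layout below (a 4-byte id, a fixed-width name and an 8-byte balance) is purely hypothetical; what matters is that the layout is written into the application code itself.

```python
import struct

# Hypothetical on-disk layout, known to the program itself:
# 4-byte little-endian id, 10-byte name, 8-byte float balance.
OLD_LAYOUT = struct.Struct("<i10sd")

def read_balance(record: bytes) -> float:
    """Application logic with the storage layout hard-coded into it."""
    _id, _name, balance = OLD_LAYOUT.unpack(record)
    return balance

record = OLD_LAYOUT.pack(7, b"Ana", 120.5)
print(read_balance(record))

# The storage structure changes: the name field grows to 30 bytes.
NEW_LAYOUT = struct.Struct("<i30sd")
new_record = NEW_LAYOUT.pack(7, b"Ana", 120.5)

try:
    read_balance(new_record)
    still_works = True
except struct.error:
    # The unchanged program can no longer interpret the stored record.
    still_works = False
print(still_works)
```

A change that concerns only how the data is stored forces a change to the program, which is exactly the coupling described above.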

In a database system, however, it would be extremely undesirable to allow applications to be data-dependent, for at least the following two reasons:
• Different applications will need different views of the same data.
• The DBA must have the freedom to change the storage structure or access technique in response to changing requirements, without having to modify existing applications. For example, new kinds of data might be added to the database; new standards might be adopted; application priorities might change; new types of storage device might become available; and so on. If applications are data-dependent, such changes will typically require corresponding changes to be made to programs, thus tying up programmer effort that would otherwise be available for the creation of new applications.

It follows that the provision of data independence is a major objective of database systems. Data independence can be defined as the immunity of applications to change in storage structure and access technique. The database should be able to grow without affecting existing applications; that is probably the single most important reason for requiring data independence in the first place.
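The idea can be sketched with Python's built-in sqlite3 module. All table, view and function names here are hypothetical, and a SQL view is used only as a simple stand-in for the layer a DBMS places between applications and the stored data. The application function is written once and is not touched when the stored structure and access path change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO staff VALUES (1, 'Ana', 'Cluj'), (2, 'Dan', 'Iasi');
    CREATE VIEW staff_directory AS SELECT id, name, city FROM staff;
""")

def names_in_city(city):
    """Application code: it knows only the view, not the stored tables."""
    rows = conn.execute(
        "SELECT name FROM staff_directory WHERE city = ? ORDER BY name",
        (city,))
    return [r[0] for r in rows]

before = names_in_city("Cluj")

# The DBA restructures storage: cities move to their own table and an index
# is added. The view is redefined so that it presents the same columns.
conn.executescript("""
    CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    INSERT INTO city (name) SELECT DISTINCT city FROM staff;
    CREATE TABLE staff_v2 (id INTEGER PRIMARY KEY, name TEXT,
                           city_id INTEGER REFERENCES city(id));
    INSERT INTO staff_v2
        SELECT s.id, s.name, c.id FROM staff s JOIN city c ON c.name = s.city;
    DROP VIEW staff_directory;
    DROP TABLE staff;
    CREATE INDEX staff_v2_city ON staff_v2 (city_id);
    CREATE VIEW staff_directory AS
        SELECT s.id, s.name, c.name AS city
        FROM staff_v2 s JOIN city c ON c.id = s.city_id;
""")

# The same, unmodified application function still returns the same answer.
after = names_in_city("Cluj")
print(before, after)
```

The database has been reorganized, yet the existing application keeps working without modification.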

However, data independence is not an absolute: different systems provide it in different degrees. In fact, few systems, if any, provide no data independence at all; it is just that some systems are less data-dependent than others.

The database approach also has some disadvantages, such as:

1. Complexity. A DBMS is an extremely complex piece of software, and all users (database designers and developers, database administrators and end-users) must understand the DBMS functionality to take full advantage of it.

2. Cost of DBMS. The cost of DBMS varies significantly, depending on the environment and functionality provided. There is also the recurrent annual maintenance cost, which is a percentage of the list price.

3. Cost of conversion. In some situations, the cost of the DBMS and extra hardware may be insignificant compared with the cost of converting existing applications to run on the new DBMS and hardware. This cost is one of the main reasons why some companies feel tied to their current systems and cannot switch to more modern database technology.

4. Performance. Typically, a file-based system is written for a specific application, such as invoicing. As a result, performance is generally very good. A DBMS is written to be more general, to cater for many applications rather than just one. The effect is that some applications may not run as fast using a DBMS as they did before.

5. Higher impact of a failure. The centralization of resources increases the vulnerability of the system. Since all users and applications rely on the availability of the DBMS, the failure of any component can bring operations to a complete halt until the failure is repaired.