Computer Supported Cooperative Work (CSCW): Facilitating work by more than one person by one computer
1
Sharing Enterprise
Data
2
Enterprise Database Processing Architectures
• Many organizations have a variety of database architectures. Enterprise database processing is concerned with the challenges associated with merging these different architectures into a single view of the organizational data.
3
Database Processing Architectures
• Teleprocessing Systems
• Client-Server Systems
• File-Sharing Systems
• Distributed Database Systems
4
Teleprocessing Systems
• All processing is done by one computer. Users may use dumb terminals to transmit information to the centralized computer.
5
Relationship of Programs in a Teleprocessing System
6
Client-Server Systems
• Client-Server processing is a form of cooperative computing. Client computers and servers, using a network, share the computing burden. DBMS functionality is provided by one computer, typically the server.
7
Client-Server Architecture
8
File-Sharing Systems
• Files are shared between servers and client computers. The server does not provide DBMS functionality.
9
File-Sharing Architecture
10
Distributed Database Systems
• Distributed Databases store portions of the database on multiple systems that are interconnected using a network. As such, no one system contains the entire database.
11
Distributed Database Architecture
12
• Two terms are common:
– Partitioning
– Replication
13
Types of Distributed Databases
• Non-partitioned, non-replicated
• Partitioned, non-replicated
• Non-partitioned, replicated
• Partitioned, replicated
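The combinations above can be sketched with plain Python data structures. This is only an illustration: the sample rows, the site names, and the `partition` and `replicate` helpers are assumptions introduced here, not part of the original slides.

```python
# Hypothetical illustration: a "database" is just a list of customer rows.
rows = [
    {"id": 1, "region": "east"},
    {"id": 2, "region": "west"},
    {"id": 3, "region": "east"},
    {"id": 4, "region": "west"},
]

def partition(rows, key):
    """Split rows into disjoint pieces, one per distinct value of `key`."""
    pieces = {}
    for row in rows:
        pieces.setdefault(row[key], []).append(row)
    return pieces

def replicate(rows, sites):
    """Give every listed site its own full copy of the rows."""
    return {site: [dict(r) for r in rows] for site in sites}

# Partitioned, non-replicated: each site holds a different piece.
partitioned = partition(rows, "region")

# Non-partitioned, replicated: each site holds a full copy.
replicated = replicate(rows, ["site_a", "site_b"])

# Partitioned, replicated: each piece is copied to two sites.
partitioned_replicated = {
    region: replicate(piece, [f"{region}_primary", f"{region}_backup"])
    for region, piece in partitioned.items()
}
```

The non-partitioned, non-replicated case is simply the original `rows` list held at a single site.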
14
Comparison of Distributed Database Alternatives
• Parallelism
• Independence
• Flexibility
• Availability
• Cost/complexity
• Difficulty of control
• Security risk
15
Downloading Data
• Data may be pulled from a server-based DBMS and downloaded to a client.
• Used for queries and reports
• Cannot be updated
16
[Figure: downloading architecture. Labels: computer1, computer2, downloaded DB (three instances), LAN, gateway computer, mainframe, TP terminals, teleprocessing DB]
17
Issues in Downloading Data
• Coordination
– Downloaded data must conform to database constraints
– Local updates must be coordinated with downloads
18
Issues in Downloading Data
• Consistency
– In general, downloaded data should not be updated
– Applications need features to prevent updating
– Users should be made aware of possible problems
19
Issues in Downloading Data
• Access Control
– Data may be replicated on many computers
– procedures to control data access are more complicated
20
Issues in Downloading Data
• Computer Crime
– Illegal copying is difficult to prevent
– Diskettes and illegal online access are easy to conceal
– Risk may prevent the development of downloaded data applications
21
What is OLAP?
• OLAP (on-line analytical processing, pronounced "oh-lap") is an on-line system that analyzes and presents data in a particular manner.
• It enables analysts, managers, and executives to gain insight into data through:
– Fast, consistent, interactive access
– A multidimensional view
22
Technical aspects
• Your Express database may reside locally on your PC, anywhere on your LAN, anywhere on your company's intranet, or even anywhere on the internet.
• Olap Table can be used with any development environment supporting ActiveX controls, such as Microsoft Visual Basic, Microsoft Visual C++, Borland Delphi, Borland C++ Builder, or Microsoft FrontPage.
23
OLAP
• The data categories are called axes or dimensions. Data arranged along these axes is termed an OLAP Cube.
• There are no limits on the number of axes. If a large number of axes are used, the structure is termed an OLAP Hypercube.
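A minimal sketch of the cube idea, assuming three dimensions (product, region, quarter) and invented sales figures. The `roll_up` helper is hypothetical; it sums the measure over every dimension that is not kept, which is enough to show summarizing, drilling down, and rotating axes.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, sales).
facts = [
    ("widget", "east", "Q1", 100),
    ("widget", "west", "Q1", 150),
    ("gadget", "east", "Q1", 80),
    ("widget", "east", "Q2", 120),
    ("gadget", "west", "Q2", 60),
]
DIMENSIONS = ("product", "region", "quarter")

def roll_up(facts, keep):
    """Sum sales over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)

# Summary along one axis.
by_product = roll_up(facts, ["product"])

# Drill down: add a second axis to see more detail.
by_product_quarter = roll_up(facts, ["product", "quarter"])

# Rotate: same data, axes in the opposite order.
by_quarter_product = roll_up(facts, ["quarter", "product"])
```

Adding a fourth or fifth entry to `DIMENSIONS` would not change the helper, which is the sense in which the number of axes is unlimited.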
24
Relational Source Data for an OLAP Cube
25
An OLAP Cube
26
27
Easy to Build
28
Easy Drill down
29
PUBLISHING
30
Easy to Rotate
31
Data Warehouse
• Downloading data
– Data moves closer to the user
– Problems?
• One or two downloads can be managed, but think about many
– Solution
• Data warehouse
– Makes data more useful
32
Data Warehouse
33
Data Warehouse
• A data warehouse is a store of enterprise data (and procedures) that is designed to facilitate management decision making
• A data warehouse includes data, tools, procedures, training, personnel, and other resources that are required or that make decision making easier
• The data comes from many different sources and may output to many different sources
34
Data Warehouse Components
35
Categories of Data Warehouse Requirements
36
Data Warehouse Challenges
• Inconsistent Data
• Tool Integration
• Missing Warehouse Data Management Tools
• Ad Hoc Nature of Requirements
37
Data Mart
• A data mart is a facility akin to a data warehouse but for a much smaller domain
• The goal of the data mart is to provide the functionality of a data warehouse within a limited domain
38
Data Warehouse
• Downloading data
– Data moves closer to the user
– Problems?
• One or two downloads can be managed, but think about many
– Solution
• Data warehouse
– Makes data more useful
39
Review from previous class
• Database architectures
– Teleprocessing
– Client-server
– File sharing
– Distributed
40
Why OLAP not RDBMS?
• An RDBMS is good for
– Online transaction processing
– Static reporting
• An RDBMS handles poorly
– Analytical solutions, because of limitations of SQL and the relational model
– Sequential processing (ratios, percentages, period-to-period comparisons)
– Reporting features such as row breaks, subtotals, numbering, rankings, or moving averages
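To make "sequential processing" concrete, the sketch below computes two of the calculations named above, period-to-period change and a moving average, over an ordered list of values. The sales figures and both helper names are assumptions for illustration only.

```python
# Hypothetical monthly sales figures, in period order.
sales = [100, 110, 121, 133]

def period_change(values):
    """Percent change from each period to the next, rounded to one decimal."""
    return [round((b - a) / a * 100, 1) for a, b in zip(values, values[1:])]

def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

changes = period_change(sales)
averages = moving_average(sales, 2)
```

Both calculations depend on the order of the rows, which is what distinguishes them from set-oriented queries.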
41
Data Downloading, Data Mart, and Data Warehouse
• Data Downloading
– Smallest and easiest alternative
– Downloaded data are provided on a regular and recurring basis
– Fewer timing and domain inconsistencies
• Data Mart
– Serves a particular business function
– Serves the same type of user
• Data Warehouse
– More expensive and more difficult
– Provides data on both a recurring and an ad hoc basis
42
Data Administration
• Data is a critical resource that is expensive for an organization to acquire. As such, careful administrative procedures and controls are required.
• Protect the data and use it effectively
43
Data Administration Challenges
44
Managing Multi-User Databases
• Serving the needs of multiple users and multiple applications adds complexity in…
– design,
– development, and
– migration (future updates)
45
Multi-User Database Issues include…
• Interdependency
– Changes required by one user may impact others
• Concurrency
– People or applications may try to update the same information at the same time
46
Multi-User Database Issues include… (continued)
• Record Retention
– When information should be discarded
• Backup/Recovery
– How to protect yourself from losing critical information
47
Common Multi-User DBMS
• Windows 2000
– Access 2000
– SQL Server
– ORACLE
• UNIX
– ORACLE
– Sybase
– Informix
48
Role of the Database Administrator
• Organizations typically hire a database administrator (DBA) to handle the issues and complexities associated with multi-user databases.
• A DBA facilitates the development and use of one or more databases.
49
Data Administrator versus Database Administrator
• Data Administrator
– Handles the database functions and responsibilities for the entire organization.
• Database Administrator (DBA)
– Handles the functions associated with a specific database, including those applications served by the database.
50
The Characteristics of a DBA
• Technical
– The DBA is responsible for the performance and maintenance of one or more databases.
• Diplomatic
– The DBA must coordinate the efforts, requirements, and sometimes conflicting goals of various user groups to develop community-wide solutions.
51
Technical Skills of the DBA
• Managing the database structure
• Controlling concurrent processing
• Managing processing rights and responsibilities
• Developing database security
• Providing database recovery
• Managing the database management system (DBMS)
• Maintaining the data repository
52
Managing the Database Structure
• Managing the database structure includes configuration control and documentation regarding:
– The allocation of space
– Table creation
– Index creation
– Stored procedures
– Trigger creation
53
Configuration Control
• The database configuration must reflect changes in organizational and user requirements
• Procedures and policies should be included
• Sometimes configuration changes have unanticipated consequences
• DBA must be prepared to debug and repair unforeseen issues.
54
The Need for Documentation
• When altering a database's structure, unanticipated issues are inevitable
• In recording the specific changes, dates, and times, it is easier to determine the root cause of issues and to resolve the issues
• When historical data is restored, it must be reformatted with all the changes in the database structure since the data was originally saved.
55
Documentation
• All structural changes must be carefully documented with the following:
– Reason for the change
– Who made the change
– Specifically what was changed
– How and when the change was implemented
– How the change was tested and what the results were
56
Documentation Aids
• Version Control and Computer Assisted Software Engineering (CASE) tools automate and/or manage many tedious documentation tasks.
• Printing the data dictionaries after structural changes also helps eliminate many tedious documentation tasks
57
Controlling Concurrent Processing
• Measures are taken to ensure that one user’s actions do not adversely impact another user’s actions
• At the core of concurrency is accessibility. In one extreme, data becomes inaccessible once a user touches the data. This ensures that data that is being considered for update is not shown. In the other extreme, data is always readable. The data is even readable when it is locked for update.
58
Aspects of Concurrency Control
• Rollback/Commit: Ensuring all actions are successful before posting to the database
• Multitasking: Simultaneously serving multiple users
• Lost Updates: When one user’s action overwrites another user’s request
59
Rollback/Commit
• A database operation typically involves several transactions. These transactions are atomic and are sometimes called logical units of work (LUW).
• Before an operation is committed to the database, all LUWs must be successfully completed. If one or more LUWs are unsuccessful, a rollback is performed and no changes are saved to the database.
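The all-or-nothing behavior can be sketched with Python's standard `sqlite3` module. The `account` table, its non-negative balance rule, and the `transfer` and `balances` helpers are assumptions made up for this illustration; the point is only that a failure part-way through leads to a rollback that leaves no partial change behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` between accounts as one all-or-nothing unit of work."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        conn.commit()      # both steps succeeded: make them permanent
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # a step failed: undo the steps already applied
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))

first = transfer(conn, "alice", "bob", 30)    # succeeds and commits
second = transfer(conn, "alice", "bob", 500)  # debit violates the CHECK
final = balances(conn)
```

In the second call the credit to `bob` is applied before the debit fails, so without the rollback the database would be left with money created from nothing.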
60
Lost Update Problem
• If two or more users are attempting to update the same piece of data at the same time, it is possible that one update may overwrite another update.
• Resource locking scenarios are designed to address this problem
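A small sketch of the problem and of one locking remedy. The stock quantity and the two helper functions are hypothetical; the first function replays an unlucky interleaving by hand, and the second serializes each read-modify-write with a `threading.Lock`.

```python
import threading

def unsafe_interleaving(stock):
    """Both users read the same value before either one writes."""
    a_read = stock          # user A reads the stock level
    b_read = stock          # user B reads the same stock level
    stock = a_read - 3      # user A writes its result
    stock = b_read - 4      # user B overwrites it: A's update is lost
    return stock

def locked_updates(stock):
    """Each read-modify-write runs to completion while holding a lock."""
    record = {"stock": stock}
    lock = threading.Lock()

    def sell(qty):
        with lock:
            current = record["stock"]
            record["stock"] = current - qty

    threads = [threading.Thread(target=sell, args=(q,)) for q in (3, 4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return record["stock"]

lost = unsafe_interleaving(10)   # only B's subtraction survives
kept = locked_updates(10)        # both subtractions are applied
```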
61
Resource Locking
• A resource lock prevents a user from reading and/or writing to a piece of data
• Locks may be applied to:
– a single data item (value)
– an entire row of a table
– a page (memory segment) (many rows worth)
– an entire table
– an entire database
• This is referred to as the Lock granularity
62
Types of Resource Locks
• Implicit versus Explicit
– Implicit locks are issued automatically by the DBMS based on an activity
– Explicit locks are issued by users requesting exclusive rights to the data
• Exclusive versus Shared
– An exclusive lock prevents others from reading or updating the data
– A shared lock allows others to read, but not update the data
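The shared versus exclusive rule can be sketched as a tiny lock manager for one resource. The `ResourceLock` class and the user names are assumptions for illustration; a real DBMS would queue a refused request rather than simply return `False`.

```python
class ResourceLock:
    """Hypothetical lock state for a single resource."""

    def __init__(self):
        self.mode = None        # None, "shared", or "exclusive"
        self.holders = set()

    def acquire(self, user, mode):
        """Grant the lock if it is compatible with the current holders."""
        if not self.holders:
            self.mode = mode
            self.holders.add(user)
            return True
        if self.mode == "shared" and mode == "shared":
            self.holders.add(user)   # many readers may share the resource
            return True
        return False                 # anything else conflicts

    def release(self, user):
        self.holders.discard(user)
        if not self.holders:
            self.mode = None

lock = ResourceLock()
while_shared = [
    lock.acquire("ann", "shared"),
    lock.acquire("bob", "shared"),
    lock.acquire("cy", "exclusive"),   # refused: readers hold the lock
]
lock.release("ann")
lock.release("bob")
while_exclusive = [
    lock.acquire("cy", "exclusive"),
    lock.acquire("ann", "shared"),     # refused: a writer holds the lock
]
```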
63
Deadlocks
• As a transaction begins to lock resources, it may have to wait for a particular resource to be released by another transaction. On occasion, two transactions may wait indefinitely for one another to release resources. This condition is known as a deadlock or a deadly embrace.
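One common way to detect this condition is to look for a cycle in a "wait-for" graph. The sketch below assumes such a graph is available as a dictionary; the `find_deadlock` name and the transaction labels are hypothetical, and the slides do not say which detection method any particular DBMS uses.

```python
def find_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle.

    `waits_for` maps each transaction to the transactions it is waiting on.
    """
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:
            return False
        if txn in visiting:
            return True              # came back to a transaction in progress
        visiting.add(txn)
        for other in waits_for.get(txn, ()):
            if visit(other):
                return True
        visiting.discard(txn)
        done.add(txn)
        return False

    return any(visit(txn) for txn in waits_for)

# T1 waits on T2 while T2 waits on T1: a deadly embrace.
deadlocked = find_deadlock({"T1": ["T2"], "T2": ["T1"]})

# T1 waits on T2, but T2 waits on nobody and will eventually finish.
not_deadlocked = find_deadlock({"T1": ["T2"], "T2": []})
```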
64
Avoiding Deadlocks
• Strategy 1:
– Wait until all resources are available, then lock them all before beginning
• Strategy 2:
– Establish and use clear locking orders/sequences
• Strategy 3:
– Once a deadlock is detected, the DBMS rolls back one transaction
65
Resource Locking Strategies
• Optimistic Locking
– Read data
– Process transaction
– Issue update
– Look for conflict
– If conflict occurred, rollback and repeat or else commit
• Pessimistic Locking
– Lock required resources
– Read data
– Process transaction
– Issue update
– Release locks
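The optimistic sequence above can be sketched with a version counter on each row, one common way to "look for conflict". The row layout, the `VersionConflict` exception, and the helper names are all assumptions for illustration.

```python
class VersionConflict(Exception):
    """Raised when the row changed after it was read."""

# Hypothetical row with a version counter used to detect conflicts.
row = {"price": 100, "version": 1}

def read(row):
    """Take a snapshot of the row without locking it."""
    return dict(row)

def write(row, snapshot, new_price):
    """Apply the update only if nobody changed the row since `snapshot`."""
    if row["version"] != snapshot["version"]:
        raise VersionConflict
    row["price"] = new_price
    row["version"] += 1

def update_with_retry(row, change, attempts=3):
    """Optimistic loop: read, compute, try to write, repeat on conflict."""
    for _ in range(attempts):
        snapshot = read(row)
        try:
            write(row, snapshot, change(snapshot["price"]))
            return True
        except VersionConflict:
            continue
    return False

# Two users read the same version; the second write detects the conflict.
snap_a, snap_b = read(row), read(row)
write(row, snap_a, 110)
try:
    write(row, snap_b, 90)
    conflict_detected = False
except VersionConflict:
    conflict_detected = True

# Re-reading and repeating the work then succeeds.
retried = update_with_retry(row, lambda price: price - 10)
```

Pessimistic locking would instead take the lock before `read` and hold it until the write completes, so the conflict could never arise but other users would wait.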
66
Database Security
• Database security strives to ensure…
– Only authorized users perform authorized activities at authorized times
67
Managing Processing Rights and Responsibilities
• Processing rights define who is permitted to do what, when
• The individuals performing these activities have full responsibility for the implications of their actions
• Individuals are identified by a username and a password
68
Granting of Processing Rights
• Database users are known both as individuals and as members of one or more roles
• Access and processing rights/privileges may be granted to an individual and/or a role
• Users possess the combined rights granted to them individually and to all the roles of which they are members
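A minimal sketch of how a user's effective rights combine individual grants and role grants. The user names, role names, and the `action:table` right strings are invented for this example.

```python
# Hypothetical grants: rights given directly to users and to roles.
user_grants = {"dana": {"read:orders"}}
role_grants = {
    "clerk": {"read:orders", "insert:orders"},
    "auditor": {"read:ledger"},
}
memberships = {"dana": ["clerk", "auditor"], "eli": ["auditor"]}

def effective_rights(user):
    """Union of the user's own grants and the grants of every role held."""
    rights = set(user_grants.get(user, ()))
    for role in memberships.get(user, ()):
        rights |= role_grants.get(role, set())
    return rights

def is_permitted(user, right):
    """Check one right against the user's combined grants."""
    return right in effective_rights(user)

dana_rights = effective_rights("dana")
eli_can_read_ledger = is_permitted("eli", "read:ledger")
eli_can_insert_orders = is_permitted("eli", "insert:orders")
```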
69
Granting Privileges
70
Providing Database Recovery
• Common causes of database failures…
– Hardware failures
– Programming bugs
– Human errors/mistakes
• Since these issues are impossible to completely avoid, recovery procedures are essential
71
Database Recovery Characteristics
• Continuing business operations (Fall-back procedures/Continuity planning)
• Restore from backup
• Replay database activities since backup was originally made
72
Fall-back Procedures/Continuity Planning
• The business will continue to operate even when the database is inaccessible
• The fall-back procedure defines how the organization will continue operations
• Careful attention must be paid to…
– Saving essential data
– Continuing to provide quality service
73
Restoring from Backup
• In the event that the system must be rebuilt or reloaded, the database is restored from the last full backup.
• Since it is inevitable that activities occurred since the last full backup was made, subsequent activities must be replayed/restored.
74
Recovery via Reprocessing
The database is periodically backed up (a database save), and all transactions applied since the last save are recorded.
If the system crashes, the latest database save is restored and all of the transactions are re-applied (by users) to bring the database back up to the point just before the crash.
Several shortcomings:
» Time required to re-apply transactions
» Re-applying concurrent transactions is not straightforward.
75
Recovery via Rollback/Rollforward
We apply a similar technique: make periodic saves of the database (a time-consuming operation). However, we maintain a more intelligent log of the transactions that have been applied. This transaction log includes before images and after images.
76
Rollforward
Before Image: A copy of the table record (or page) of data before it was changed by the transaction.
After Image: A copy of the table record (or page) of data after it was changed by the transaction.
77
Rollback/Rollforward
Rollback: Undo any partially completed transactions (ones in progress when the crash occurred) by applying the before images to the database.
Rollforward: Redo the transactions by applying the after images to the database. This is done for transactions that were committed before the crash.
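Both directions can be sketched over a small in-memory log. The log layout, the transaction names, and the key/value "database" are assumptions for illustration; real logs record pages or records and much more bookkeeping.

```python
# Hypothetical transaction log; each entry holds a before and an after image.
log = [
    {"txn": "T1", "key": "a", "before": 1, "after": 5},
    {"txn": "T2", "key": "b", "before": 2, "after": 9},
    {"txn": "T1", "key": "c", "before": 3, "after": 7},
]
committed = {"T1"}  # T2 was still in progress when the crash occurred

def rollforward(saved_db, log, committed):
    """Redo committed work: apply after images, oldest first, to a save."""
    db = dict(saved_db)
    for entry in log:
        if entry["txn"] in committed:
            db[entry["key"]] = entry["after"]
    return db

def rollback(crashed_db, log, committed):
    """Undo uncommitted work: apply before images, newest first."""
    db = dict(crashed_db)
    for entry in reversed(log):
        if entry["txn"] not in committed:
            db[entry["key"]] = entry["before"]
    return db

last_save = {"a": 1, "b": 2, "c": 3}
at_crash = {"a": 5, "b": 9, "c": 7}   # includes T2's uncommitted change

recovered_forward = rollforward(last_save, log, committed)
recovered_back = rollback(at_crash, log, committed)
```

Starting from the last save and rolling forward, or starting from the crashed state and rolling back, both arrive at the same database: T1's changes present, T2's change absent.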
78
Database Recovery
The recovery process uses both rollback and rollforward to restore the database.
In the worst case, we would need to roll back to the last database save and then roll forward to the point just before the crash.
Note: Most database management systems provide a mechanism to record activities into a log file.
79
Managing the Database Management System (DBMS)
• In addition to controlling and maintaining the users and the data, the DBA must also maintain and monitor the DBMS itself.
– Performance statistics (performance tuning/optimizing)
– System and data integrity
– Establishing, configuring, and maintaining database features and utilities
80
Maintaining the Data Repository
• The data repository contains metadata. Metadata is data about data.
• The data repository specifies the name, type, size, format, structure, definitions, and relationships among the data. It also contains details about applications, users, add-on products, etc.
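As a small-scale analogue of metadata, most DBMS products keep a catalog that can itself be queried. The sketch below uses SQLite's catalog through Python's standard `sqlite3` module; the `customer` table is invented for the example, and a full data repository would cover far more than table and column definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# sqlite_master lists the objects in the database (data about the tables).
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default_value, pk).
columns = [
    (row[1], row[2])
    for row in conn.execute("PRAGMA table_info(customer)")
]
```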
81
Types of Data Repositories
• Active data repository
– The development and management tools automatically maintain the metadata.
• Passive data repository
– People manually maintain the metadata.