1 Distributed Databases BUAD/American University
Distributed Databases
Definitions
• Distributed Database: A single logical database that is spread physically across computers in multiple locations (possibly global) that are connected by a data communications link.
• Decentralized Database: A collection of independent databases on non-networked computers (possibly global).
Reasons for Distributed Database
• Local business units want control over data.
• Consolidate data across local databases for integrated decision making.
• Reduce telecommunications costs.
• Reduce the risk of telecommunications failures.
Distributed Database Options
• Homogeneous - Same DBMS at each node.
• Heterogeneous - Different DBMSs at different nodes.
• Systems - Supports some or all of the functionality of one logical database.
Homogeneous, Non-Autonomous Database
• Data is distributed across all the nodes.
• Same DBMS at each node.
• All data is managed by the distributed DBMS (no exclusively local data).
• All access is through one global schema.
• The global schema is the union of all the local schemas.
Focus on the Following Heterogeneous Environment
• Data distributed across all the nodes.
• Different DBMSs may be used at each node.
• Local access is done using the local DBMS and schema.
• Remote access is done using the global schema.
Objectives and Trade-offs
• Location Transparency - User does not have to know the location of the data.
• Local Autonomy - Local site can operate with its database when central site is down.
• Synchronous Distributed Database - All copies of the same data are always identical.
• Asynchronous Distributed Database - Some data inconsistency is tolerated.
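The synchronous/asynchronous trade-off above can be illustrated with a minimal in-memory sketch. The class names (`Replica`, `SynchronousStore`, `AsynchronousStore`) and the `propagate` step are hypothetical, chosen only for illustration; a real distributed DBMS would also need commit protocols and failure handling that are omitted here.

```python
class Replica:
    """One copy of the data at one site (hypothetical, in-memory only)."""
    def __init__(self):
        self.data = {}


class SynchronousStore:
    """Every write is applied to all copies before the write returns."""
    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        for replica in self.replicas:
            replica.data[key] = value


class AsynchronousStore:
    """Writes reach the primary at once; the other copies catch up later."""
    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = secondaries
        self.pending = []  # updates not yet sent to the secondaries

    def write(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def propagate(self):
        # Apply queued updates in order, then clear the queue.
        for key, value in self.pending:
            for replica in self.secondaries:
                replica.data[key] = value
        self.pending.clear()


a, b = Replica(), Replica()
SynchronousStore([a, b]).write("qty", 5)   # a and b agree immediately

p, s = Replica(), Replica()
store = AsynchronousStore(p, [s])
store.write("qty", 5)                      # s is stale until propagate()
stale_copy = dict(s.data)
store.propagate()
```

Between `write` and `propagate`, the secondary holds out-of-date data; that window is the inconsistency an asynchronous design tolerates.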
Advantages of Distributed Database
• Increased reliability and availability.
• Local control over data.
• Modular growth.
• Lower communication costs.
• Faster response for certain queries.
Disadvantages of Distributed Database
• Software cost and complexity.
• Processing overhead.
• Data integrity exposure.
• Slower response for certain queries.
Options for Distributing a Database
• Data replication.
• Horizontal partitioning.
• Vertical partitioning.
• Combinations of the above.
Data Replication
• Advantages -
– Reliability.
– Fast response.
– May avoid complicated distributed transaction integrity routines (if replicated data is refreshed at scheduled intervals).
– De-couples nodes (transactions proceed even if some nodes are down).
– Reduced network traffic at prime time (if updates can be delayed).
• Disadvantages -
– Additional requirements for storage space.
– Additional time for update operations.
– Complexity and cost of updating.
– Integrity exposure of getting incorrect data if replicated data is not updated simultaneously.
• Therefore, better when used for non-volatile data.
Types of Data Replication
• Snapshot Replication -
– Changes are periodically sent to a master site which sends an updated snapshot out to the other sites.
• Near Real-Time Replication -
– Broadcast update orders without requiring confirmation.
• Pull Replication -
– Each site controls when it wants updates.
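As a rough illustration of snapshot and pull replication, the sketch below models a master site that collects changes and sites that either receive a pushed snapshot or ask for one when they choose. All names (`MasterSite`, `Site`, `push_snapshot`, the `version` counter) are assumptions made for this example, not part of any particular DBMS.

```python
import copy


class MasterSite:
    """Collects changes from the sites and produces snapshots."""
    def __init__(self):
        self.data = {}
        self.version = 0  # bumped each time a batch of changes arrives

    def receive_changes(self, changes):
        self.data.update(changes)
        self.version += 1

    def snapshot(self):
        # Deep copy so that each receiver gets its own independent data.
        return self.version, copy.deepcopy(self.data)


class Site:
    """A site that holds a possibly out-of-date copy of the master data."""
    def __init__(self):
        self.data = {}
        self.version = 0

    def load_snapshot(self, snap):
        self.version, self.data = snap

    def pull(self, master):
        # Pull replication: the site decides when to refresh.
        if master.version > self.version:
            self.load_snapshot(master.snapshot())


def push_snapshot(master, sites):
    """Snapshot replication: the master sends a fresh copy to each site."""
    for site in sites:
        site.load_snapshot(master.snapshot())


master = MasterSite()
east, west = Site(), Site()

master.receive_changes({"price": 10})
push_snapshot(master, [east])   # east is refreshed by the master
west_before_pull = dict(west.data)
west.pull(master)               # west refreshes on its own schedule
```

Near real-time replication is not shown; it would broadcast each update as it occurs rather than batching changes into snapshots.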
Issues in Data Replication Use
• Data timeliness.
• Useful if DBMS cannot reference data from more than one node.
• Batched updates can cause performance problems.
• Updates are complicated by heterogeneous DBMSs or database designs.
• Telecommunications speeds may limit mass updates.
Horizontal Partitioning
• Different records of a file at different sites.
• Advantages -
– Data stored close to where it is used.
– Local access optimization.
– Security.
• Disadvantages -
– Accessing data across partitions.
– No data replication.
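A minimal sketch of horizontal partitioning follows, assuming a hypothetical customer file split by a `region` field. The function names and sample records are invented for illustration only.

```python
from collections import defaultdict


def partition_horizontally(rows, site_of):
    """Send each whole record to the site chosen by site_of(row)."""
    sites = defaultdict(list)
    for row in rows:
        sites[site_of(row)].append(row)
    return dict(sites)


def query_all_sites(sites, predicate):
    """A query spanning partitions must visit every site."""
    return [row for rows in sites.values() for row in rows if predicate(row)]


# Hypothetical sample data.
customers = [
    {"id": 1, "name": "Ada", "region": "east", "balance": 120},
    {"id": 2, "name": "Ben", "region": "west", "balance": 40},
    {"id": 3, "name": "Cy", "region": "east", "balance": 75},
]

sites = partition_horizontally(customers, lambda row: row["region"])
east_rows = sites["east"]                                      # local access
large = query_all_sites(sites, lambda row: row["balance"] > 50)  # cross-site
```

Reading `sites["east"]` touches only one site, while `query_all_sites` has to contact all of them, which is the cross-partition cost listed among the disadvantages.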
Vertical Partitioning
• Different columns of a file at different sites.
• Advantages and disadvantages are the same as for horizontal partitioning except that combining data across partitions is more difficult because it requires joins.
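To show why vertical partitions need a join to be recombined, here is a small sketch. It assumes each partition keeps a copy of the key column (`id` in this example); the function names, site names, and sample records are hypothetical.

```python
def partition_vertically(rows, key, column_groups):
    """Give each site the key column plus its own subset of columns."""
    return {
        site: [{col: row[col] for col in (key, *cols)} for row in rows]
        for site, cols in column_groups.items()
    }


def join_on_key(left, right, key):
    """Rebuild full records by matching rows from two sites on the key."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


# Hypothetical sample data.
employees = [
    {"id": 1, "name": "Ada", "salary": 100},
    {"id": 2, "name": "Ben", "salary": 90},
]

parts = partition_vertically(
    employees, "id", {"branch": ["name"], "payroll": ["salary"]}
)
rebuilt = join_on_key(parts["branch"], parts["payroll"], "id")
```

Each site sees only its own columns, so any request for a complete record has to fetch from both sites and match on the key.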
Five Distributed Database Organizations
• Centralized database, distributed access.
• Replication with periodic snapshot update.
• Replication with near real-time synchronization of updates.
• Partitioned, one logical database.
• Partitioned, independent, non-integrated segments.
Factors in Choice of Distributed Strategy
• Funding, autonomy, security.
• Site data referencing patterns.
• Growth and expansion needs.
• Technological capabilities.
• Costs of managing complex technologies.
• Need for reliable service.
Requirements for a Distributed DBMS
• Ability to locate data with a distributed data dictionary.
• Determine the location from which to retrieve data and the location at which to process each part of a distributed query.
• Heterogeneous DBMS translation.
• Security, concurrency, query optimization, failure recovery.
• Consistency of replicated data.
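The first two requirements, locating data through a distributed data dictionary and choosing where to retrieve it, can be sketched as follows. The `DataDictionary` class, the `choose_site` rule (prefer the local copy, otherwise take the first reachable site), and the fragment and site names are all assumptions for illustration; a real optimizer would weigh communication and processing costs.

```python
class DataDictionary:
    """Maps each data fragment to the sites that hold a copy of it."""
    def __init__(self):
        self._locations = {}

    def register(self, fragment, site):
        self._locations.setdefault(fragment, []).append(site)

    def locate(self, fragment):
        return list(self._locations.get(fragment, []))


def choose_site(dictionary, fragment, local_site, reachable_sites):
    """Pick a site to read from: the local copy if any, else a remote one."""
    candidates = [
        site for site in dictionary.locate(fragment) if site in reachable_sites
    ]
    if not candidates:
        raise LookupError(f"no reachable copy of {fragment!r}")
    return local_site if local_site in candidates else candidates[0]


# Hypothetical fragments and sites.
dd = DataDictionary()
dd.register("customers_east", "boston")
dd.register("customers_east", "chicago")
dd.register("customers_west", "seattle")

local_choice = choose_site(
    dd, "customers_east", "chicago", {"boston", "chicago", "seattle"}
)
remote_choice = choose_site(
    dd, "customers_west", "chicago", {"boston", "chicago", "seattle"}
)
```

Because the caller names only the fragment and never a site, the lookup also gives a simple form of location transparency.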