Md Rezaul Huda Chowdhury Reza. Concurrency Control Modify concurrency control schemes for use in...
Md Rezaul Huda Chowdhury Reza
Modify concurrency control schemes for use in a distributed environment.
We assume that each site participates in the execution of a commit protocol to ensure global transaction atomicity.
We assume all replicas of any item are updated.
The system maintains a single lock manager that resides in a single chosen site, say Si.
When a transaction needs to lock a data item, it sends a lock request to Si, and the lock manager determines whether the lock can be granted immediately.
  If yes, the lock manager sends a message to the site which initiated the request.
  If no, the request is delayed until it can be granted, at which time a message is sent to the initiating site.
The transaction can read the data item from any one of the sites at which a replica of the data item resides.
Writes must be performed on all replicas of a data item.
Advantages of the scheme:
  Simple implementation
  Simple deadlock handling
Disadvantages of the scheme:
  Bottleneck: the lock manager site becomes a bottleneck
  Vulnerability: the system is vulnerable to lock manager site failure
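The request/grant/delay behaviour of the single lock manager can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name SingleLockManager, its methods, and the outbox list that stands in for network messages are all hypothetical, and only exclusive locks are modelled.

```python
from collections import deque


class SingleLockManager:
    """Hypothetical lock manager living at the one chosen site Si."""

    def __init__(self):
        self.holder = {}    # data item -> transaction holding the exclusive lock
        self.waiting = {}   # data item -> queue of delayed (transaction, site) requests
        self.outbox = []    # messages "sent" to initiating sites, recorded here

    def request(self, txn, site, item):
        """Handle a lock request; return True if it is granted immediately."""
        if item not in self.holder:
            self.holder[item] = txn
            self.outbox.append((site, "granted", txn, item))
            return True
        # The lock is held: delay the request until it can be granted.
        self.waiting.setdefault(item, deque()).append((txn, site))
        return False

    def release(self, txn, item):
        """Release a lock and grant it to the oldest delayed request, if any."""
        if self.holder.get(item) != txn:
            raise ValueError(f"{txn} does not hold the lock on {item}")
        del self.holder[item]
        queue = self.waiting.get(item)
        if queue:
            next_txn, next_site = queue.popleft()
            self.holder[item] = next_txn
            self.outbox.append((next_site, "granted", next_txn, item))


manager = SingleLockManager()
first = manager.request("T1", "S1", "X")    # granted at once
second = manager.request("T2", "S2", "X")   # delayed behind T1
manager.release("T1", "X")                  # T2's site now receives its grant message
```

Every request passes through this one object, which is exactly why the scheme is simple and also why the site hosting it becomes a bottleneck and a single point of failure.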
Distributed Lock Manager
In this approach, the functionality of locking is implemented by lock managers at each site.
  Lock managers control access to local data items
  But special protocols may be used for replicas
Advantage: work is distributed and can be made robust to failures
Disadvantage: deadlock detection is more complicated
  Lock managers cooperate for deadlock detection
Several variants of this approach:
  Primary copy
  Majority protocol
  Biased protocol
  Quorum consensus
Consider the following two transactions and history, with item X and transaction T1 at site 1, and item Y and transaction T2 at site 2:
T1: write (X)
    write (Y)
T2: write (Y)
    write (X)

Site 1 (T1)                 Site 2 (T2)
X-lock on X                 X-lock on Y
write (X)                   write (Y)
wait for X-lock on Y        wait for X-lock on X
Result: deadlock which cannot be detected locally at either site
A global wait-for graph is constructed and maintained in a single site: the deadlock-detection coordinator.
  Real graph: the real, but unknown, state of the system.
  Constructed graph: an approximation generated by the controller during the execution of its algorithm.
The global wait-for graph can be constructed when:
  a new edge is inserted in or removed from one of the local wait-for graphs.
  a number of changes have occurred in a local wait-for graph.
  the coordinator needs to invoke cycle detection.
If the coordinator finds a cycle, it selects a victim and notifies all sites. The sites roll back the victim transaction.
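A minimal sketch of what the coordinator does, applied to the two-site history of T1 and T2 above: merge the local wait-for graphs and look for a cycle. The function names, the dictionary representation of a graph, and the victim rule are assumptions made for illustration; the slides do not say how the victim is selected.

```python
def build_global_graph(local_graphs):
    """Union the local wait-for graphs reported by each site."""
    merged = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def find_cycle(graph):
    """Return a list of transactions forming a cycle, or None (depth-first search)."""
    visiting, done = [], set()

    def visit(txn):
        if txn in visiting:
            return visiting[visiting.index(txn):]
        if txn in done:
            return None
        visiting.append(txn)
        for holder in sorted(graph.get(txn, ())):
            cycle = visit(holder)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in sorted(graph):
        cycle = visit(txn)
        if cycle:
            return cycle
    return None


# An edge A -> B means "A waits for a lock held by B".
site1 = {"T2": {"T1"}}   # at site 1, T2 waits for T1's lock on X
site2 = {"T1": {"T2"}}   # at site 2, T1 waits for T2's lock on Y

local_cycles = [find_cycle(site1), find_cycle(site2)]   # neither site sees a cycle
cycle = find_cycle(build_global_graph([site1, site2]))  # the merged graph has one
victim = max(cycle) if cycle else None  # hypothetical rule: pick the highest name
```

Neither local graph contains a cycle on its own; the deadlock only appears once the coordinator has combined them.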
Availability
High availability: the time for which the system is not fully usable should be extremely low (e.g. 99.99% availability)
Failures are more likely in large distributed systems
To be robust, a distributed system must
  Detect failures
  Reconfigure the system so computation may continue
  Recover and reintegrate when a site or link is repaired
Failure detection: distinguishing link failure from site failure is hard
  (Partial) solution: have multiple links; multiple link failure is likely a site failure
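To make the 99.99% figure concrete, the short calculation below converts an availability percentage into allowed downtime per year. It assumes a 365-day year, and the function name is made up for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def downtime_minutes_per_year(availability_percent):
    """Minutes per year the system may be less than fully usable."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


four_nines = downtime_minutes_per_year(99.99)  # roughly 52.56 minutes
```

Under that assumption, 99.99% availability leaves a budget of under an hour of downtime per year.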
Reconfiguration
Abort all transactions that were active at a failed site
  Making them wait could interfere with other transactions since they may hold locks on other sites
  However, in case only some replicas of a data item failed, it may be possible to continue transactions that had accessed data at a failed site (more on this later)
If replicated data items were at the failed site, update the system catalog to remove them from the list of replicas
  This should be reversed when the failed site recovers, but additional care needs to be taken to bring values up to date
If a failed site was a central server for some subsystem, an election must be held to determine the new server
  E.g. name server, concurrency coordinator, global deadlock detector
Reconfiguration (Cont.)
Since network partition may not be distinguishable from site failure, the following situations must be avoided
  Two or more central servers elected in distinct partitions
  More than one partition updates a replicated data item
Updates must be able to continue even if some sites are down
  Solution: majority based approach
    Alternative of “read one write all available” is tantalizing but causes problems
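The core of the majority based approach can be shown with a small check: a partition may update a replicated item only if it can reach a strict majority of that item's replicas. This sketch covers only that counting rule, not the full protocol; the function name and site names are hypothetical.

```python
def may_update(reachable_sites, replica_sites):
    """Allow an update only if a strict majority of the replicas is reachable."""
    replicas = set(replica_sites)
    reachable = set(reachable_sites) & replicas
    return len(reachable) > len(replicas) / 2


replicas = {"S1", "S2", "S3", "S4", "S5"}
partition_a = {"S1", "S2", "S3"}   # 3 of 5 replicas: may update
partition_b = {"S4", "S5"}         # 2 of 5 replicas: must not update

results = (may_update(partition_a, replicas), may_update(partition_b, replicas))
```

Because two disjoint partitions cannot both hold more than half of the replicas, at most one of them can continue updating the item.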
Site Reintegration
When a failed site recovers, it must catch up with all updates that it missed while it was down
  Problem: updates may be happening to items whose replica is stored at the site while the site is recovering
  Solution 1: halt all updates on the system while reintegrating a site
    Unacceptable disruption
  Solution 2: lock all replicas of all data items at the site, update to latest version, then release locks
    Other solutions with better concurrency also available
Heterogeneous Distributed Databases
Many database applications require data from a variety of preexisting databases located in a heterogeneous collection of hardware and software platforms
Data models may differ (hierarchical, relational, etc.)
Transaction commit protocols may be incompatible
Concurrency control may be based on different techniques (locking, timestamping, etc.)
System-level details almost certainly are totally incompatible.
A multidatabase system is a software layer on top of existing database systems, which is designed to manipulate information in heterogeneous databases
  Creates an illusion of logical database integration without any physical database integration
Preservation of investment in existing
  hardware
  system software
  applications
Local autonomy and administrative control
Allows use of special-purpose DBMSs
Step towards a unified homogeneous DBMS
  Full integration into a homogeneous DBMS faces
    Technical difficulties and cost of conversion
    Organizational/political difficulties
      Organizations do not want to give up control of their data
      Local databases wish to retain a great deal of autonomy
Unicast, Broadcast versus Multicast
Unicast
  One-to-one
  Destination – unique receiver host address
Broadcast
  One-to-all
  Destination – address of network
Multicast
  One-to-many
  Multicast group must be identified
  Destination – address of group
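As a rough illustration of how the three destination types differ at the address level, the sketch below classifies an IPv4 destination using Python's standard ipaddress module. It assumes IPv4, treats "address of network" as the directed broadcast address of the given subnet, and ignores the limited broadcast address 255.255.255.255; the function name is invented for this example.

```python
import ipaddress


def classify_destination(address, network):
    """Label a destination as multicast, broadcast, or unicast (IPv4 assumed)."""
    addr = ipaddress.ip_address(address)
    net = ipaddress.ip_network(network)
    if addr.is_multicast:                # group address, 224.0.0.0/4 for IPv4
        return "multicast"
    if addr == net.broadcast_address:    # all hosts on this network
        return "broadcast"
    return "unicast"                     # one unique receiver host


labels = [
    classify_destination("192.168.1.7", "192.168.1.0/24"),
    classify_destination("192.168.1.255", "192.168.1.0/24"),
    classify_destination("224.0.1.1", "192.168.1.0/24"),
]
```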
Multicast application examples
Financial services
  Delivery of news, stock quotes, financial indices, etc.
Remote conferencing/e-learning
  Streaming audio and video to many participants (clients, students)
  Interactive communication between participants
Data distribution
  e.g., distribute experimental data from the Large Hadron Collider (LHC) at the CERN lab to interested physicists around the world
Highly efficient bandwidth usage
Key architectural decision: add support for multicast in the IP layer
[Figure: routers with multicast support connecting Berkeley, Gatech, Stanford, and CMU]
So what is the big issue …
More than 20 years since the proposal, but no wide-area IP multicast deployment
Scalability (with number of groups)
-- Routers maintain per-group state
IP Multicast: best-effort multi-point delivery service
-- Providing higher-level features such as reliability, congestion control, flow control, and security has proven more difficult than in the unicast case
Can we achieve efficient multi-point delivery without IP-layer support?
[Figure: Overlay Tree among the end systems Stan1, Stan2, Berk1, Berk2, Gatech, and CMU (sites: Stanford, Berkeley, Gatech, CMU)]
Pros and Cons
Scalability
  Routers do not maintain per-group state
  End systems do, but they participate in very few groups
Potentially simplify support for higher level functionality
  Leverage computation and storage of end systems
  Leverage solutions for unicast congestion, error and flow control
Efficiency concerns
  Redundant traffic on physical links
  Increase in latency due to end systems
Multicasting Algorithms
Requirements of Multipoint Routing Algorithms
Support reliable transmission
  Link failure should not increase delay or reduce resource availability.
Return optimal routes, taking into consideration
  price to be paid (bandwidth consumed)
  end-to-end delay (number of links traversed)
Minimize network load.
Avoid loops.
Avoid traffic concentration on a few links or subnets.
Minimize the state stored in routers.
Multipoint Routing Algorithms
Performance Metrics
The quality of a tree is judged according to the following three dimensions
Low delay:
  End-to-end delay between source and receiver relative to the shortest unicast path delay.
Low cost:
  Cost of total bandwidth consumption
  Cost of tree state information
Light traffic concentration:
  Maximum number of flows on a unidirectional link.
  How evenly the routes are distributed.
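One possible way to turn these three dimensions into numbers is sketched below. The representation (a tree as a list of directed links, link costs in a dictionary) and all function names are assumptions for illustration; only the bandwidth part of the cost is modelled, not the cost of tree state.

```python
from collections import Counter


def relative_delay(tree_delay, unicast_delay):
    """Delay along the tree divided by the shortest unicast path delay."""
    return tree_delay / unicast_delay


def tree_cost(tree_links, link_cost):
    """Total bandwidth cost: sum of the costs of the links used by the tree."""
    return sum(link_cost[link] for link in tree_links)


def max_link_flows(trees):
    """Traffic concentration: largest number of trees sharing one directed link."""
    flows = Counter(link for tree in trees for link in tree)
    return max(flows.values())


link_cost = {("S", "A"): 1, ("A", "R1"): 2, ("A", "R2"): 3}
tree_1 = [("S", "A"), ("A", "R1"), ("A", "R2")]
tree_2 = [("S", "A"), ("A", "R1")]

delay_ratio = relative_delay(tree_delay=4, unicast_delay=2)   # 2.0
cost = tree_cost(tree_1, link_cost)                           # 1 + 2 + 3 = 6
concentration = max_link_flows([tree_1, tree_2])              # 2 flows on a shared link
```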
Routing Algorithms
All multi-point services use some kind of a distribution tree.
Multicast trees can be
  Shared across sources (shared trees).
    Only one tree needs to be established for each group, which is shared by all the sources within that group.
  Source specific (shortest path trees).
    A shortest path tree rooted at each sending node needs to be established.
SOURCE BASED MULTIPOINT ROUTING
The Technique.
A Source Rooted Shortest Path Tree (SRSPT) algorithm:
  Computes the shortest paths between the source and each of the receivers within the group.
  Eliminates duplicate data copies on common links.
  Maintains one SRSPT per sender.
Concept: all receiving nodes compute the path towards the source independently.
Used by: current-day IP multicast protocols, as applications are still
  small scale.
  local area.
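A centralized sketch of a source rooted shortest path tree, using Dijkstra's algorithm and then keeping only the links that lead to a receiver. Real protocols compute this in a distributed way from unicast routing tables, so this is an illustration of the resulting tree rather than of any protocol; the graph, node names, and function name are hypothetical, and every receiver is assumed to be reachable.

```python
import heapq


def shortest_path_tree(graph, source, receivers):
    """Return (links, dist): tree links from source to receivers, and delays."""
    dist = {source: 0}
    parent = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    # Walk back from each receiver to the source. Using a set means a link
    # shared by several receivers appears once, i.e. carries a single copy.
    links = set()
    for receiver in receivers:
        node = receiver
        while node != source:
            links.add((parent[node], node))
            node = parent[node]
    return links, dist


graph = {
    "S": {"A": 1, "B": 4},
    "A": {"S": 1, "B": 1, "R1": 2},
    "B": {"S": 4, "A": 1, "R2": 1},
    "R1": {"A": 2},
    "R2": {"B": 1},
}
links, dist = shortest_path_tree(graph, "S", ["R1", "R2"])
```

In this example both receivers share the link from S to A, so the source sends one copy on it instead of two.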
SOURCE BASED MULTIPOINT ROUTING
Merits vs Demerits
Advantages
  SRSPTs are easy to compute. Use the classic unicast routing tables.
  Efficient distributed implementations are possible. The entire global topology is not required.
  There can be no loops in the path returned.
Disadvantages
  Does not minimize the total cost of distribution.
  Does not scale well. One piece of state information per source and per group is kept in each router.
  May fail badly if the underlying unicast routing is asymmetric.
SHARED TREE APPROACH OF MULTIPOINT ROUTING
Characteristics of Steiner Tree based algorithms.
The Minimum Steiner Tree: the minimal cost subgraph spanning a given subset of nodes in a graph.
The Steiner Tree problem is NP-complete.
  No polynomial-time algorithm is known for finding the minimum Steiner tree in a graph; known exact methods have exponential cost in the worst case.
The tree designed is undirected.
  The solution is feasible only for symmetric links.
Monolithic algorithm.
  It has to be run each time group membership changes.
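To see where the exponential cost comes from, the sketch below finds the minimum Steiner tree cost by brute force: it tries every subset of the non-member nodes and takes the cheapest minimum spanning tree over members plus that subset. This is usable only on tiny graphs and is not how practical heuristics work; it assumes non-negative link costs, and the function names and example graph are hypothetical.

```python
from itertools import combinations


def mst_cost(nodes, edges):
    """Kruskal's algorithm on the subgraph induced by nodes; None if disconnected."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    total, used = 0, 0
    for cost, u, v in sorted(edges):
        if u in parent and v in parent and find(u) != find(v):
            parent[find(u)] = find(v)
            total += cost
            used += 1
    return total if used == len(nodes) - 1 else None


def minimum_steiner_cost(all_nodes, edges, members):
    """Try every subset of non-member nodes: exponential in their number."""
    optional = sorted(set(all_nodes) - set(members))
    best = None
    for size in range(len(optional) + 1):
        for extra in combinations(optional, size):
            cost = mst_cost(set(members) | set(extra), edges)
            if cost is not None and (best is None or cost < best):
                best = cost
    return best


# Three group members A, B, C and one extra node H; edges are (cost, u, v).
edges = [(2, "A", "B"), (2, "B", "C"), (2, "A", "C"),
         (1, "A", "H"), (1, "B", "H"), (1, "C", "H")]
members_only = mst_cost({"A", "B", "C"}, edges)                            # 4
steiner = minimum_steiner_cost(["A", "B", "C", "H"], edges, ["A", "B", "C"])  # 3
```

Routing through the non-member node H lowers the cost from 4 to 3, and finding such nodes in general is what makes the problem hard.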
SHARED TREE APPROACH OF MULTIPOINT ROUTING
Characteristics of Steiner Tree based algorithms.
The Steiner Minimal Tree (SMT) defines an absolute limit on the minimum tree cost, serving as a reference for gauging the cost-optimality of heuristic alternatives.
The SMT for all members of a multicast group is the same irrespective of the role of sender or receiver.
  Only one state entry needs to be maintained per group.
  It scales well for larger groups.
The SMT may have unbounded delay.
  The worst-case maximum end-to-end path length of an SMT can be the longest acyclic path within the graph.
SHARED TREE APPROACH OF MULTIPOINT ROUTING
Characteristics of Core Based Tree algorithms.
Concept:
  Use the shortest path tree rooted at a node in the center of the network.
Steps:
  Choose an optimal center for the group. Multiple cores can be used for better fault tolerance and delay characteristics.
  Group members send a join message to the center.
  Intermediate nodes mark the interface from which the multicast info is received and forward it to the center.
Choose the center to:
  Minimize the max/avg delay for all members on the tree.
  Minimize the sum of tree-link costs.
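The sketch below illustrates one of the center-selection criteria only: picking the node whose worst-case delay to any group member is smallest. It assumes the full topology is known at one place and symmetric link delays, and it does not address the tree-link-cost criterion or multiple cores; function names and the example network are hypothetical.

```python
import heapq


def delays_from(graph, start):
    """Shortest-path delay from start to every reachable node (Dijkstra)."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for neighbor, weight in graph[node].items():
            if d + weight < dist.get(neighbor, float("inf")):
                dist[neighbor] = d + weight
                heapq.heappush(heap, (d + weight, neighbor))
    return dist


def choose_center(graph, members):
    """Pick the node that minimizes the maximum delay to any group member."""
    def worst_delay(node):
        dist = delays_from(graph, node)
        return max(dist[m] for m in members)
    return min(sorted(graph), key=worst_delay)


# A chain A - B - C - D - E with unit delays; group members sit at both ends.
chain = {
    "A": {"B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1, "E": 1},
    "E": {"D": 1},
}
center = choose_center(chain, ["A", "E"])
```

Here the middle node C is chosen, since every member is at most two hops away from it.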
SHARED TREE APPROACH OF MULTIPOINT ROUTING
Advantages of Core Based Tree algorithms
Work well with multiple senders/receivers
  State information is stored per group, therefore scalable.
Receiver-based approach.
  Supports dynamic group membership with relative ease.
Suitable for sparsely distributed receivers.
  SPTs will not have many common links.
Do not have the unbounded delay problems of SMTs.
Simple to implement
  Used as the basis of PIM and of the CBT interdomain routing protocol.
SHARED TREE APPROACH OF MULTIPOINT ROUTING
Disadvantages of Core Based Tree algorithms
Incur extra delay as compared to the reverse path forwarding (RPF) approach.
Suffer from traffic concentration on links converging towards the center.
Choosing the optimal center is an NP-complete problem.
Locating the center requires complete knowledge of the network topology.
MULTIPOINT ROUTING
Tradeoffs between algorithms
No single tree can achieve both minimal cost and minimal delay.
  Shortest path trees minimize delay at the expense of cost.
  Steiner minimal trees minimize cost at the expense of delay.
Between these extremes lies a spectrum of tree types offering different tradeoffs.
Different strategies for placing the routes result in different degrees of traffic concentration.