HINDUSTAN COLLEGE OF SCIENCE & … of Computer Science & Engineering SUBJECT: DISTRIBUTED SYSTEM...

Distributed System [ECS-701] Solution Of PUT Paper_2012

of 34

HINDUSTAN COLLEGE OF SCIENCE & TECHNOLOGYDepartment of Computer Science & Engineering

SUBJECT: DISTRIBUTED SYSTEM (ECS-701)SOLUTION OF PAPER-(PUT)

SESSION: (2012-13)Time: 3hrs M.M.100

Note-Read each Question carefully.

SECTION –A

Q1. Attempt any two parts of the following - (2*10=20)

a) What are Distributed Systems? What are significant advantage & Limitations of distributedSystem? Explain with Example, what could be the impact of absence of global clock andshared memory?

ANS:- Distributed Systems- A distributed system is one in which components located atnetworked computers communicate and coordinate their actions only by passing messages.

OR

It is a no of autonomous processing elements(not necessarily homogenious) that areinterconnected by a Computer network & they cooperate in processing there assigned tasktranspareantly.

Advantages of Distributed System:-

Data sharing: allow many users to access to a common data base Resource Sharing: expensive peripherals like color printers Communication: enhance human-to-human communication, e.g., email, chat Flexibility: spread the workload over the available machines


of 34

Limitations of Distributed System:-

a. A distributed system is a set of computers that communicate over a network, and do notshare a common memory or a common clock.

b. Absence of a common (global) clocki. No concept of global time

ii. It’s difficult to reason about the temporal ordering of events1. Cooperation between processes(e.g., producer/consumer, client/server)2. Arrival of requests to the OS (e.g., for resources)3. Collecting up-to-date global state

c. It’s difficult to design and debug algorithms in a distributed system

Mutual exclusion Synchronization Deadlock

Impact: It is impossible to have a coherent global state.(explain with diagram)


of 34

b) Explain the Haung’s termination detection Algorithm in distributed system? Explain withSuitable example?

ANS: - Haung’s termination detection-

ExampleThe picture shows a process P0, designated the controlling agent, with W0 = 1. It asksP1 and P2 to do some computation. It sets W1 to 0.2, W2 to 0.3 and W3 to 0.5. P2 in turn asks P3and P4 to do some computations. It sets W3 to 0.1 and W4 to 0.1.When P3 terminates, it sends C(W3) = C(0.1) to P2, which changes W2 to 0.1 + 0.1 = 0.2.


of 34

c) What do you mean by casual ordering of Messages? Give any algorithm which can order themessages according to casual dependencies?

ANS: - Casual ordering of Messages-

Birman-Schiper-Stephenson Protocol (BSS)-Introduction-The goal of this protocol is to preserve ordering in the sending of messages. For example, ifsend(m1) -> send(m2), then for all processes that receive both m1 and m2, receive(m1) ->receive(m2). The basic idea is that m2 is not given to the process until m1 is given. Thismeans a buffer is needed for pending deliveries. Also, each message has an associated vectorthat contains information for the recipient to determine if another message preceded it. Also,we shall assume all messages are broadcast. Clocks are updated only when messages are sent.Notation

n processes Pi process


of 34

Ci vector clock associated with process Pi; jth element is Ci[j] and contains Pi's latest valuefor the current time in process Pj.

tm vector timestamp for message m (stamped after local clock is incremented).Protocol

Pi sends a message to PjPi increments Ci[i] and sets the timestamp tm = Ci[i] for message m. 1.Pj receives a message from PiWhen Pj, j != i, receives m with timestamp tm, it delays the message's deliveryuntil both:Cj[i] = tm[i] - 1; and a.for all k <= n and k != i, Cj[k] <= tm[k]. b.1.When the message is delivered to Pj, update Pj's vector clock 2.Check buffered messages tosee if any can be delivered.


of 34

SECTION –B

Q2. Attempt any two parts of the following- (2*10=20)

a) Explain the Chandy-Mishra-Hass Edge Chasing Algorithm for Distributed DeadlockDetection with given Example?

ANS: - Chandy-Mishra-Hass Edge Chasing Algorithm-It is a fully distributed deadlock detection algorithm is given by Chandy, Misra, and Hass (1983).

This is considered an edge-chasing, probe-based algorithm.

It is also considered one of the best deadlock detection algorithms for distributed systems.

If a process makes a request for a resource which fails or times out,


of 34

the process generates a probe message and sends it to each of the processes holding one or more of itsrequested resources.

Each probe message contains the following information:

I. the id of the process that is blocked (the one that initiates the probe message);II. the id of the process is sending this particular version of the probe message; and

III. the id of the process that should receive this probe message.

When a process receives a probe message,

it checks to see if it is also waiting for resources.

If not, it is currently using the needed resource

and will eventually finish and release the resource.

If it is waiting for resources, it passes on the probe message to all processes it knows to be holdingresources it has itself requested.

The process first modifies the probe message, changing the sender and receiver ids.

If a process receives a probe message that it recognizes as having initiated,

it knows there is a cycle in the system

and thus, deadlock.

The following example is based on the same data used in the Silberschatz-Galvin algorithm example.

In this case P1 initiates the probe message, so that all the messages shown have P1 as the initiator.

When the probe message is received by process P3, it modifies it and sends it to two more processes.

Eventually, the probe message returns to process P1. Deadlock!


of 34

The advantages of this algorithm include the following: It is easy to implement. Each probe message is of fixed length. There is very little computation. There is very little overhead. There is no need to construct a graph, nor to pass graph information to other sites. This algorithm does not find false (phantom) deadlock. There is no need for special data structures.

Solution:-


of 34


of 34

b) What is Token Based Algorithms? Give the Suzuki and Kasami’s Algorithm with SuitableExamples?

ANS:- Token Based Algorithms-

one token, shared among all sites site can enter its CS iff it holds token The major difference is the way the token is searched use sequence numbers instead of timestamps

o used to distinguish requests from same siteo kept independently for each site

The proof of mutual exclusion is trivial The proof of other issues (deadlock and starvation) may be less so

Suzuki and Kasami’s Algorithm-


of 34


of 34

c) Answers any two part of the FollowingsI. Performance metrics for distributed mutual exclusion algorithm?

ANS- Performance metrics for distributed mutual exclusion algorithm-


of 34


of 34

II. Lamport’s Algorithm for Mutual Exclusion?ANS- Lamport’s Algorithm for Mutual Exclusion-


of 34


of 34

III. Explain the Requirements of ME Algorithms?

ANS:- Requirements of ME Algorithms(Explain Each)


of 34

SECTION –C

Q3. Attempt any two parts of the following- (2*10=20)

a) Explain the Classification of Agreement Problem? What is the Relationship among theAgreement Problems?

ANS:- Classification of Agreement Problem-


of 34


of 34

b) What is DFS? Explain Mechanism & Design Issues for Building Distributed File System?

ANS: - Distributed File System-

Introduction

A resource management component in a distributed operating system. A common file system that can be shared by all computers in the system. Usually use some dedicate file servers

Goals:

Network transparency: users should not be aware of the location of files. High availability: system failures and regular shared activities (backups) should

not result in unavailability of files.

Architecture

File servers and file clients interconnected by a communication network. Two most important components: name server and cache manager. name server: map names to stored objects (files, directories) Cache manager: perform file caching. Can present on both servers and clients.

o Cache on the client deals with network latencyo Cache on the server deals with disk latency


of 34

Typical steps to access data

Check client cache, if present, return data Check local disk, if present, load into local cache, return data Send request to file server


of 34

... server checks cache, if present, load into client cache, return data disk read load into server cache load into client cache return data

Mechanism & Design Issues for Building Distributed File System-

Mechanisms:

Mounting: binding together different filename spaces to form a singlehierarchically structured name space.

o The mount table maintains the mapping from mount points to storagedevices.

o Can be used in distributed file systems to do name resolution. Where tomountain the mount information? at the clients: (NFS) different clients may see different files at the server: all clients may see the same filename space. Good

when files move around the servers. Caching: a common technique to exploit locality and reduce delays. Hints: caching without cache coherence. Cache coherence operations too

expensive in the distributed systems (too many communications). Hints work wellfor applications that can recover from invalid data (e.g. address mapping)

Bulk Data Transfer: communication cost = S + C*B, S: startup cost, mostlysoftware, C: per byte cost, B: number of bytes to send.

Encryption: typical method to enforce security.

Design Issues

Naming and name resolutiono A name is associated with an object.o Name resolution: mapping a name to an objecto Name space: a collection of names. Three different methods to name files:

concatenate the host name to the name of the file on that host simple unique file name not network transparency Moving a file from one machine to another may require

modifications to the application. name resolution is simple -- don't need a name server

mount remote directories onto local directory Once mounted, accessing the file becomes location

transparent Name resolution is simple.

have a single global directory: all files in the system belong to asingle name space


of 34

not scalable, how to maintain a system wide unique filenameso Name server

name resolution: just maintaining a table that maps names toobjects

Centralized server or distributed server (applications query localname server, local name servers query remote name server -- adistributed database).

Caches on Disk or Main Memoryo Advantage for caching on main memory

Can be used in diskless workstation accessing a cache in memory is much faster Same techniques for server cache and client cache.

o Disadvantage compete with virtual memory Virtual memory system more complex: data cannot be in both vm

and in cache. Large files cannot be cached completely: need to be able to chop

the file, server software more complex.o Advantage for caching on disk

Large file can be cached. virtual memory is simple

Writing policyo Write through -- reliable, but not cache for writeo delayed write: delay the writes to the server

access locality: some date can be deleted right away -- 20-30%data are erased within 30 seconds.

may lose data when client crashes.o write on close

not much difference from delay write for short opens. for long open: less write but more susceptible to losing data

Cache consistencyo server-initiated: the server inform cache managers whenever the data in

the client caches become staleo client-initiated: the client make sure the cache is clean before returning

the data to the applicationo Both server-initiated and client-initiated cache consistency schemes are

expensive.o Not allow caching when concurrent-write sharing occurs.o sequential-write sharing may also cause problems: a client opens a files

that has recently modified and closed by another client Outdate blocks in the cache: timestamp the cache copy.

Contacting the server can find the inconsistency. Update data in another client's cache not in the server (delay

write): flushing writes when other clients open.


of 34

Availabilityo replication is the main mechanismo how to keep replicas consistento how to detect inconsistencies among replicaso consistency problem may decrease the availabilityo replica management: voting mechanism to read and write to replica

Scalabilityo How to meet the demand of a growing system?o The biggest hurdle: consistency issue.

Caching: reduce network load and server load server-initiated cache invalidation to maintain cache

consistency: complexity increases as the system sizeincreases.

exploiting the mode for file use: most shared files are read only. Semantics

o What a user wants? strict consistency.o Users can usually tolerate a certain degree of errors in file handling -- no

need to enforce strict consistency.

c) Answers any One part of the Followings-I. Explain the Different Algorithms for implementing Distributed Shared Memory?

ANS: - Algorithms for implementing DSM-

The central-Server Algorithmo A central-server maintains all the shared data.o for read: the server just return the datao for write: update the data and send acknowledge to the cliento a simple working solution to provide shared memory for distributed

applicationso low efficiency -- bottleneck at the server, long memory access latencyo Data can be distributed -- need a directory to store the location of a page.

The migration algorithmo Data is shipped to the location of the data access request -- subsequent

accesses are localo For both read/write: get the remote page to the local machine, then

perform the operation.o Keeping track of memory location: location service, home machine for

each page, broadcast.o Problems: thrashing -- pages move between nodes frequently, false

sharingo Multiple reads can be costly.

The Read-Replication Algorithmo On read, a page is replicated (mark the page as multiple reader)


of 34

o On write, all copies except one must be updated or invalidatedo multiple read one writeo Allowing multiple readers to a pageo All the location must be kept track of: location service/home machine

The Full Replication Algorithmo Allow multiple read and multiple writeo Must control the access to shared memoryo Practical????


of 34


of 34

II. Explain the Advantages & Design Issues of Distributed Shared Memory?

ANS:-

Advantages of DSM:-


of 34

Design Issues of Distributed Shared Memory-

Granularity

Granularity: size of the shared memory unit The page size is usually a multiple of the size provided by the underlying

hardware and memory management system. large page size -- more locality, less communication overheads, more

contention, more false sharing Separate the unit of replication and the unit for coherence maintenance.

Page replacement

Least recently used (LRU) may not be appropriate -- data can be accessed indifferent mode: shared, private, read-only, writable, etc.

Replacement policy needs to take access modes into consideration.e.g. Privatedata should be replaced before shared data. Read-only page can just bedeleted.

SECTION –D

Q4. Attempt any two parts of the following- (2*10=20)a) Explain Recovery in Replicated Distributed Database System? How the system functions under the

Normal failure and Recovery conditions?

ANS: - Recovery in Replicated Distributed Database System-


of 34


Page 1 of 16

a) Explain The 2PC Commit Protocol? Explain Different Conditions with site failure in 2PC?

ANS:- 2PC Commit Protocol-

NEERAJ

Typewritten text

b


Page 2 of 16


Page 3 of 16


Page 4 of 16

EXAMPLE-


Page 5 of 16


Page 6 of 16

b) Explain the Voting Protocol? How the Vote Assignment done in Static and Dynamic VotingExplain with Example?

ANS: - Voting Protocol-


Page 7 of 16


Page 8 of 16


Page 9 of 16


Page 10 of 16


Page 11 of 16

c) Answers all part of the Followings-I. Issues in fault-Tolerance?

ANS:- Issues in fault-Tolerance-


Page 12 of 16

II. Method to obtain consistent set of checkpoint?

ANS: - Method to obtain consistent set of checkpoint-


Page 13 of 16

III. Fault-Tolerant Clock Synchronization with Interactive Convergence Algorithm?

ANS: Fault-Tolerant Clock Synchronization with Interactive Convergence Algorithm-


Page 14 of 16


Page 15 of 16


Page 16 of 16


Page 1 of 12

SECTION –EQ5. Attempt any four parts of the following- (4*5=20)

a) What is Non-Two-Phase locking? How it is different from Strict-2PL?

ANS: - Non-Two-Phase locking-


Page 2 of 12

b) Explain Multiversion Timestamp Ordering Algorithm with Example?

ANS:- Multiversion Timestamp Ordering Algorithm-


Page 3 of 12

c) Explain the Problems of Concurrency control in distributed transactions with Examples?

ANS:- Problems of Concurrency control in distributed transactions:-


Page 4 of 12


Page 5 of 12

d) How Does Distributed Deadlock affects the Transactions Explain with given Example?


Page 6 of 12


Page 7 of 12

e) What is Replication? Explain the Passive-Replication with example?

ANS:- Replication-


Page 8 of 12


Page 9 of 12


Page 10 of 12

f) Write short notes on two of the Followings-I. 2PC for Nested transactions?


Page 11 of 12

ANS: - 2PC for Nested transactions-


Page 12 of 12

II. Gossip Architecture?

ANS:-

III. Bayou & Operational transformational approach?

ANS: - Bayou & Operational transformational approach:-

*************************************BEST OF LUCK*************************************

HINDUSTAN COLLEGE OF SCIENCE & … of Computer Science & Engineering SUBJECT: DISTRIBUTED SYSTEM...

Documents

Transcript of HINDUSTAN COLLEGE OF SCIENCE & … of Computer Science & Engineering SUBJECT: DISTRIBUTED SYSTEM...