Chapter 1 introduction
Uploaded by tamrat-amare
Chapter 1 - Introduction
1.1 Introduction and Definition
- before the mid-80s, computers were
  - very expensive (hundreds of thousands or even millions of dollars)
  - very slow (a few thousand instructions per second)
  - not connected among themselves
- after the mid-80s: two major developments
  - cheap and powerful microprocessor-based computers appeared
  - computer networks
    - LANs at speeds ranging from 10 to 1000 Mbps (now even 10 Gbps)
    - WANs at speeds ranging from 64 Kbps to gigabits/sec
- consequence
  - feasibility of using a large network of computers to work for the same application; this is in contrast to the old centralized systems where there was a single computer with its peripherals
Definition of a Distributed System
- a distributed system is: a collection of independent computers that appears to its users as a single coherent system (computer) (Tanenbaum & Van Steen)
- this definition has two aspects:
  1. hardware: autonomous machines
  2. software: a single system view for the users
Other Definitions
- A distributed system is a system designed to support the development of applications and services which can exploit a physical architecture consisting of multiple, autonomous processing elements that do not share primary memory but cooperate by sending asynchronous messages over a communication network (Blair & Stefani)
- A distributed system is one that stops you getting any work done when a machine you have never even heard of crashes (Leslie Lamport)
Why Distributed?
- Resource and Data Sharing
  - printers, databases, multimedia servers, ...
- Availability, Reliability
  - the loss of some instances can be hidden
- Scalability, Extensibility
  - the system grows with demand (e.g., extra servers)
- Performance
  - huge power (CPU, memory, ...) available
- Inherent distribution, communication
  - organizational distribution, e-mail, video
Problems of Distribution
- Concurrency, Security
  - clients must not disturb each other
- Privacy
  - e.g., when building a preference profile such as using cookies
  - unwanted communication such as spam
- Partial failure
  - we often do not know where the error is (e.g., RPC)
- Location, Migration, Relocation, Replication
  - clients must be able to find their servers
- Heterogeneity
  - hardware, platforms, languages, management
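The partial-failure problem can be made concrete with a small simulation. The sketch below is only an illustration: `FlakyServer`, `rpc_increment`, and `Timeout` are hypothetical names, and the "network" is simulated inside one process. It shows that the caller sees the same timeout whether the request was lost before execution or the reply was lost after execution.

```python
class Timeout(Exception):
    """Raised when no reply arrives; the caller cannot tell why."""


class FlakyServer:
    """Toy server whose request or reply may be lost (hypothetical model)."""

    def __init__(self, lose_request=False, lose_reply=False):
        self.lose_request = lose_request
        self.lose_reply = lose_reply
        self.counter = 0

    def rpc_increment(self):
        if self.lose_request:
            raise Timeout("no reply")   # the server never saw the call
        self.counter += 1               # the operation is executed
        if self.lose_reply:
            raise Timeout("no reply")   # executed, but the answer was lost
        return self.counter


for server in (FlakyServer(lose_request=True), FlakyServer(lose_reply=True)):
    try:
        server.rpc_increment()
    except Timeout as err:
        # Same exception in both cases, yet the server state differs.
        print(err, "| counter on server:", server.counter)
```

From the client's side both runs look identical, which is why blindly retrying a call after a timeout may execute it twice.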
Characteristics of Distributed Systems
- differences between the computers and the ways they communicate are hidden from users
- users and applications can interact with a distributed system in a consistent and uniform way regardless of location
- distributed systems should be easy to expand and scale
- a distributed system is normally continuously available, even if there may be partial failures
1.2 Goals of a Distributed System
- to support heterogeneous computers and networks and to provide a single-system view, a distributed system is often organized by means of a layer of software called middleware that extends over multiple machines

Figure: a distributed system organized as middleware; note that the middleware layer extends over multiple machines, and offers each application the same interface

Ack: most diagrams in all slides are taken from the textbook
a distributed system should
- easily connect users with resources (printers, computers, storage facilities, data, files, Web pages, ...); some of the reasons:
  - economics: sharing resources such as printers and high-speed computers
  - to collaborate and exchange information
    - groupware: software for collaborative editing, teleconferencing, etc.
    - e-commerce: buying and selling goods
- be transparent: hide the fact that the resources and processes are distributed across multiple computers
- be open
- be scalable

Transparency in a Distributed System
- a distributed system that is able to present itself to users and applications as if it were only a single computer system is said to be transparent
different forms of transparency in a distributed system

| Transparency | Description |
|---|---|
| Access | Hide differences in data representation (endianness, file naming, ...) and how a resource is accessed |
| Location | Hide where a resource is physically located; where is http://www.prenhall.com/index.html? (naming) |
| Migration | Hide that a resource may move to another location |
| Relocation | Hide that a resource may be moved to another location while in use; e.g., mobile users using their wireless laptops and moving from place to place |
| Replication | Hide that a resource is replicated (for availability and performance); all replicas have the same name |
| Concurrency | Hide that a resource may be shared by several competitive users; a resource must be left in a consistent state; through locking |
| Failure | Hide the failure and recovery of a resource |

But trying to achieve all distribution transparency may be impossible or may not be a good idea
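As an illustration of the Access row, the sketch below shows how agreeing on one wire format hides endianness from the communicating machines. It uses Python's standard `struct` module; the helper names `encode` and `decode` are assumptions made for this example, not part of any middleware API.

```python
import struct


def encode(value: int) -> bytes:
    """Pack a 32-bit unsigned integer in network (big-endian) byte order."""
    return struct.pack("!I", value)


def decode(payload: bytes) -> int:
    """Unpack a 32-bit unsigned integer sent in network byte order."""
    return struct.unpack("!I", payload)[0]


# The same number has different in-memory layouts on different machines.
little = struct.pack("<I", 1)   # little-endian layout
big = struct.pack(">I", 1)      # big-endian layout
print(little, big, little == big)

# Both sides use encode/decode, so neither needs to know the other's layout.
print(decode(encode(123456)))
```

Because sender and receiver only ever call `encode` and `decode`, the difference in native representation never reaches the application code.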
Openness in a Distributed System
- a distributed system should be open
- we need well-defined interfaces
- interoperability
  - components of different origin can communicate
- portability
  - components work on different platforms
- another goal of an open distributed system is that it should be flexible and extensible; easy to configure the system out of different components; easy to add new components, replace existing ones; easier said than done
- an Open Distributed System is a system that offers services according to standard rules that describe the syntax and semantics of those services; e.g., protocols in networks
- standards - a necessity
Scalability in Distributed Systems
- a distributed system should be scalable; there are three dimensions
  - size: adding more users and resources to the system
  - geographically: users and resources may be far apart
  - administratively: should be easy to manage even if it spans many administrative organizations
- but a scalable system may exhibit performance problems
- in open distributed systems, services are often specified through interfaces, which are often described using an Interface Definition Language (IDL)
  - interface definitions specify only syntax: the names of the functions, types of parameters, return values, possible exceptions, ...
  - semantics are given in an informal way by means of natural language
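The split between syntax and informally stated semantics can be sketched with an abstract class. Python is not an IDL, so this is only an analogy; `FileService`, `FileNotFound`, and `InMemoryFileService` are hypothetical names invented for the example.

```python
from abc import ABC, abstractmethod


class FileNotFound(Exception):
    """Possible exception named by the interface."""


class FileService(ABC):
    """Interface only: function name, parameter types, return type."""

    @abstractmethod
    def read(self, name: str, offset: int, length: int) -> bytes:
        """Semantics in natural language: return up to `length` bytes of
        file `name` starting at `offset`; raise FileNotFound if absent."""


class InMemoryFileService(FileService):
    """One of many possible implementations behind the same interface."""

    def __init__(self, files: dict[str, bytes]):
        self._files = files

    def read(self, name: str, offset: int, length: int) -> bytes:
        if name not in self._files:
            raise FileNotFound(name)
        return self._files[name][offset:offset + length]


service: FileService = InMemoryFileService({"a.txt": b"hello"})
print(service.read("a.txt", 1, 3))
```

A client written against `FileService` keeps working if the implementation is replaced, which is the interoperability and portability goal described under openness.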
| Concept | Example |
|---|---|
| Centralized services | Single server for all users - mostly for security reasons |
| Centralized data | A single on-line telephone book |
| Centralized algorithms | Doing routing based on complete information |

examples of scalability limitations (scalability problems leading to low performance)

Scaling Techniques: how to solve scaling problems
- the problem is mainly performance, and arises as a result of limitations in the capacity of servers and networks (for geographical scalability with high latency and mostly unreliable links)
- three possible solutions: hiding communication latencies, distribution, and replication
a. Hide Communication Latencies
- try to avoid waiting for responses to remote service requests
- let the requester do other useful work
- i.e., construct requesting applications that use only asynchronous communication instead of synchronous communication; when a reply arrives the application is interrupted
- good for batch processing and parallel applications, since independent tasks can be scheduled while another task is waiting for communication to complete
- or use multithreading for non-parallel programs
- hiding communication latencies is not in general applicable for interactive applications
- for interactive applications, try to reduce communication; move part of the job to the client to reduce communication; e.g., filling a form to access a database and checking the entries
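A minimal sketch of asynchronous requests, assuming Python's standard `asyncio` module as the mechanism. The function `remote_request` is a stand-in that only sleeps to model network latency; no real remote service is involved.

```python
import asyncio


async def remote_request(name: str, delay: float) -> str:
    """Stand-in for a remote service call; `delay` models network latency."""
    await asyncio.sleep(delay)
    return f"reply to {name}"


async def main() -> list[str]:
    # Issue both requests without waiting for the first reply.
    pending = [
        asyncio.create_task(remote_request("A", 0.05)),
        asyncio.create_task(remote_request("B", 0.05)),
    ]
    await asyncio.sleep(0)       # yield once so both requests get started
    local = sum(range(1000))     # useful local work while replies are pending
    replies = await asyncio.gather(*pending)
    return [*replies, f"local result {local}"]


print(asyncio.run(main()))
```

The two simulated requests overlap with each other and with the local computation, so the total waiting time is roughly one latency instead of two.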
Figure: (a) a server checking the correctness of field entries; (b) a client doing the job
- e.g., checking the completeness of mandatory fields
- shipping code is now supported in Web applications using Java Applets and ActiveX controls (with some security issues)
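The client-side check can be sketched as follows. The field names, the `submit` helper, and the `send` callback are hypothetical; the point is only that an incomplete form never causes a round trip to the server.

```python
MANDATORY_FIELDS = ("name", "email")   # hypothetical form definition


def missing_fields(form: dict[str, str]) -> list[str]:
    """Return the mandatory fields that are absent or blank."""
    return [f for f in MANDATORY_FIELDS if not form.get(f, "").strip()]


def submit(form: dict[str, str], send):
    """Validate locally; contact the server only for a complete form."""
    missing = missing_fields(form)
    if missing:
        return "please fill in: " + ", ".join(missing)
    return send(form)   # the only step that needs the network


def fake_send(form):
    """Stand-in for the remote call to the database server."""
    return "stored"


print(submit({"name": "Abebe", "email": " "}, fake_send))
print(submit({"name": "Abebe", "email": "abebe@example.org"}, fake_send))
```

The server should still validate what it receives; the client-side check only removes avoidable communication.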
b. Distribution
- means splitting a component into smaller parts and spreading those parts across the system
- e.g., DNS - Domain Name System ([email protected])
- divide the name space into nonoverlapping zones
- for details, see later in Chapter 5 - Naming

Figure: an example of dividing the DNS name space into zones
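A toy model of zone-based name resolution, as a sketch of the idea rather than of the real DNS protocol. The `Zone` class, the `resolve` function, the example names, and the address are all assumptions for illustration; each zone knows only its own records and which child zone is responsible for a sub-domain.

```python
class Zone:
    """One nonoverlapping part of the name space, served by its own server."""

    def __init__(self, records=None, delegations=None):
        self.records = records or {}          # full name -> address
        self.delegations = delegations or {}  # domain suffix -> child Zone


def resolve(root, name):
    """Walk from the root zone downwards; return (address, zones contacted)."""
    zone, contacted = root, 1
    while True:
        if name in zone.records:
            return zone.records[name], contacted
        for suffix, child in zone.delegations.items():
            if name == suffix or name.endswith("." + suffix):
                zone, contacted = child, contacted + 1
                break
        else:
            raise KeyError(name)   # no zone is responsible for this name


# Hypothetical name space split into three zones.
example_zone = Zone(records={"www.example.org": "192.0.2.10"})
org_zone = Zone(delegations={"example.org": example_zone})
root_zone = Zone(delegations={"org": org_zone})

print(resolve(root_zone, "www.example.org"))
```

No single server holds the whole name space, so adding a new sub-domain only touches the zone that delegates to it.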
c. Replication
- replicate components across a distributed system to increase availability and for load balancing, leading to better performance
- replication is decided by the owner of a resource
- caching (a special form of replication) also reduces communication latency; decided by the user
- but, caching and replication may lead to consistency problems (see Chapter 7 - Consistency and Replication)
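The consistency problem caused by caching can be seen in a few lines. This is a sketch with hypothetical classes `Origin` and `CachingClient`; invalidation here is triggered by hand, whereas a real system needs a protocol to decide when cached copies are refreshed.

```python
_MISSING = object()   # sentinel meaning "nothing cached yet"


class Origin:
    """The owner's copy of the data (a single value, for simplicity)."""

    def __init__(self, value):
        self.value = value


class CachingClient:
    """Client-side cache: fewer remote reads, but copies can become stale."""

    def __init__(self, origin):
        self.origin = origin
        self.cached = _MISSING
        self.remote_reads = 0

    def read(self):
        if self.cached is _MISSING:
            self.remote_reads += 1          # only now is the network used
            self.cached = self.origin.value
        return self.cached

    def invalidate(self):
        self.cached = _MISSING


origin = Origin("v1")
client = CachingClient(origin)
print(client.read())      # fetched from the origin
origin.value = "v2"       # the owner updates the data
print(client.read())      # the cache still answers with the old value
client.invalidate()
print(client.read())      # fetched again, now up to date
```

The second read is fast but stale, which is the trade-off between latency and consistency noted in the last bullet.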
Pitfalls when Developing Distributed Systems
- pitfalls arise from false assumptions made by first-time developers (of distributed systems); these assumptions relate to properties of distributed systems and do not arise in nondistributed applications
  - The network is reliable (making it difficult to achieve failure transparency)
  - The network is secure
  - The network is homogeneous
  - The topology does not change
  - Latency is zero
  - Bandwidth is infinite
  - Transport cost is zero
  - There is one administrator
1.3 Types of Distributed Systems
- Three types: distributed computing systems, distributed information systems, and distributed pervasive/embedded systems

1. Distributed Computing Systems
- Used for high-performance computing tasks
- two types: cluster computing and grid computing

Cluster Computing
- a collection of similar workstations or PCs (homogeneous), closely connected by means of a high-speed LAN
- each node runs the same operating system
- used for parallel programming in which a single compute-intensive program is run in parallel on multiple machines
Figure: an example of a cluster computing system
- a master node runs a middleware (containing libraries for parallel programs) and controls other compute nodes; it
  - allocates tasks
  - provides an interface to users
  - etc.
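The master's role of allocating tasks can be sketched on a single machine. In this illustration the threads of a `ThreadPoolExecutor` merely stand in for compute nodes, and `master` and `compute_node` are hypothetical names; a real cluster would distribute the chunks over the network with a parallel-programming library.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_node(chunk):
    """Work done by one compute node: here, a partial sum of squares."""
    return sum(x * x for x in chunk)


def master(data, nodes=4):
    """Split the job, hand one chunk to each node, combine the results."""
    chunks = [data[i::nodes] for i in range(nodes)]
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        return sum(pool.map(compute_node, chunks))


print(master(list(range(10))))
```

The user talks only to `master`; how many nodes exist and which node handled which chunk stays hidden.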
Grid Computing
- “Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations” (Ian Foster)
- high degree of heterogeneity: no assumptions are made concerning hardware, operating systems, networks, administrative domains, security policies, etc.
- Globus is a software system for Grid Computing; read about the Globus Alliance at http://www.globus.org/

2. Distributed Information Systems
- problem: many networked applications with a problem of interoperability
- at the lowest level: wrap a number of requests into a single larger request and have it executed as a distributed transaction; all or none of the requests would be executed
- how to let applications communicate directly with each other, i.e., Enterprise Application Integration (EAI)
a. Transaction Processing Systems
- consider database applications
- special primitives are required to program transactions, supplied either by the underlying distributed system or by the language runtime system
- exact list of primitives depends on the type of application; procedure calls, ordinary statements, etc. can also be included

| Primitive | Description |
|---|---|
| BEGIN_TRANSACTION | Mark the start of a transaction |
| END_TRANSACTION | Terminate the transaction and try to commit |
| ABORT_TRANSACTION | Kill the transaction and restore the old values |
| READ | Read data from a file, a table, or otherwise |
| WRITE | Write data to a file, a table, or otherwise |
The Transaction Model
- the model for transactions comes from the world of business
- a supplier and a retailer negotiate on
  - price
  - delivery date
  - quality
  - etc.
- until the deal is concluded they can continue negotiating or one of them can terminate
- but once they have reached an agreement they are bound by law to carry out their part of the deal
- transactions between processes are similar to this scenario
e.g., assume the following banking operation
- withdraw an amount x from account 1
- deposit the amount x to account 2

what happens if there is a problem after the first activity is carried out?
- group the two operations into one transaction; either both are carried out or neither
- we need a way to roll back when a transaction is not completed
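The banking example can be tried out with Python's standard `sqlite3` module, used here as a convenient single-machine stand-in for a transactional store. The table layout, the helper names, and the `fail_midway` flag (which simulates a problem after the withdrawal) are assumptions made for this sketch.

```python
import sqlite3


def open_bank():
    """Create an in-memory database with two accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def balances(conn):
    rows = conn.execute("SELECT balance FROM accounts ORDER BY id")
    return [row[0] for row in rows]


def transfer(conn, src, dst, amount, fail_midway=False):
    """Withdraw from `src` and deposit to `dst` as one transaction."""
    try:
        with conn:   # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated problem after the withdrawal")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


bank = open_bank()
print(transfer(bank, 1, 2, 30, fail_midway=True), balances(bank))
print(transfer(bank, 1, 2, 30), balances(bank))
```

After the failed attempt the withdrawal is rolled back, so no money disappears; only the complete transfer changes both balances.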
e.g. reserving a seat from Manchester to Lalibella through Heathrow and AA Bole airports

(a) transaction to reserve three flights commits

    BEGIN_TRANSACTION
        reserve Man → Heathrow;
        reserve Heathrow → Bole;
        reserve Bole → Lalibella;
    END_TRANSACTION

(b) transaction aborts when third flight is unavailable

    BEGIN_TRANSACTION
        reserve Man → Heathrow;
        reserve Heathrow → Bole;
        reserve Bole → Lalibella full ⇒
    ABORT_TRANSACTION
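The all-or-nothing behaviour of the reservation example can be sketched in memory. The implementation below uses a private working copy that is published only at the end; this is one possible technique chosen for the illustration, not necessarily what a real reservation system does, and `reserve_all` and `SeatsFull` are hypothetical names.

```python
class SeatsFull(Exception):
    """Raised when a leg has no free seat left."""


def reserve_all(seats, legs):
    """Reserve one seat on every leg, or on none of them."""
    tentative = dict(seats)            # BEGIN_TRANSACTION: private copy
    for leg in legs:
        if tentative[leg] == 0:
            raise SeatsFull(leg)       # ABORT_TRANSACTION: copy is discarded
        tentative[leg] -= 1
    seats.update(tentative)            # END_TRANSACTION: publish all changes


legs = ["Man-Heathrow", "Heathrow-Bole", "Bole-Lalibella"]
seats = {"Man-Heathrow": 5, "Heathrow-Bole": 2, "Bole-Lalibella": 0}

try:
    reserve_all(seats, legs)
except SeatsFull as full:
    print("aborted, no seat on", full, "| seats unchanged:", seats)

seats["Bole-Lalibella"] = 1
reserve_all(seats, legs)
print("committed:", seats)
```

In the aborted run the first two legs are not reserved either, matching case (b).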
properties of transactions, often referred to as ACID
1. Atomic: to the outside world, the transaction happens indivisibly; a transaction either happens completely or not at all; intermediate states are not seen by other processes
2. Consistent: the transaction does not violate system invariants; e.g., in an internal transfer in a bank, the amount of money in the bank must be the same as it was before the transfer (the law of conservation of money); this may be violated for a brief period of time, but the violation is not visible to other processes
3. Isolated or Serializable: concurrent transactions do not interfere with each other; if two or more transactions are running at the same time, the final result must look as though all transactions ran sequentially in some order
4. Durable: once a transaction commits, the changes are permanent; see later in Chapter 8 - Fault Tolerance
Classification of Transactions
- a transaction could be flat, nested or distributed

Flat Transaction
- consists of a series of operations that satisfy the ACID properties
- simple and widely used but with some limitations
  - does not allow partial results to be committed or aborted
    - i.e., atomicity is also partly a weakness
    - in our airline reservation example, we may want to accept the first two reservations and find an alternative one for the last
  - some transactions may take too much time
Nested Transaction
- constructed from a number of subtransactions; it is logically decomposed into a hierarchy of subtransactions; the flight reservation can be split into three transactions, each accessing a different database
- the top-level transaction forks off children that run in parallel, on different machines; to gain performance or for programming simplicity
- each may also execute one or more subtransactions
- permanence (durability) applies only to the top-level transaction; commits by children must be undone if the parent aborts

Distributed Transaction
- a flat transaction that operates on data that are distributed across multiple machines
- problem: separate algorithms are needed to handle the locking of data and committing the entire transaction; see later in Chapter 8 for distributed commit
Figure: (a) a nested transaction; (b) a distributed transaction
b. Enterprise Application Integration
- how to integrate applications independent from their databases
- transaction systems rely on request/reply
- how can applications communicate with each other; by means of a middleware

Figure: middleware as a communication facilitator in enterprise application integration
there are different communication models
- RPC (Remote Procedure Call)
- RMI (Remote Method Invocation)
- MOM (Message-Oriented Middleware)
- Stream-Oriented Communication
- Multicast Communication

see later in Chapter 4 - Communication

3. Distributed Pervasive Systems
- the distributed systems discussed so far are characterized by their stability; fixed nodes having a high-quality connection to a network
- there are also mobile and embedded computing devices, which are small, battery-powered, mobile, and have a wireless connection
three requirements for pervasive applications
- embrace contextual changes: a device is aware that its environment (location, identities of nearby people and objects, time of the day, season, temperature, etc.) may change all the time, e.g., by changing its network access point; hence its operations and services must be adapted to the current context
- encourage ad hoc composition: devices are used in different ways by different users
- recognize sharing as the default: devices join a system to access or provide information

examples of pervasive systems
- Home Systems that integrate consumer electronics
- Electronic Health Care Systems to monitor the well-being of individuals
- Sensor Networks
- read pages 26 - 30
[ Diversion
Different approaches to distribution - Lost in the forest of distribution

Distributed system
- N autonomous computers (sites): n administrators, n data/control flows
- an interconnection network
- user view: one single (virtual) system
- (traditional) programmer view: client-server

Parallel system
- 1 computer, n nodes: one administrator, one scheduler, one power source
- memory: it depends
- programmer view: one single machine executing parallel codes; various programming models (message passing, distributed shared memory, …)
Cluster computing
- use of PCs interconnected by a (high performance) network as a parallel (cheap) machine

Network computing
- from LAN (cluster) computing to WAN computing
- set of machines distributed over a MAN/WAN that are used to execute parallel loosely coupled codes
- depending on the infrastructure, network computing comes in many flavours: grid computing, P2P, Internet computing, etc.

Grid computing
- “Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations” (Ian Foster)

Peer-to-peer computing
- a site is both client and server
- application: mostly file sharing, but also others like Internet Telephony (Skype)
- 2 approaches:
  - centralized management: Napster
  - distributed management: Gnutella, Kazaa

Internet Computing
- use of (idle) computers interconnected by Internet for processing large throughput applications
- programmer view: a single master, n servants

Cloud Computing
- a general term for anything that involves delivering hosted services over the Internet
- a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction
]