Unit 9
Distributed Applications Systems
Overview

A distributed system consists of several computers connected by a variety of transmission
media. Probably the best-known example is the World Wide Web. There are two kinds of
software associated with such systems: system software, which is used to manage and
administer the system (software that maintains a distributed file system, for example);
and application software, which is written to satisfy some commercial need. The
application software draws on facilities provided by the system software: for example, an
online retail site will use TCP/IP protocol facilities to carry out the process of sending
and receiving sales data.
This lesson concentrates on distributed application development and the technologies
used for implementation.
Introduction looks at the main features of distributed applications and the characteristics
of ecommerce systems. It describes the supporting infrastructure provided by the internet
and examines the structure and use of clients and servers. The introduction ends by
examining the different development paradigms that can be used for developing
distributed systems: message passing, distributed objects, event-based bus technologies
and space-based technologies.
Servers looks at the main features of the two most popular kinds of server found in
distributed application systems: web servers and database servers.
General issues considers such matters as the security of distributed systems and design
principles for distributed systems.
Lessons
1. Overview of the Distributed Application System
2. Servers
3. General issues
Lesson 1 – Overview of the Distributed Application System
Objectives:
At the end of this lesson you will be able to:
Understand what a distributed application system is
Understand the main features of distributed application systems and the
characteristics of e-commerce systems
Understand the supporting infrastructure provided by the internet and the
structure and use of clients and servers
Understand the different development paradigms used for developing
distributed application systems
What Is a Distributed Application System?
Distributed computing deals with hardware and software systems containing more than
one processing element or storage element, concurrent processes, or multiple programs,
running under a loosely or tightly controlled regime.
In distributed computing a program is split up into parts that run simultaneously on
multiple computers communicating over a network. Distributed computing is a form of
parallel computing, but parallel computing is most commonly used to describe program
parts running simultaneously on multiple processors in the same computer. Both types of
processing require dividing a program into parts that can run simultaneously, but
distributed programs often must deal with heterogeneous environments, network links of
varying latencies, and unpredictable failures in the network or the computers.
Distributed systems have many advantages, including the ability to connect remote users
with remote resources in an open and scalable way. When we say open, we mean each
component is continually open to interaction with other components. When we say
scalable, we mean the system can easily be altered to accommodate changes in the number
of users, resources and computing entities.
Thus, a distributed system can be much larger and more powerful, given the combined
capabilities of its distributed components, than a combination of stand-alone systems. But
this is not easy to achieve: for a distributed system to be useful, it must be reliable, which
is a difficult goal because of the complexity of the interactions between simultaneously
running components.
Distributed applications are based on the ability to send objects from one application
program to another and to allow one application program to invoke methods on objects
that are located in another application program. The processing of a user's interaction
with the database is realized in three levels: web browser, database client and database
server. The user level contains the web browser, which displays web pages and collects
information for processing. The middle level contains the web server and application
server, which are the client programs for the database. The lowest level contains the
database servers.
Features of Distributed Application System
The technologies of the distributed applications design
The interest in distributed applications is explained by the increased requirements placed
on modern software tools. The major requirements are: application scalability - the
capability to serve any number of clients effectively at the same time; application
resilience to client application errors and communication failures; transaction
reliability - the secure transition of a running system from one stable, consistent state to
another; long-term operation - non-stop running, 24 hours a day, 7 days a week (the
24x7 model); a high security level, which guarantees not only access control to different
data but also protection at all stages of operation; and rapid application development,
simplicity of maintenance and the possibility for programmers of medium qualification
to modify the applications.
At present, there are several established technologies for building static and dynamic
distributed applications that meet the requirements described above: socket
programming, RPC (Remote Procedure Call), DCOM (Microsoft Distributed
Component Object Model), CORBA (Common Object Request Broker Architecture)
and Java RMI (Java Remote Method Invocation). The most important of them are the
last three: DCOM, CORBA and Java RMI.
The DCOM technology is object-oriented and is supported by operating systems such
as Windows 98, Windows NT, Windows 2000, Sun Solaris, Digital UNIX and
IBM MVS. Its most important merit is the possibility of integrating applications
implemented in different programming systems.
The CORBA technology is part of OMA (Object Management Architecture),
developed to standardize the architecture and interaction interfaces of object-oriented
applications. The interfaces between CORBA objects are encoded using a special
interface definition language, IDL (Interface Definition Language). Such
interfaces can be implemented in any application programming language and connected
to CORBA applications. The standards also propose connecting CORBA
objects with DCOM objects through special CORBA-DCOM bridges.
A Java RMI application usually consists of a client and a server. Objects are
created on the server that can be transmitted through the network, or whose methods
are declared shared for remote invocation. On the client side are the
applications that use the remote objects. The distinguishing feature of RMI is the
possibility of transmitting either methods or objects through the network. This feature
ultimately provides mobility (portability).
Today, the Java RMI and CORBA technologies are the most flexible and effective for
creating distributed applications, and their features are closely related. The
major merit of CORBA is the IDL interface, which unifies the communication tools
between applications and provides interoperability with other applications. On the other
hand, Java RMI is a more flexible and powerful tool for distributed application
development on the Java platform, including the possibility of building mobile
applications.
The Java RMI technology includes two constituents: the Java language tools
and remote method invocation (RMI) on Java objects. The Java language tools allow
developers to create complex distributed network applications that possess a high level
of security and reliability, and support object-oriented programming, integrated
multithreading and platform independence. The RMI technology provides a set of tools
for accessing a remote object on a server through calls on a special stub object.
The Java RMI specification makes it possible:
1. To declare remote interfaces for classes whose methods can be called through the
network;
2. To create stub objects using a special compiler;
3. To get a full copy of a remote object, not only a reference to it;
4. To transmit objects in such a way that their behaviour is not changed
when they are transmitted to another virtual machine;
5. To register a server object in the RMI registry and to keep that registry
accessible with the help of a special background process; clients can
consult this registry when they are looking for the objects they need;
6. To perform marshalling and de-marshalling with the help of the serialization API,
which translates an object to a byte stream before transmission and then, after
receipt, back again;
7. To work over the IIOP protocol, which makes it possible to communicate with
CORBA objects (the IIOP protocol can transmit data of different types, including
structures and unions).
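Item 6 above describes marshalling: translating an object to a byte stream before transmission and back again afterwards. The idea can be sketched in Python, where `pickle` plays the role that Java serialization plays in RMI (the request object below is purely illustrative):

```python
import pickle

# A plain object to be "transmitted": marshalling turns it into bytes,
# de-marshalling reconstructs an equivalent copy on the receiving side.
request = {"method": "getBalance", "args": ["account-42"]}

wire_bytes = pickle.dumps(request)   # marshalling: object to byte stream
assert isinstance(wire_bytes, bytes)

received = pickle.loads(wire_bytes)  # de-marshalling: byte stream to object
print(received)                      # an equal but distinct copy of the request
```

The received object is equal to the original but is a separate copy, which is exactly what item 3 above promises for remote objects.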
Unlike TCP sockets, the RMI technology provides a programmable interface for working
with the network. This interface operates at a higher level: it is based on
method invocation and gives the impression that the remote object is being
operated on locally. RMI is more convenient and more natural than an interface based on
sockets; however, it requires Java programs to be running on both sides of a
connection. The network connection itself may nevertheless use the same TCP/IP
protocol.
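The stub-based remote invocation that RMI provides can be sketched in a language-neutral way with Python's standard `xmlrpc` modules; the `greet` method, host and port below are illustrative, and the proxy object plays the role of the RMI stub:

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Server side: register a method so that remote callers can invoke it,
# much as an RMI server registers a remote object in the RMI registry.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
port = server.server_address[1]           # port 0 means "pick a free port"
server.register_function(lambda name: "Hello, " + name, "greet")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the proxy acts like the RMI stub; the call looks local
# but actually travels over the network connection.
stub = ServerProxy(f"http://127.0.0.1:{port}")
result = stub.greet("world")
print(result)                             # Hello, world
server.shutdown()
```

As with RMI, the client never opens a socket explicitly; the method call on the stub hides the transport details.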
The network information technologies
There are three main parts in client-server technologies: the user interface, which displays
information, realizes the graphical user interface and forms requests to the server; the
functional logic, which implements the required computing and business rules; and the
database, which executes selections, modifies data and processes them in accordance with
the received commands.
Depending on how these components are arranged on client and server machines, there
can be 2-tier, 3-tier and n-tier client-server technologies for corporate systems.
The 3-tier technology with distributed services provides independent functional
relationships between the user interface, the functional logic and the database. The user
interface and the smaller part of the functional logic are located on client computers.
The bigger part of the functional logic is located on the application server. The database
is located on the database server. In the model with distributed services the applications
are independent; they interact through the network with the application server, which in
turn interacts with the database server on demand.
The object-oriented network technologies of distributed services combine the models of
distributed databases and distributed services. The software of such systems consists of
a set of object units which interact with each other through the computer network using
standard interfaces. This approach allows units to be reused many times and computer
resources to be spent more economically. In this technology each object, depending on
conditions, can act as either a client or a server. The object computing architecture,
based on distributed network services, represents a new and highly promising kind of
computer technology that is widely used in the design of distributed corporate systems.
The object-oriented approach is promising for creating dynamic web-oriented
applications. Closely tied to them is the problem of realizing "super thin" clients,
where the client application uses the browser environment for execution. The
effectiveness of this technology for corporate systems design comes from the
possibility of running a standard HTML browser in any operating environment. If
we take into account that the maintenance cost for one server is not comparable to
the maintenance cost for a thousand "thick" clients connected to it, we can
conclude that a successfully realized corporate system sharply reduces maintenance
overheads.
The component-oriented technologies
At present, for the development of corporate systems with distributed database
branches and remote services (functional logic), the most notable are integrated
technologies such as ActiveX/DCOM, RMI/CORBA, Enterprise JavaBeans/CORBA
and CORBA/J2EE. The distributed component-oriented ActiveX/DCOM technology
is intended for application server development, registration and management of
distributed program objects. Its main demerits are: first, the server
computer operates only under the Windows NT/2000 Server operating system; and
second, tools for coordinating multi-user access to several server operations
managed by transactions are completely absent.
The remote access technology RMI organizes interconnection with remote objects and
allows the development of high-quality Internet applications. It possesses all the merits
of the Java language environment: it provides object-oriented programming and
guarantees a high level of security and reliability, multithreading, multiplatform support
and independence from the operating system. RMI is built on the concept of translating
a call and its parameters into a byte stream for transmission through the network. The
backward operation, method invocation and transmission of the result back to the
client, is carried out on the server side.
At the base of the common object request broker architecture lies the idea of uniting
applications from many vendors in a common tooling environment, which lets this
technology operate in heterogeneous systems: the network can serve, at the same
time, computers of different types, operating under different operating systems and
serving applications written in different programming languages. The CORBA
technology provides development and maintenance of software for distributed
corporate systems with maximal convenience. Such software is called a CORBA
application.
Each object request broker (ORB) producer, which in practice is what implements
CORBA, can in principle propose its own transport protocol for data transmission.
In this case the expanded name of the technology, for example CORBA/IIOP,
reflects the name of the network protocol (here IIOP). With the aid of the IIOP protocol
any CORBA/IIOP application can interact with other CORBA/IIOP applications
independently of hardware, software and operating system producers.
Characteristics of E-commerce System
Electronic commerce, a subset of e-business, is the purchasing, selling, and exchanging
of goods and services over computer networks (such as the Internet) through which
transactions or terms of sale are performed electronically. Ecommerce can be broken into
four main categories: Business-to-Consumer (B2C), Business-to-Business (B2B),
Business-to-Government (B2G) and Consumer-to-Consumer (C2C). Ecommerce offers a
wider choice of services and types of transactions. The characteristics of
Ecommerce can be summarized as follows:
1. Business-oriented
Ecommerce is essentially business-oriented, as it is the purchasing, selling, and
exchanging of goods and services. Ecommerce expands market and brings more
customers. Online shopping is a more convenient way through which consumers get
what they want to buy. Regardless of the size of a business, Ecommerce means
opportunities to all.
2. Convenient service
In the unique Ecommerce environment, customers will no longer be confined by
geographical constraints in receiving services. The prominent feature of E-service is
convenience. Both consumers and businesses benefit from it.
3. System extendable
For Ecommerce, an extendable system is the guarantee of system stability. It is of
vital importance that the system be extendable when server traffic jams occur,
because even a two-minute reset can result in losing a huge number of customers.
4. Online safety
Online safety is the first priority of Ecommerce. Frauds, wiretapping, virus, and
illegal entries pose constant threats to Ecommerce. Therefore, an Internet safety
solution featuring end-to-end protection is called for. The solution may include
various protective measures such as encryption mechanisms, digital signature
mechanisms, firewalls, secure World Wide Web servers and antivirus protections.
5. Coordination
Ecommerce is a process of coordination between employees, customers,
manufacturers, suppliers and business partners. The traditional Ecommerce solution
can improve the internal coordination in a company, for example, via Emails. Thanks
to the booming Ecommerce, many new words and expressions are being created and
becoming familiar to the public. Typical examples include virtual businesses, virtual
banks, online shopping, online payment, online advertising and so on. In conclusion,
Ecommerce is displaying its charm to all.
Development Paradigm
1. Message passing
In computer science, message passing is a form of communication used in parallel
computing, object-oriented programming, and interprocess communication.
Communication is made by sending messages to recipients. Forms of messages
include function invocations, signals, and data packets. Prominent models of
computation based on message passing include the Actor model and the process
calculi, a diverse family of related approaches to formally modeling
concurrent systems.
Microkernel operating systems pass messages between the kernel and one or more
servers.
Distributed object and remote method invocation systems such as ONC RPC, CORBA,
Java RMI, DCOM, SOAP, .NET Remoting, QNX Neutrino RTOS, OpenBinder and D-
Bus are message passing systems. The term is also used in High
Performance Computing, where the Message Passing Interface (MPI) is the standard.
The concept of message passing is also used in Bayesian inference over graphical
models.
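A minimal sketch of the message-passing style, with two threads standing in for the separate machines of a distributed system; the only communication between sender and recipient is by placing messages in the recipient's queue:

```python
import queue
import threading

inbox = queue.Queue()     # the recipient's mailbox
replies = queue.Queue()   # the channel for answers back to the sender

def worker():
    # The recipient blocks until a message arrives, then acts on it.
    while True:
        msg = inbox.get()
        if msg is None:   # conventional shutdown message
            break
        replies.put(("ack", msg))

t = threading.Thread(target=worker)
t.start()

# The sender communicates only by sending messages: there is no shared state.
inbox.put("data packet 1")
inbox.put("data packet 2")
inbox.put(None)
t.join()

received = [replies.get(), replies.get()]
print(received)
```

Because all interaction goes through explicit messages, the same design works unchanged whether the two parties share a machine or sit at opposite ends of a network.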
2. Distributed objects
Distributed objects are software modules that are designed to work together, but
reside either in multiple computers connected via a network or in different processes
inside the same computer. One object sends a message to another object in a remote
machine or process to perform some task. The results are sent back to the calling
object.
3. Event-based bus technologies
In event-based (publish/subscribe) systems, components do not address each other
directly. Instead, they publish events to a shared software bus, and any component
that has subscribed to that kind of event is notified. Senders and receivers are thus
decoupled from one another.
4. Space-based technologies
Space-based systems communicate through a shared, associative data store (a tuple
space, as in Linda or JavaSpaces). Producers write entries into the space and
consumers take or read matching entries, which again decouples the cooperating
processes from one another.
Lesson 2 – Servers
Objectives:
At the end of this lesson you will be able to:
Understand the web servers
Understand the database servers.
Web servers
The term “Web server” is somewhat nebulous from a technical perspective, because it
can refer to multiple parts of a whole or the conglomeration of the parts. A Web server, in
its most basic form, is actually made up of three distinctly different but equally important parts:
A physical machine connected to the Internet
A network operating system (NOS) that runs on the machine and manages all basic
networking functionality
Software that processes incoming HTTP requests from HTTP clients
In a more complex system, the server may also run special software (database engines,
transaction processors, site and server management tools, FTP or mail services, etc.) and
may be connected to specialized hardware (caching servers, firewalls, telephony
equipment, etc.). A server may actually consist of several physical machines working
together to provide the appearance of a single point of entry. When designing a Web
solution, the business need for all these components must be taken into consideration.
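The third part listed above, the software that processes incoming HTTP requests, can be sketched with Python's standard `http.server` module (the handler, address and response body here are illustrative, not taken from any particular server):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Process the incoming HTTP request and send back a response.
        body = b"hello from the web server"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet

server = HTTPServer(("127.0.0.1", 0), Handler)   # port 0: pick a free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# An HTTP client sends a request; the server software answers it.
with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as resp:
    text = resp.read().decode()
print(text)
server.shutdown()
```

A production server such as Apache or IIS layers configuration, security and content management on top, but this request-response cycle is the core of what "Web server software" means.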
The two most popular Web servers for Internet hosts are Apache Web Server and
Microsoft Internet Information Server (IIS).
Apache HTTP Server
The Apache Web server was originally developed in 1995 by eight independent
developers who came together to create the Apache Project.
Using the source code for the NCSA HTTPd server, combined with numerous
patches and bug fixes, they founded the Apache Project as “a collaborative software
development effort aimed at creating a robust, commercial-grade, featureful and
freely-available source code implementation of an HTTP (Web) server.” Together
they founded the Apache Group and created the Apache Web server (“A PAtCHy
server”).
Today the project is jointly managed by a group of volunteers located around the
world who use the Internet and Web to communicate, plan and develop the server and
its related documentation. These volunteers are known as the Apache Group. In
addition, hundreds of users have contributed ideas, code, and documentation to the
project.
One of the principal strengths of Apache is that it is free. All executables, the source
code, updates, patches, fixes, frequently asked questions (FAQs), and available
documentation can be downloaded from the Apache Web site
(http://www.apache.org). The source code can be modified freely, so custom features
can be added as necessary. The strong freeware mentality that exists in the UNIX
community (the platform for which Apache was originally developed) has helped
drive the development of Apache to a level equal to or surpassing many commercial
packages.
The major disadvantage of the Apache Web server is that it is difficult to configure.
Unlike products developed by Windows-oriented vendors, Apache does not have a
rich graphical environment in which to operate. Changes to the server must be made
by modifying the configuration files. This task can be daunting to those without a
strong UNIX background, and mistakes, some of which can be devastating, are easy
to make.
The dependence on a command-line interface, however, can also be one of Apache’s
greatest strengths. No other Web server provides the operator with as much control
over how the program operates. When performance is of utmost importance, Apache
is normally the server of choice.
Another area where Apache lags behind the competition is in content support.
Apache comes with only basic Web page authoring tools, no content management
tools, a limited search engine, and no built-in SSL support. The open nature of the
product does, however, make the development of add-on modules by third parties
much easier than for most other Web servers. These add-on modules, many of which
are freely available over the Internet, make Apache the most extensible Web server
on the market.
Microsoft IIS
Microsoft Internet Information Server (IIS) is free to all licensed users of Windows
NT, and the two products are highly integrated. IIS 5.0 is now also included with
Windows 2000. This high level of integration makes IIS a good choice for companies
that already have a Windows NT network in operation. Companies that have
enterprise networks built on UNIX may not find IIS a very good choice because it
only runs on Windows NT and Microsoft has no plans to develop it for other
platforms.
One of the biggest selling features of IIS is its integration with a full line of other
Microsoft products. All of Microsoft’s BackOffice products, as well as FrontPage,
SQL Server, Proxy Server, Site Server, Systems Management Server, Certificate
Server, Transaction Server, Message Queue Server, Index Server, and Exchange
Server are designed to combine into a full enterprise networking solution.
Another advantage, or possible drawback, of using IIS is the support that it provides
for proprietary Microsoft Web services. IIS is one of the few Web servers to support
both Active Server Pages and FrontPage extensions. These features allow users of IIS
to add dynamic content to their sites in ways that are not possible when using other
Web servers. Using them, however, may ultimately limit growth and choice because
an all-Microsoft solution may be necessary to support the Web site.
Database servers
A database server is a computer program that provides database services to other
computer programs or computers, as defined by the client-server model. The term may
also refer to a computer dedicated to running such a program. Database management
systems frequently provide database server functionality, and some DBMSs (e.g.,
MySQL) rely exclusively on the client-server model for database access.
In a master-slave model, database master servers are central and primary locations of data
while database slave servers are synchronized backups of the master acting as proxies.
Typically, client applications access database servers over a network.
Database servers are gaining importance because of the increasing popularity of the
client/server architecture model in computing. Database servers store the database on a
dedicated computer system, allow it to be accessed concurrently, maintain the integrity of
the data, and handle transaction support and user authorization.
A database server divides an application into a front end and a back end, in accordance
with the client-server model. The front end runs on the user’s computer and displays
requested data. The back end runs on the server and handles tasks such as data analysis
and storage.
Implementation of a database server
A database server can be implemented in a straightforward manner as a separate node (on
a network) dedicated to running database-management software. This node provides an
interface to client nodes such that the same data is accessible to all nodes. The interface
allows users to submit requests to the database server and retrieve information. These
requests are typically made using a high-level query language such as SQL (Structured
Query Language).
The server handles any processor-intensive work such as data manipulation,
compilation, and optimization, and sends only the final results back to the client.
Database servers are typically made to run on a UNIX operating system.
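The division of labour described above can be illustrated with Python's built-in `sqlite3` module. SQLite is an embedded engine rather than a network server, but the roles are the same: the client submits a high-level SQL request and receives only the final results (the table and data below are made up for the example):

```python
import sqlite3

# The engine, standing in for the database server, holds the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("alice", 30.0), ("bob", 12.5), ("alice", 7.5)])

# The front end sends a high-level request; grouping, aggregation and
# sorting all happen on the engine side, and only the results come back.
rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer").fetchall()
print(rows)
conn.close()
```

With a true database server the `connect` call would name a remote host instead of an in-memory file, but the client code would otherwise look much the same.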
Benefits of using a database server
A database server allows users to store data in one central location.
It performs complex functions such as searching, sorting, and indexing on the server
itself. This reduces network traffic because fewer items need to be transferred
between the client and the server.
Because data is stored centrally, there is enhanced security.
A database server uses its own processing power to find requested data, rather than
sending the complete data to the client so that the client searches for the data, as is
done in a file server.
A database server allows concurrent access to data.
Lesson 3 – General Issues
Objectives:
At the end of this lesson you will be able to:
Understand the security for distributed systems
Understand the design principles for distributed systems
Security for distributed systems
Security involves protecting the system hardware and software both from internal attack
and from external attack (hackers). An internal attack normally involves uneducated users
causing damage, such as deleting important files or crashing systems. Attacks can also
come from internal fraud, where employees intentionally attack a system for their
own gain, or through some dislike of something within the organization. There are many
cases of users who have grudges against other users causing damage to systems by
misconfiguring them. These effects can be minimized if the system manager properly
protects the system. Typical actions are to limit the files that certain users can access and
also the actions they can perform on the system.
Most system managers have seen the following:
Users sending a file of the wrong format to the system printer (such as sending a
binary file). Another typical one is where there is a problem on a networked printer
(such as lack of paper), but the user keeps re-sending the same print job.
Users deleting the contents of sub-directories, or moving files from one place to
another (typically, these days, with the dragging of a mouse cursor). Regular backups
can reduce this problem.
Users deleting important system files (on a PC, these are normally AUTOEXEC.BAT
and CONFIG.SYS). This can be overcome by the system administrator protecting
important system files, such as making them read-only or hidden.
Users telling other people their user passwords or not changing a password from the
initial default one. This can be overcome by the system administrator forcing the user
to change their password at given time periods.
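Several of the protections above come down to file permissions. A minimal sketch (assuming a POSIX system) of making an important file read-only, as suggested for protecting system files:

```python
import os
import stat
import tempfile

# Protect an important file by removing write permission, as a system
# administrator would to guard against accidental deletion or modification.
fd, path = tempfile.mkstemp()   # a throwaway file stands in for a system file
os.close(fd)
os.chmod(path, stat.S_IRUSR)    # owner read-only (mode 0o400)

mode = stat.S_IMODE(os.stat(path).st_mode)
print(oct(mode))                # 0o400: readable, not writable

os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)   # restore write access for cleanup
os.remove(path)
```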
Security takes many forms, including:
Data protection
This is typically a concern where sensitive or commercially important information is
kept. It might include information databases, design files or source code files. One
method of reducing this risk is to encrypt important files with a password; another is to
encrypt data with a secret electronic key (files are encrypted with a commonly known
public key, and decrypted with a secret key, which is known only to the user who has
the rights to access the files).
Software Protection
This involves protecting all the software packages from damage or from being
misconfigured. A misconfigured software package can cause as much damage as a
physical attack on a system, because it can take a long time to find the problem.
Physical system protection
This involves protecting systems from intruders who might physically attack the
systems. Normally, important systems are locked in rooms and then within locked
rack-mounted cabinets.
Transmission protection
This involves guarding against a hacker tampering with a transmission connection. An
attack might involve tapping into a network connection or total disconnection. Tapping
can be avoided by many methods, including using optical fibers, which are almost
impossible to tap into (doing so would typically involve sawing through a cable
containing hundreds of fibers, each of which would have to be reconnected exactly as
before). Total disconnection can be avoided with underground cables, or its damage
reduced by having redundant paths (such as different connections to the Internet).
Using an audit log file
Many secure operating systems, such as Windows NT/2000, maintain an audit file: a
text file that the system keeps up to date and that records the actions of specific
users. It can include the dates and times at which a user logs into the system, the files
that were accessed, the programs that were run, the networked resources that were
used, and so on. By examining this file the system administrator can detect malicious
attacks on the system, whether by internal or external users.
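An audit log of the kind described can be sketched as follows; the record format and user names are illustrative, and an in-memory buffer stands in for the log file on disk:

```python
import datetime
import io

def audit(log, user, action):
    # Append one time-stamped record per user action, as an audit file does.
    stamp = datetime.datetime.now().isoformat()
    log.write(f"{stamp} {user} {action}\n")

log = io.StringIO()                 # stands in for the audit file on disk
audit(log, "alice", "LOGIN")
audit(log, "alice", "OPEN payroll.db")
audit(log, "bob", "LOGIN")

# The administrator can later filter the log for one user's actions.
alice_actions = [line for line in log.getvalue().splitlines()
                 if " alice " in line]
print(len(alice_actions))           # 2
```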
Distributed Application Design
Building a reliable system that runs over an unreliable communications network seems
like an impossible goal. We are forced to deal with uncertainty. A process knows its own
state, and it knows what state other processes were in recently. But the processes have no
way of knowing each other's current state. They lack the equivalent of shared memory.
They also lack accurate ways to detect failure, or to distinguish a local software/hardware
failure from a communication failure.
Distributed systems design is obviously a challenging endeavor. How do we do it when
we are not allowed to assume anything, and there are so many complexities? We start by
limiting the scope. We will focus on a particular type of distributed systems design, one
that uses a client-server model with mostly standard protocols. It turns out that these
standard protocols provide considerable help with the low-level details of reliable
network communications, which makes our job easier. Let's start by reviewing client-
server technology and the protocols.
In client-server applications, the server provides some service, such as processing database queries or sending out current stock prices. The client uses the service provided by the server, either displaying database query results to the user or making stock purchase recommendations to an investor. The communication that occurs between the client and the server must be reliable. That is, no data can be dropped and it must arrive on the client side in the same order in which the server sent it.
There are many types of servers we encounter in a distributed system. For example, file servers manage disk storage units on which file systems reside. Database servers house databases and make them available to clients. Network name servers implement a mapping between a symbolic name or a service description and a value such as an IP address and port number for a process that provides the service.
In distributed systems, there can be many servers of a particular type, e.g., multiple file
servers or multiple network name servers. The term service is used to denote a set of
servers of a particular type. We say that a binding occurs when a process that needs to
access a service becomes associated with a particular server which provides the service.
There are many binding policies that define how a particular server is chosen. For
example, the policy could be based on locality (a Unix NIS client starts by looking first
for a server on its own machine); or it could be based on load balance (a CICS client is
bound in such a way that uniform responsiveness for all clients is attempted).
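The binding policies described above can be sketched in a few lines of Python. This is an illustrative sketch only: the registry contents, host names, and load figures are hypothetical, and a real system would query a name service rather than an in-memory table. It shows a locality-first policy (as a Unix NIS client uses) falling back to least-loaded selection (in the spirit of a CICS-style balancer):

```python
# Hypothetical registry mapping a service name to candidate servers.
# Hosts and load figures are illustrative, not from a real deployment.
SERVICE_REGISTRY = {
    "name-service": [
        {"host": "localhost", "load": 0.90},
        {"host": "10.0.0.5",  "load": 0.20},
        {"host": "10.0.0.6",  "load": 0.55},
    ]
}

def bind(service, policy="locality"):
    """Associate a client with one server of a service (a binding).

    'locality' prefers a server on the local machine; 'load' picks
    the least-loaded server for uniform responsiveness.
    """
    servers = SERVICE_REGISTRY[service]
    if policy == "locality":
        for s in servers:
            if s["host"] == "localhost":
                return s
    # Fall back to (or explicitly request) load-balanced selection.
    return min(servers, key=lambda s: s["load"])

print(bind("name-service")["host"])          # locality: localhost
print(bind("name-service", "load")["host"])  # least loaded: 10.0.0.5
```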
A distributed service may employ data replication, where a service maintains multiple
copies of data to permit local access at multiple locations, or to increase availability when
a server process may have crashed. Caching is a related concept and very common in
distributed systems. We say a process has cached data if it maintains a copy of the data
locally, for quick access if it is needed again. A cache hit is when a request is satisfied
from cached data, rather than from the primary service. For example, browsers use
document caching to speed up access to frequently used documents.
Caching is similar to replication, but cached data can become stale. Thus, there may need
to be a policy for validating a cached data item before using it. If a cache is actively
refreshed by the primary service, caching is identical to replication. [1]
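A cache with a validation policy can be sketched as follows. This Python example uses a simple time-to-live check to decide whether a cached item is still usable; the fetch function and TTL value are illustrative stand-ins for a real primary service:

```python
import time

class TTLCache:
    """Tiny cache that validates entries by age before using them.

    An entry older than ttl seconds is treated as stale and refetched
    from the primary service. Names and the ttl value are illustrative.
    """
    def __init__(self, fetch, ttl=30.0):
        self.fetch = fetch      # function that queries the primary service
        self.ttl = ttl
        self.store = {}         # key -> (value, time cached)
        self.hits = 0

    def get(self, key):
        entry = self.store.get(key)
        if entry is not None:
            value, cached_at = entry
            if time.monotonic() - cached_at < self.ttl:  # still fresh?
                self.hits += 1
                return value                             # cache hit
        value = self.fetch(key)                          # miss or stale
        self.store[key] = (value, time.monotonic())
        return value

cache = TTLCache(fetch=lambda url: f"<contents of {url}>")
cache.get("http://example.com/")   # miss: fetched from primary service
cache.get("http://example.com/")   # hit: served from local copy
print(cache.hits)                  # 1
```

Setting a very small ttl makes the cache behave like no cache at all, while an actively refreshed cache (ttl effectively infinite) behaves like replication, as noted above.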
As mentioned earlier, the communication between client and server needs to be reliable.
You have probably heard of TCP/IP before. The Internet Protocol (IP) suite is the set of
communication protocols that allow for communication on the Internet and most
commercial networks. The Transmission Control Protocol (TCP) is one of the core
protocols of this suite. Using TCP, clients and servers can create connections to one
another, over which they can exchange data in packets. The protocol guarantees reliable
and in-order delivery of data from sender to receiver.
The IP suite can be viewed as a set of layers, each layer having the property that it only
uses the functions of the layer below, and only exports functionality to the layer above. A
system that implements protocol behavior consisting of layers is known as a protocol
stack. Protocol stacks can be implemented either in hardware or software, or a mixture of
both. Typically, only the lower layers are implemented in hardware, with the higher
layers being implemented in software.
There are four layers in the IP suite:
1. Application Layer
The application layer is used by most programs that require network communication.
Data is passed down from the program in an application-specific format to the next
layer, then encapsulated into a transport layer protocol. Examples of applications are
HTTP, FTP or Telnet.
2. Transport Layer
The transport layer's responsibilities include end-to-end message transfer independent
of the underlying network, along with error control, fragmentation and flow control.
End-to-end message transmission at the transport layer can be categorized as either
connection-oriented (TCP) or connectionless (UDP). TCP is the more sophisticated
of the two protocols, providing reliable delivery. First, TCP ensures that the receiving
computer is ready to accept data. It uses a three-packet handshake in which both the
sender and receiver agree that they are ready to communicate. Second, TCP makes
sure that data gets to its destination. If the receiver doesn't acknowledge a particular
packet, TCP automatically retransmits it, typically up to three times. If necessary,
TCP can also split large packets into smaller ones so that data can travel reliably
between source and destination. TCP drops duplicate packets and rearranges packets
that arrive out of sequence.
UDP is similar to TCP in that it is a protocol for sending and receiving packets
across a network, but with two major differences. First, it is connectionless. This
means that one program can send off a load of packets to another, but that's the end of
their relationship. The second might send some back to the first and the first might
send some more, but there's never a solid connection. UDP is also different from TCP
in that it doesn't provide any sort of guarantee that the receiver will receive the
packets that are sent in the right order. All that is guaranteed is the packet's contents.
This means it's a lot faster, because there's no extra overhead for error-checking
above the packet level. For this reason, games often use this protocol. In a game, if
one packet for updating a screen position goes missing, the player will just jerk a
little. The other packets will simply update the position, and the missing packet -
although making the movement a little rougher - won't change anything.
Although TCP is more reliable than UDP, the protocol is still at risk of failing in
many ways. TCP uses acknowledgements and retransmission to detect and repair
loss. But it cannot overcome longer communication outages that disconnect the
sender and receiver for long enough to defeat the retransmission strategy. The normal
maximum disconnection time is between 30 and 90 seconds. TCP could signal a
failure and give up when both end-points are fine. This is just one example of how
TCP can fail, even though it does provide some mitigating strategies.
3. Network Layer
As originally defined, the Network layer solves the problem of getting packets across
a single network. With the advent of the concept of internetworking, additional
functionality was added to this layer, namely getting data from a source network to a
destination network. This generally involves routing the packet across a network of
networks, e.g. the Internet. IP performs the basic task of getting packets of data from
source to destination.
4. Link Layer
The link layer deals with the physical transmission of data, and usually involves
placing frame headers and trailers on packets for travelling over the physical network
and dealing with physical components along the way.
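The reliable, in-order delivery that TCP provides at the transport layer can be exercised directly with the standard socket API. The following Python sketch runs a tiny echo server in a background thread and connects a client to it on the loopback interface; the message content is illustrative, and port 0 asks the operating system to pick a free port:

```python
import socket
import threading

HOST, PORT = "127.0.0.1", 0   # port 0: let the OS choose a free port

def serve(server_sock):
    conn, _addr = server_sock.accept()   # completes the TCP handshake
    with conn:
        data = conn.recv(1024)           # bytes arrive in order, none dropped
        conn.sendall(b"echo: " + data)

server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_sock.bind((HOST, PORT))
server_sock.listen(1)
port = server_sock.getsockname()[1]      # the port the OS actually picked
threading.Thread(target=serve, args=(server_sock,), daemon=True).start()

# The client side: connect, send, and rely on TCP for reliable delivery.
with socket.create_connection((HOST, port), timeout=5) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

print(reply.decode())   # echo: hello
```

A UDP version would use SOCK_DGRAM and sendto/recvfrom instead, with no connection, no delivery guarantee, and no ordering guarantee, exactly as described above.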
Remote Procedure Calls
Many distributed systems were built using TCP/IP as the foundation for the
communication between components. Over time, an efficient method for clients to
interact with servers evolved called RPC, which means remote procedure call. It is a
powerful technique based on extending the notion of local procedure calling, so that the
called procedure may not exist in the same address space as the calling procedure. The
two processes may be on the same system, or they may be on different systems with a
network connecting them.
An RPC is similar to a function call. Like a function call, when an RPC is made, the
arguments are passed to the remote procedure and the caller waits for a response to be
returned. In the illustration below, the client makes a procedure call that sends a request
to the server. The client process waits until either a reply is received, or it times out.
When the request arrives at the server, it calls a dispatch routine that performs the
requested service, and sends the reply to the client. After the RPC call is completed, the
client process continues.
Threads are common in RPC-based distributed systems. Each incoming request to a
server typically spawns a new thread. A thread in the client typically issues an RPC and
then blocks (waits). When the reply is received, the client thread resumes execution.
A programmer writing RPC-based code does three things:
1. Specifies the protocol for client-server communication
2. Develops the client program
3. Develops the server program
The communication protocol is created by stubs generated by a protocol compiler. A stub
is a routine that doesn't actually do much other than declare itself and the parameters it
accepts. The stub contains just enough code to allow it to be compiled and linked.
The client and server programs must communicate via the procedures and data types
specified in the protocol. The server side registers the procedures that may be called by
the client and receives and returns data required for processing. The client side calls the
remote procedure, passes any required data and receives the returned data.
Thus, an RPC application uses classes generated by the stub generator to execute an RPC
and wait for it to finish. The programmer needs to supply classes on the server side that
provide the logic for handling an RPC request.
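The three-part division above (protocol, server program, client program) can be sketched with Python's built-in XML-RPC modules, which here play the role of a stub generator and its output. The port and procedure name are illustrative; the proxy object acts as the client stub, so the remote call looks like a local function call while the caller blocks waiting for the reply:

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# Server side: register the procedure that clients may call, then serve
# requests on a background thread.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
port = server.server_address[1]
server.register_function(lambda a, b: a + b, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the proxy is the client stub. Arguments are marshalled,
# sent over the network, and the call blocks until the reply arrives.
client = ServerProxy(f"http://127.0.0.1:{port}/")
result = client.add(2, 3)
print(result)   # 5
```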
RPC introduces a set of error cases that are not present in local procedure programming.
For example, a binding error can occur when a server is not running when the client is
started. Version mismatches occur if a client was compiled against one version of a
server, but the server has now been updated to a newer version. A timeout can result from
a server crash, network problem, or a problem on a client computer.
Some RPC applications view these types of errors as unrecoverable. Fault-tolerant
systems, however, have alternate sources for critical services and fail-over from a
primary server to a backup server.
A challenging error-handling case occurs when a client needs to know the outcome of a
request in order to take the next step, after failure of a server. This can sometimes result
in incorrect actions and results. For example, suppose a client process requests a ticket-
selling server to check for a seat in the orchestra section of Carnegie Hall. If it's available,
the server records the request and the sale. But the request fails by timing out. Was the
seat available and the sale recorded? Even if there is a backup server to which the request
can be re-issued, there is a risk that the client will be sold two tickets, which is an
expensive mistake in Carnegie Hall.
Here are some common error conditions that need to be handled:
Network data loss resulting in retransmit: Often, a system tries to achieve 'at most
once' delivery. In the worst case, if duplicate transmissions occur, we try to
minimize any damage done by the data being received multiple times.
Server process crashes during RPC operation: If a server process crashes before it
completes its task, the system usually recovers correctly because the client will
initiate a retry request once the server has recovered. If the server crashes after
the task but before the RPC reply is sent, duplicate requests sometimes result due to
client retries.
Client process crashes before receiving response: Client is restarted. Server
discards response data.
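The duplicate-request hazard above (the double ticket sale) is commonly handled by tagging each logical request with a unique id so the server can recognize a retry and replay the original reply instead of executing again. This Python sketch is illustrative; the class and method names are hypothetical:

```python
import uuid

class TicketServer:
    """At-most-once execution: completed requests are remembered so a
    retransmitted duplicate does no further damage."""
    def __init__(self):
        self.completed = {}    # request_id -> reply from first execution
        self.seats_sold = 0

    def sell_seat(self, request_id):
        if request_id in self.completed:     # duplicate retry: replay reply
            return self.completed[request_id]
        self.seats_sold += 1                 # execute exactly once
        result = f"ticket-{self.seats_sold}"
        self.completed[request_id] = result
        return result

server = TicketServer()
req = str(uuid.uuid4())          # one id per logical request
first = server.sell_seat(req)    # original request
retry = server.sell_seat(req)    # client retry after a lost reply
print(first == retry, server.seats_sold)   # True 1
```

A real server would also need to persist the completed-request table, since an in-memory table is lost if the server process itself crashes.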
Fundamental Design Principles for Distributed Systems
We can define some fundamental design principles which every distributed system
designer and software engineer should know. Some of these may seem obvious, but it
will be helpful as we proceed to have a good starting list.
As Ken Arnold says: "You have to design distributed systems with the expectation of
failure." Avoid making assumptions that any component in the system is in a
particular state. A classic error scenario is for a process to send data to a process
running on a second machine. The process on the first machine receives some data
back and processes it, and then sends the results back to the second machine
assuming it is ready to receive. Any number of things could have failed in the interim
and the sending process must anticipate these possible failures.
Explicitly define failure scenarios and identify how likely each one might occur.
Make sure your code is thoroughly covered for the most likely ones.
Both clients and servers must be able to deal with unresponsive senders/receivers.
Think carefully about how much data you send over the network. Minimize traffic as
much as possible.
Latency is the time between initiating a request for data and the beginning of the
actual data transfer. Minimizing latency sometimes comes down to a question of
whether you should make many little calls/data transfers or one big call/data transfer.
The way to make this decision is to experiment. Do small tests to identify the best
compromise.
Don't assume that data sent across a network (or even sent from disk to disk in a rack)
is the same data when it arrives. If you must be sure, do checksums or validity checks
on data to verify that the data has not changed.
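A checksum check of the kind described above can be sketched with a standard hash function: the sender attaches a digest of the payload, and the receiver recomputes it to verify the data was not altered in transit. The payload here is illustrative:

```python
import hashlib

def with_checksum(payload: bytes):
    """Return the payload together with its SHA-256 digest."""
    return payload, hashlib.sha256(payload).hexdigest()

def verify(payload: bytes, checksum: str) -> bool:
    """Recompute the digest on arrival and compare."""
    return hashlib.sha256(payload).hexdigest() == checksum

data, digest = with_checksum(b"order: 3 units, part #1742")
assert verify(data, digest)              # intact data passes
corrupted = data.replace(b"3", b"8")     # a single changed byte...
print(verify(corrupted, digest))         # False: ...is detected
```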
Caches and replication strategies are methods for dealing with state across
components. We try to minimize stateful components in distributed systems, but it's
challenging. State is something held in one place on behalf of a process that is in
another place, something that cannot be reconstructed by any other component. If it
can be reconstructed it's a cache. Caches can be helpful in mitigating the risks of
maintaining state across components. But cached data can become stale, so there may
need to be a policy for validating a cached data item before using it.
If a process stores information that can't be reconstructed, then problems arise. One
possible question is, "Are you now a single point of failure?" I have to talk to you
now - I can't talk to anyone else. So what happens if you go down? To deal with this
issue, you could be replicated. Replication strategies are also useful in mitigating the
risks of maintaining state. But there are challenges here too: What if I talk to one
replica and modify some data, then I talk to another? Is that modification
guaranteed to have already arrived at the other? What happens if the network gets
partitioned and the replicas can't talk to each other? Can anybody proceed?
There are a set of tradeoffs in deciding how and where to maintain state, and when to
use caches and replication. It's more difficult to run small tests in these scenarios
because of the overhead in setting up the different mechanisms.
Be sensitive to speed and performance. Take time to determine which parts of your
system can have a significant impact on performance: Where are the bottlenecks and
why? Devise small tests you can do to evaluate alternatives. Profile and measure to
learn more. Talk to your colleagues about these alternatives and your results, and
decide on the best solution.
Retransmission is costly. It's important to experiment so you can tune the delay that
prompts a retransmission to be optimal.
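One common starting point for tuning that delay is exponential backoff: each failed attempt waits roughly twice as long as the last, up to a cap, so a congested network is not flooded with retries. The parameter values in this Python sketch are illustrative defaults to experiment with, not recommendations; optional jitter (pass a random.Random instance) spreads out retries from many clients:

```python
import random

def backoff_delays(base=0.5, factor=2.0, cap=30.0, attempts=6, jitter=None):
    """Return the delay in seconds before each retransmission attempt."""
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * factor ** attempt)   # grow, but cap the wait
        if jitter is not None:
            delay += jitter.uniform(0, delay)        # randomize to avoid lockstep retries
        delays.append(delay)
    return delays

print(backoff_delays())   # [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
```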