Introduction to Distributed Systems and Networking.

Date posted: 22-Dec-2015

Announcements

• Homework 4 due today
• Attempting to schedule Prelim II for Thursday, April 26th

Goals for today

• Introduction to Distributed Systems
• Introduction to Networking

Centralized vs Distributed Systems

• Centralized System: a system in which major functions are performed by a single physical computer
  – Originally, everything on a single computer
  – Later: client/server model
• Distributed System: physically separate computers working together on some task
  – Early model: multiple servers working together
    • Probably in the same room or building
    • Often called a “cluster”
  – Later models: peer-to-peer/wide-spread collaboration

[Figure: Client/Server Model (with a central Server) and Peer-to-Peer Model]

Distributed Systems

Definition: loosely coupled processors interconnected by a network

• A distributed system is a piece of software that ensures:
  – Independent computers appear as a single coherent system
• Lamport: “A distributed system is a system where I can’t get my work done because a computer that I’ve never heard of has failed”

Why use distributed systems?

• These are now a requirement:
  – Economics dictate that we buy small computers
  – Cheap way to provide reliability
  – We all need to communicate
  – It is much easier to share resources
  – Allows a whole set of distributed applications
  – A whole set of future problems need machine communication
• Collaboration: much easier for users to collaborate through network resources (such as network file systems)
  – …

Distributed Systems: Issues

• The promise of distributed systems:
  – Higher availability: one machine goes down, use another
  – Better durability: store data in multiple locations
  – More security: each piece easier to make secure
• Reality has been disappointing
  – Worse availability: depend on every machine being up
    • Lamport: “a distributed system is one where I can’t do work because some machine I’ve never heard of isn’t working!”
  – Worse reliability: can lose data if any machine crashes
  – Worse security: anyone in the world can break into the system
• Coordination is more difficult
  – Must coordinate multiple copies of shared state information (using only a network)
  – What would be easy in a centralized system becomes a lot more difficult

Distributed Systems Goals

• Connecting resources and users
• Transparency: the ability of the system to mask its complexity behind a simple interface
  – Location: Can’t tell where resources are located
  – Migration: Resources may move without the user knowing
  – Replication: Can’t tell how many copies of resource exist
  – Concurrency: Can’t tell how many users there are
  – Parallelism: System may speed up large jobs by splitting them into smaller pieces
  – Fault Tolerance: System may hide various things that go wrong in the system
• Openness: portability, interoperability
• Scalability: size, geography, administrative
• Transparency and collaboration require some way for different processors to communicate with one another

Software Concepts

System | Description | Main Goal
Distributed OS | Tightly coupled OS for multiprocessors and homogeneous machines | Hide and manage hardware resources
Networked OS | Loosely coupled OS for heterogeneous computers, LAN/WAN | Offer local services to remote clients
Middleware | Additional layer atop NOS implementing general-purpose services | Provide distribution transparency

[Figure: Distributed Applications run on a Middleware layer that spans Machine A, Machine B, and Machine C, each with its own Local OS, all connected by the Network]

Some Applications

• Air traffic control
• Banking, stock markets
• Military applications
• Health care, hospital automation
• Telecommunications infrastructure
• E-commerce, e-cash
• …

Few Challenges

• No shared clocks
  – How to order events
• No shared memory
  – Inconsistent system state
• Scalability
• Fault tolerance
  – Availability, recoverability
• Consensus
• Self management
• Security
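One well-known answer to "how to order events" without a shared clock is a logical clock in the style of Lamport. The sketch below is illustrative only; the `Process` class and its method names are assumptions made for this example, not something defined in the slides.

```python
from dataclasses import dataclass


@dataclass
class Process:
    """A process that keeps a logical counter instead of wall-clock time."""
    name: str
    clock: int = 0

    def local_event(self) -> int:
        # Every local event advances the counter by one.
        self.clock += 1
        return self.clock

    def send(self) -> int:
        # Sending is an event; the returned timestamp travels with the message.
        self.clock += 1
        return self.clock

    def receive(self, msg_timestamp: int) -> int:
        # Jump past the sender's timestamp so the receive is ordered after the send.
        self.clock = max(self.clock, msg_timestamp) + 1
        return self.clock


def demo():
    a, b = Process("A"), Process("B")
    a.local_event()            # A's clock becomes 1
    sent_at = a.send()         # A's clock becomes 2
    b.local_event()            # B's clock becomes 1
    received_at = b.receive(sent_at)  # B's clock becomes max(1, 2) + 1
    return sent_at, received_at
```

The only guarantee illustrated here is that a receive is always timestamped later than the matching send, even though the two machines never compare real clocks.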

Networking

• Middleware gives guarantees not provided by networking
• How do you connect computers?
  – Local area network (LAN)
  – Wide area network (WAN)
• Let us consider the example of the Internet

Internet: Example

• Click -> get page
• The link specifies
  – protocol (http)
  – location (www.cnn.com)

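Internet: Locating Resource

• www.cnn.com
  – name of a computer
  – Implicitly also a file (index.html)
• Map name to internet protocol (IP) address
  – Domain name system (DNS)

[Figure: the host asks its local name server "cnn.com?", the local server asks the com server "cnn.com?", and the address a.b.c.d is returned along the same path]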
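The name-to-address lookup can be sketched with a toy resolver. Everything here is hypothetical: the `LocalResolver` class, the in-memory `COM_SERVER` table, and the address (a documentation-only placeholder, not CNN's real address). The caching step is also an assumption added for illustration; real DNS involves more servers and record types.

```python
# Hypothetical data held by the "com" server. 192.0.2.10 is a reserved
# documentation address, standing in for the a.b.c.d of the slide.
COM_SERVER = {"cnn.com": "192.0.2.10"}


class LocalResolver:
    """Toy local name server: answers from its cache or asks upstream."""

    def __init__(self, upstream):
        self.upstream = upstream      # the table it forwards queries to
        self.cache = {}               # answers already learned
        self.upstream_queries = 0     # how many times we had to ask upstream

    def resolve(self, name):
        if name in self.cache:
            return self.cache[name]
        self.upstream_queries += 1
        address = self.upstream.get(name)   # None if the name is unknown
        if address is not None:
            self.cache[name] = address
        return address
```

A host would call `LocalResolver(COM_SERVER).resolve("cnn.com")` and then open its connection to the returned address.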

Internet: Connection

• HTTP (Hypertext Transfer Protocol) sets up a connection
  – TCP connection (Transmission Control Protocol)
  – between the host and cnn.com to transfer the page
• The connection transfers the page as a byte stream
  – without errors: flow control + error control

[Figure: message exchange between Host and www.cnn.com: Connect, OK, Get page, Page; close]
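The "Get page" / "Page; close" part of the exchange can be imitated locally. This is a simplified sketch, not real HTTP: it uses `socket.socketpair()`, which returns two stream sockets that are already connected, so the "Connect / OK" step is skipped. The request line and page contents are made up for the example.

```python
import socket
import threading


def server(conn):
    """Play the role of www.cnn.com: read one request, send the page, close."""
    request = conn.recv(1024)
    if request.startswith(b"GET "):
        conn.sendall(b"<html>hello</html>")
    conn.close()  # closing the stream tells the host the page is complete


def fetch_page():
    # Two connected stream sockets stand in for the TCP connection.
    host_sock, server_sock = socket.socketpair()
    worker = threading.Thread(target=server, args=(server_sock,))
    worker.start()

    host_sock.sendall(b"GET /index.html\r\n")

    # The page arrives as a byte stream: read small pieces until end of stream.
    chunks = []
    while True:
        data = host_sock.recv(4)
        if not data:
            break
        chunks.append(data)

    host_sock.close()
    worker.join()
    return b"".join(chunks)
```

Reading four bytes at a time is deliberate: the receiver sees a stream of bytes, not message boundaries, and has to reassemble the page itself.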

Internet: End-to-end

• Byte stream flows end to end across many links/switches:
  – routing (+ addressing)
• That stream is regulated and controlled by both ends:
  – retransmission of erroneous or missing bytes; flow control

[Figure: HOST and CNN.COM connected across the network; routing happens inside the network, pacing and error control happen end to end]
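Retransmission by the end hosts can be illustrated with a stop-and-wait loop. This is a minimal simulation under stated assumptions: the function name and the `drop_pattern` argument are invented for the example, only data transmissions (not ACKs) are lost, and a "timeout" is simply the next loop iteration.

```python
def send_reliably(chunks, drop_pattern):
    """Deliver chunks in order over a channel that loses some transmissions.

    drop_pattern yields one boolean per transmission attempt; True means
    that attempt is lost, so no ACK arrives and the sender retransmits.
    """
    drops = iter(drop_pattern)
    received = []
    attempts = 0
    for seq, chunk in enumerate(chunks):
        while True:
            attempts += 1
            lost = next(drops, False)  # after the pattern ends, nothing is lost
            if lost:
                continue               # timeout: send the same chunk again
            # Receiver accepts only the next expected sequence number, then ACKs.
            if seq == len(received):
                received.append(chunk)
            break
    return received, attempts
```

For example, with three chunks and the pattern `[False, True, True, False]`, the second chunk is sent three times before it gets through, and all three chunks still arrive in order.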

Internet: Packets

• The network transports bytes grouped into packets
• Packets are “self-contained”; routers handle them one by one
• The end hosts worry about errors and pacing
  – Destination sends ACKs; source checks for losses

[Figure: a packet from CNN.COM (address A) to HOST (address B) with layout A | B | #, CRC | bytes; a router table entry reads "B: to C"]
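The packet layout in the figure (source, destination, sequence number, CRC, then the bytes) can be sketched with Python's `struct` and `zlib` modules. The field widths chosen here are assumptions made for illustration, not the real IP or TCP header format, and the CRC covers only the payload.

```python
import struct
import zlib

# Assumed header layout, network byte order:
# 1-byte source, 1-byte destination, 2-byte sequence number, 4-byte CRC-32.
HEADER = struct.Struct("!BBHI")


def make_packet(src, dst, seq, payload):
    """Prefix the payload with a header that carries its CRC-32."""
    crc = zlib.crc32(payload)
    return HEADER.pack(src, dst, seq, crc) + payload


def parse_packet(packet):
    """Split a packet into its fields and report whether the CRC matches."""
    src, dst, seq, crc = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:]
    ok = zlib.crc32(payload) == crc
    return src, dst, seq, payload, ok
```

Because every packet carries its own addresses and checksum, a router can forward it without knowing anything about the packets before or after it, and the receiver can detect a damaged payload and withhold its ACK.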

Internet: Bits

• Equipment in each node sends packets as a string of bits
• That equipment is not aware of the meaning of the bits
• Frames (packetizing) vs. streams

[Figure: a Transmitter sends the bits 01011...011...110 over a Physical Medium (optical, copper, wireless) to a Receiver]

Internet: Points to remember

• Separation of tasks
  – send bits on a link: transmitter/receiver [clock, modulation,…]
  – send packet on each hop [framing, error detection,…]
  – send packet end to end [addressing, routing]
  – pace transmissions [detect congestion]
  – retransmit erroneous or missing packets [acks, timeout]
  – find destination address from name [DNS]
• Scalability
  – routers don’t know full path
  – names and addresses are hierarchical

Internet: Challenges

• Addressing?
• Routing?
• Reliable transmission?
• Interoperability?
• Resource management?
• Quality of service?

Concepts at heart of the Internet

• Protocol
• Layered Architecture
• Packet Switching
• Distributed Control
• Open System

Protocol

• Two communicating entities must agree on:
  – Expected order and meaning of messages they exchange
  – The action to perform on sending/receiving a message
• Example: asking the time

Layered Architectures

• Human beings can handle lots of complexity in their protocol processing.
  – Ambiguously defined protocols
  – Many protocols all at once
• How do computers manage complex protocol processing?
  – Specify well-defined protocols to enact.
  – Decompose complicated jobs into layers;
    • each has a well-defined task

Layered Architectures

• Break up design problem into smaller problems
  – More manageable
• Modular design: easy to extend/modify.
• Difficult to implement
  – careful with interaction of layers for efficiency

Layered Architecture

Layer | Function
Applications | Web, e-mail, file transfer, ...
Middleware | Reliable/ordered transmission, QOS, security, compression, ...
Routing | End-to-end transmission, resource allocation, routing, ...
Physical Links | Point-to-point links, LANs, radios, ...

(Layers are listed from the users at the top to the network at the bottom.)

The OSI Model

• Open Systems Interconnection (OSI)
  – standard way of understanding conceptual layers of network communication
  – This is a model; nobody builds systems like this.
• Each level
  – provides certain functions and guarantees
  – communicates with the same level on remote nodes.
• A message
  – generated at the highest level
  – is passed down the levels, encapsulated by lower levels
  – until it is sent over the wire.
• On the destination
  – Encapsulated message makes its way up the layers
  – until the high-level message reaches its high-level destination.
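The down-then-up path of a message can be sketched as wrapping and unwrapping headers. This is a toy illustration: the bracketed text headers and the three-layer list are assumptions chosen for readability, whereas real protocols use binary headers with many fields.

```python
# Layers below the application, listed from top to bottom.
LAYERS = ["transport", "network", "data link"]


def send_down(message: bytes) -> bytes:
    """Each layer, going down, wraps what it was handed in its own header."""
    frame = message
    for layer in LAYERS:
        frame = f"[{layer}]".encode() + frame
    return frame


def receive_up(frame: bytes) -> bytes:
    """Each layer, going up, strips only its own header and passes the rest on."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame
```

`send_down(b"GET page")` produces `b"[data link][network][transport]GET page"`: the lowest layer's header ends up outermost, which is why it is the first one removed on the destination.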

OSI Levels

[Figure: Node A and Node B each have the full stack (Application, Presentation, Session, Transport, Network, Data Link, Physical) and are connected through the network]

OSI Levels

• Physical Layer
  – electrical details of bits on the wire
• Data Link Layer
  – sending “frames” of bits and error detection
• Network Layer
  – routing packets to the destination
• Transport Layer
  – reliable transmission of messages, disassembly/assembly, ordering, retransmission of lost packets
• Session Layer
  – really part of transport; typically not implemented
• Presentation Layer
  – data representation in the message
• Application
  – high-level protocols (mail, ftp, etc.)

Internet protocol stack

Layer | Protocols
Application | HTTP, SMTP, FTP, TELNET, DNS, …
Transport | TCP, UDP
Network | IP
Physical | Point-to-point links, LANs, radios, ...

(Layers are listed from the users at the top to the network at the bottom.)

Air travel

Passenger Origin | Passenger Destination
Ticket (purchase) | Ticket (complain)
Baggage (check) | Baggage (claim)
Gates (load) | Gates (unload)
Runway (take off) | Runway (landing)

Airplane routing connects the two sides at the bottom layer.

Summary

• Network: physical connection that allows two computers to communicate
  – Packet: unit of transfer, sequence of bits carried over the network
• Protocol: agreement between two parties as to how information is to be transmitted
• Internet Protocol (IP)
  – Used to route messages through routes across the globe
  – 32-bit addresses (IPv4); 16-bit ports (carried in the TCP/UDP headers)
• Reliable, Ordered, Arbitrary-sized Messaging:
  – Built through protocol layering on top of unreliable,