
1

CS716

Advanced Computer Networks

By Dr. Amir Qayyum

Lecture No. 1

3

Instructor’s Introduction

• Ph.D. Mobile Wireless Net., Univ. of Paris-Sud, France

– Elect. Engg., U.E.T. Lahore

– M.S. Comp. Engg., E.S.I.M., France

– D.E.A. Parallel Comp. Arch., Univ. of Paris-Sud, France

• Active participant of MANET WG of IETF since 1997

– Co-author of an RFC on routing protocol for MANETs

• Associated with INRIA, France as a research fellow

– Worked on Praxitele, PRIMA and IPANEMA projects

• Protocol implementation experience at Enabling Technologies

– Implementation of RTP-RTCP/UDP-TCP/IP stack for a RISC-based packet processor, including CRTP, PPP/MP, IPsec, QoS

• Currently working at CARE Pvt. Ltd. and CASE

– Design/development of long-range, secure mobile ad hoc networks

4

Significance and Rationale of course

• Networks and telecommunications are becoming more and more important

• Future telecom networks will be more oriented toward “networks” than toward “communication”

• The widespread Internet, diffused into our daily life, is a ground reality; it is beneficial to understand it

• It is fun to play with protocols (software) and to be able to design exciting new types of networks

5

Foundation Course in Network Stream

• This Computer Networks course is a foundation course for “Networks” as the Area of Specialization

• It is a required pre-requisite course for the following advanced level courses in networking

– Performance Analysis of Communication Networks

– Integrated Services over Packet Networks

– Computer Network Security

– Mobile Networking

• It is a recommended pre-requisite for the course:

– Network and System Programming

6

Course Material

• Reference books

– Many textbooks on Networking may be consulted

• Lots of research papers!

– Many will be made available on course web site

• RFCs and Internet drafts

– Related to TCP/IP suite and other protocols

• Web resources

– Tutorials, white papers, reports, etc.

7

Text Book

• Larry L. Peterson and Bruce S. Davie, Computer Networks: A Systems Approach

– Third Edition [2003], Morgan Kaufmann Publishers, San Mateo, California, USA

• W. Richard Stevens, UNIX Network Programming, Volume 1 (Networking APIs: Sockets and XTI)

– Second Edition, Prentice Hall

– Recommended ONLY for programming part

8

Prerequisites

• Required:

• An undergraduate level course on computer networks

• Good knowledge of C language, preferably in Unix/Linux

– e.g. C and Data Structures course

9

Course Evaluation

• Assignments

• Quizzes

• Research Papers

• Term Projects / Reports

• Midterm exam

• Final exam

10

Expectations

What do you want (or expect) to learn from this course?

11

Expectations

• This course IS about …

– Network principles and concepts

– General purpose computer networks

– Internet perspective

• Major components of the Internet protocol suite

– Network software

– Designing and building a system

12

Expectations

• This course IS NOT about …

– Survey of existing protocol standards

– Specialized networks (e.g. CATV, telephone)

– OSI perspective

– Network hardware

• Data transmission on physical layer

– Queuing theory (we will survey it, if time permits)

13

Expectations

We will learn why networks are the way they are

14

Objectives: Principles and Concepts

• At the end of this course, you should be able to:

– Identify the problems that arise in networked communication

– Explain advantages/disadvantages of existing solutions to these problems in different networking scenarios

– Evaluate novel approaches to these problems

– Understand the components of Internet protocol suite

– Understand the implications of a given solution for performance in various networking environments

15

Objectives: Programming

• At the end of this course, you should be able to:

– Identify and describe the purpose of each component of the TCP/IP protocol suite

– Develop client-server applications using TCP/IP

– Understand the impact of trends in network hardware on network software issues

– Understand over 1000 useful (or useless) VUAs

16

Course Contents

• Overview

• Introduction to network programming

• Direct link networks

• Packet switching

• Internetworking

• End-to-end protocols

• Congestion control and resource allocation

• End-to-end data

• Applications

• Performance analysis and queuing theory (?)

17

Introduction

• Outline

– Statistical Multiplexing

– Inter-Process Communication

– Performance Metrics

– Network Architecture

18

What is Your Over-ambitious Goal ?

• Build a computer network which

– Can grow to global proportions

– Support diverse applications

• Then … think about

– Underlying building blocks

• Which available technologies to use

– Integrating the blocks to communicate

• Which software architecture to use

Network Design

Before looking inside a computer network, let us first agree on what a computer network is.

20

Computer Network ?

• Set of serial lines to attach terminals to mainframe ?

• Telephone network carrying voice traffic ?

• Cable network to disseminate video signals ?

Specialized to handle:

Keystrokes

Voice

Video

21

What Distinguishes a Computer Network ?

• Generality

• Built from general purpose programmable hardware

• Supports wide range of applications

22

Applications – Users’ Contact with the Network

• Most people know the Internet through its applications

– Web, email, streaming audio and video, chat, …

• Applications present an intuitively simple interface

– Textual and graphical objects

– Simple “clicks” to maneuver the application

• However, users are not aware of what happens in the network with their simple “clicks” !!!

23

Applications – Consumers of Networks

• On a simple click, several messages may be exchanged over the Internet

• In a web browser, 17 messages may be exchanged

– up to six messages to translate the server name

– three messages to set up a TCP connection

– four messages to send HTTP “get” request + response

– four messages to tear down the TCP connection

• Moreover, millions of messages are exchanged each day by Internet nodes to make their presence and services known

24

Applications – the Driving Force

• Streaming audio and video is an emerging application

– Source generates and sends the video stream in messages across the Internet

• Video-on-demand: reads a preexisting movie

– One-way data transfer

• Videoconferencing: interactive session

– Very tight timing constraints

• The diversity of applications that can be built on top of the Internet hints at the complexity of the Internet design

25

Our Road Map …

• Fortunately, we are not the first to build a computer network

• Let's start by exploring the path that others have already dug deep

• By asking (and answering) why networks are designed the way they are

26

Network Overview

• What must a network provide ?

– Connectivity

– Cost-effective sharing

– Functionality

– Performance

• How are networks designed and built ?

– Layering

– Protocols

– Standards

27

Perspective

• For network user

– Connectivity: for services required; error-free delivery within acceptable time limits

• For network designer

– Efficiency: cost-effective design, fair allocation and efficient use of resources

• For network operator

– Maintenance: easy to administer, fault localization & isolation, usage accounting

28

Building Blocks

• Nodes: PC, special-purpose hardware…

– Hosts

– Switches

• Links: coax cable, optical fiber…

– Point-to-point

– Multiple access …

29

Why not connect each node with every other node ?

• Number of computers that can be connected becomes very limited

• Number of wires coming out of each node becomes unmanageable

• Amount of physical hardware/devices required becomes very expensive

• Solution: indirect connectivity using intermediate data forwarding nodes
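The scaling problem behind these bullets can be made concrete with a small sketch. It assumes a full mesh, where each of n nodes has a direct link to every other node, which needs n(n-1)/2 links in total and n-1 wires per node. The function name `full_mesh_links` is illustrative and not part of the lecture.

```c
#include <assert.h>

/* Hypothetical helper: number of point-to-point links needed to connect
 * n nodes directly to each other (full mesh) = n * (n - 1) / 2. */
unsigned long long full_mesh_links(unsigned long long n)
{
    assert(n <= 4000000000ULL);   /* keep n * (n - 1) within 64 bits */
    return n < 2 ? 0 : n * (n - 1) / 2;
}
```

For example, 100 nodes would already need 4950 links, which is why switched (indirect) connectivity is preferred.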

30

Switched Networks

• A network can be defined recursively as...

– Two or more nodes connected by a link

– Circular nodes (switches) implement the network

– Square nodes (hosts) use the network

31

Switched Networks

• A network can be defined recursively as...

– Two or more networks connected by one or more nodes: internetworks

– Circular nodes (routers or gateways) interconnect the networks

– A cloud denotes “any type of independent network”

32

A Network

A network can be defined recursively as

Two or more nodes connected by a physical link

OR

Two or more networks connected by one or more nodes

33

Components of a Network

34

Switching Strategies

• Circuit switching: carry bit streams

a. establishes a dedicated circuit

b. links reserved for use by communication channel

c. send/receive bit stream at constant rate

d. example: original telephone network

• Packet switching: store-and-forward messages

a. operates on discrete blocks of data

b. utilizes resources dynamically according to traffic demand

c. send/receive messages at variable rate

d. example: Internet

35

What next ?

• Hosts are directly or indirectly connected to each other

• Can we now provide host-host connectivity ?

• A node must be able to say which host it wants to communicate with

36

Addressing and Routing

• Address: byte-string that identifies a node

– Usually unique

• Routing: forwarding decisions

– Process of determining how to forward messages to the destination node based on its address

• Types of addresses

– unicast: node-specific

– broadcast: all nodes on a network

– multicast: some subset of nodes on a network

37

Wrap-up

• A network can be constructed by nesting networks

• An address is required for each node that is reachable on the network

• Address is used to route messages toward appropriate destination

1

CS716


By Dr. Amir Qayyum

Lecture No. 2

3

Multiplexing

• Physical links/switches must be shared among users

– (synchronous) Time-Division Multiplexing (TDM)

– Frequency-Division Multiplexing (FDM)

[Figure: Multiple flows on a single link. Hosts L1, L2 and L3 reach hosts R1, R2 and R3 through Switch 1 and Switch 2.]

Do you see any problem with TDM / FDM ?

4

Statistical Multiplexing

• On-demand time-division, possibly synchronous (ATM)

• Schedule link on a per-packet basis

• Buffer packets in switches that are contending for the link

• Packets from different sources interleaved on link


Do you see any problem ?

5

Statistical Multiplexing

• An application needs to break up its message into packets, and re-assemble it at the receiver

• Fair allocation of link capacity: FIFO, round-robin or QoS

• If congestion occurs at a switch, its buffer may overflow and packets may be lost
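The buffering and loss behaviour described above can be sketched as a bounded FIFO queue at a switch output port. This is only an illustration under assumed names (`struct out_queue`, `queue_enqueue`, `queue_dequeue`) and an assumed buffer size of four packets; real switches use more elaborate scheduling.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4            /* assumed buffer size, in packets */

struct packet { int flow_id; int seq; };

struct out_queue {
    struct packet slots[QUEUE_CAPACITY];
    size_t head;                    /* index of the oldest buffered packet */
    size_t count;                   /* packets currently buffered */
    unsigned long dropped;          /* packets lost to buffer overflow */
};

void queue_init(struct out_queue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    q->dropped = 0;
}

/* A packet contends for the link: buffer it, or drop it if the buffer is
 * full (congestion). Returns 1 if buffered, 0 if dropped. */
int queue_enqueue(struct out_queue *q, struct packet p)
{
    assert(q != NULL);
    if (q->count == QUEUE_CAPACITY) {
        q->dropped++;
        return 0;
    }
    q->slots[(q->head + q->count) % QUEUE_CAPACITY] = p;
    q->count++;
    return 1;
}

/* The link is free: transmit the oldest packet (FIFO order), whichever
 * flow it belongs to. Returns 1 and fills *out, or 0 if nothing is queued. */
int queue_dequeue(struct out_queue *q, struct packet *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return 0;
    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return 1;
}
```

Packets from different flows end up interleaved on the link in arrival order; a round-robin or QoS scheduler would replace the single FIFO with one queue per flow.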


6

Enough … ?

• A network is delivering packets among a collection of computers

• How do application processes communicate in a meaningful way ?

• Hide network complexity by implementing the common services once

7

Inter-Process Communication

• Turn host-to-host connectivity into process-to-process communication, making the communication meaningful.

• Fill gap between what applications expect and what the underlying technology provides.

[Figure: Abstraction for application-level communication. Two applications on different hosts are joined by a channel that runs across the network of hosts.]

8

IPC Abstractions

• Semantics and interface depend on applications

• Request/Reply

– distributed file systems

• file servers (FTP)

– digital libraries / HTTP

• information retrieval

• Message stream

– video on-demand

– video conferencing

• delay sensitive

• two-way frame flow

• 1/4 NTSC = 352 x 240 pixels

• (352 x 240 x 24)/8 = 247.5 KB per frame

• 30 fps ≈ 7500 KBps ≈ 60 Mbps

• 10 fps + compression < 10 Mbps

• reliable ?

• prioritized ?

• delay/bandwidth guarantees ?
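The frame-size and stream-rate arithmetic above can be checked with a short sketch. The helper names `frame_bytes` and `stream_bps` are illustrative; the inputs (352 x 240 pixels, 24 bits per pixel, 30 frames per second) are the slide's own figures, and the result is the raw, uncompressed rate.

```c
#include <assert.h>

/* Bytes in one uncompressed video frame. */
double frame_bytes(int width, int height, int bits_per_pixel)
{
    assert(width > 0 && height > 0 && bits_per_pixel > 0);
    return (double)width * height * bits_per_pixel / 8.0;
}

/* Raw stream rate in bits per second for a given frame size and frame rate. */
double stream_bps(double bytes_per_frame, double frames_per_second)
{
    assert(bytes_per_frame >= 0.0 && frames_per_second >= 0.0);
    return bytes_per_frame * 8.0 * frames_per_second;
}
```

`frame_bytes(352, 240, 24)` gives 253,440 bytes (247.5 KB with KB = 1024 bytes), and 30 such frames per second come to roughly 60 Mbps, which is why lower frame rates and compression are needed to stay under 10 Mbps.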

9

Abstract Channel Functionality

• What functionality does a channel provide ?

– Smallest set of abstract channel types adequate for largest number of applications

• Where is the functionality implemented ?

– Network as a simple bit-pipe with all high-level communication semantics at the hosts

– More intelligent switches allowing hosts to be “dumb” devices (telephone network)

10

What Goes Wrong in the Network?

Reliability at stake

• Bit-level errors (electrical interference)

• Packet-level errors (congestion)

– distinction between lost and late packet

• Link and node failures

– distinction between broken and flaky link

– distinction between failed and slow node

11

What Goes Undesirable in the Network?

Required performance at stake

• Messages are delayed

• Messages are delivered out-of-order

• Third parties eavesdrop

• The challenge is to fill the gap between application expectations and hardware capabilities

12

Performance Metrics

• … and to do so while delivering “good” performance

• Bandwidth (throughput)

– data transmitted per unit time, e.g. 10 Mbps

– link bandwidth versus end-to-end bandwidth

– notation

• KB = 2^10 bytes

• Kbps = 10^3 bits per second

13

Performance Metrics

[Figure: Bandwidth related to “bit width”, showing bits sent during 1 second: (a) at 1 Mbps each bit is 1 µs wide; (b) at 2 Mbps each bit is 0.5 µs wide.]

14

Performance Metrics

• Latency / delay

– time to send message from point A to point B

– one-way versus round-trip time (RTT)

– components

Latency = Propagation + Transmit + Queue

Propagation = Distance / c

Transmit = Size / Bandwidth

• Note:

• No queuing delay in direct (point-to-point) link

• Bandwidth irrelevant if size = 1 bit

• Process-to-process latency includes software processing overhead (dominates over shorter distances)
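The three latency components can be written out as a small function. This is a sketch with an illustrative name (`latency_seconds`); the signal speed is passed in as a parameter because the value of c depends on the medium, which the slide does not specify.

```c
#include <assert.h>

/* Latency = Propagation + Transmit + Queue, all in seconds.
 *   Propagation = distance / signal speed in the medium
 *   Transmit    = message size / link bandwidth
 *   Queue       = time spent waiting in switch buffers (0 on a direct link) */
double latency_seconds(double distance_m, double signal_speed_mps,
                       double size_bits, double bandwidth_bps,
                       double queue_s)
{
    assert(signal_speed_mps > 0.0 && bandwidth_bps > 0.0);
    double propagation = distance_m / signal_speed_mps;
    double transmit = size_bits / bandwidth_bps;
    return propagation + transmit + queue_s;
}
```

With a 1-bit message the transmit term is negligible, so the result is essentially the propagation delay, matching the note that bandwidth is irrelevant if size = 1 bit.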

15

Bandwidth versus Latency

• Relative importance depends on application

• 1-byte character:

– Choice of 1 ms vs 100 ms dominates 1 Mbps vs 100 Mbps

• 25 MB file:

– Choice of 1 Mbps vs 100 Mbps dominates 1 ms vs 100 ms

• Large data (file transfer) is bandwidth critical

• Small data (HTTP) is latency critical
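A rough calculation shows which term dominates in each case. The sketch below assumes perceived latency is simply delay plus size divided by bandwidth, ignores queuing, and treats 25 MB as 25 x 10^6 bytes; the function name `perceived_ms` is illustrative.

```c
#include <assert.h>

/* Perceived latency in milliseconds: link delay plus transmission time. */
double perceived_ms(double delay_ms, double size_bytes, double bandwidth_mbps)
{
    assert(bandwidth_mbps > 0.0);
    double transmit_ms = size_bytes * 8.0 / (bandwidth_mbps * 1.0e6) * 1000.0;
    return delay_ms + transmit_ms;
}
```

For a 1-byte object, `perceived_ms(1, 1, 1)` is about 1 ms while `perceived_ms(100, 1, 100)` is about 100 ms, so the delay decides. For a 25 MB file, `perceived_ms(100, 25e6, 100)` is about 2.1 s while `perceived_ms(1, 25e6, 1)` is about 200 s, so the bandwidth decides.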

16

Bandwidth versus Latency

[Figure: Perceived latency (ms) versus RTT (ms), on logarithmic axes, for 1-byte, 2-KB and 1-MB objects over 1.5-Mbps and 10-Mbps links.]

17

Delay x Bandwidth Product

• Amount of data “in flight” or “in the pipe”

• Example: 100 ms RTT x 45 Mbps BW ≈ 560 KB

• Up to this much data can already be in flight, and must be buffered, before the sender responds to a request to slow down
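The example figure can be reproduced with a one-line calculation. The helper name `delay_bandwidth_bytes` is illustrative; it simply multiplies delay by bandwidth and converts bits to bytes.

```c
#include <assert.h>

/* Amount of data "in the pipe": delay (s) x bandwidth (bits/s), in bytes. */
double delay_bandwidth_bytes(double delay_s, double bandwidth_bps)
{
    assert(delay_s >= 0.0 && bandwidth_bps >= 0.0);
    return delay_s * bandwidth_bps / 8.0;
}
```

For a 100 ms RTT and a 45 Mbps link this gives about 562,500 bytes, which the slide rounds to 560 KB.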

[Figure: The link drawn as a pipe, with bandwidth as its width and delay as its length.]

18

Infinite Bandwidth

• Latency (RTT) dominates instead of throughput

– Throughput = TransferSize / TransferTime

– TransferTime = RTT + (1/Bandwidth) x TransferSize

• 1 MB file over a 1 Mbps network takes around 8 sec

– With RTT of 100 ms, it corresponds to 80 RTTs

– Effective throughput is 1 MB / 8.1 s = 0.987 Mbps

• 1 MB file over a 1 Gbps network takes 100 ms + 8 ms

– Effective throughput is 1 MB / 108 ms = 74.1 Mbps

• A 1-MB file on a 1-Gbps link appears like a 1-KB packet on a 1-Mbps link
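The two throughput figures follow directly from the formulas on this slide. The sketch below uses illustrative function names and, like the slide, treats 1 MB as 8 x 10^6 bits.

```c
#include <assert.h>

/* TransferTime = RTT + (1 / Bandwidth) x TransferSize, in seconds. */
double transfer_time_s(double rtt_s, double size_bits, double bandwidth_bps)
{
    assert(bandwidth_bps > 0.0);
    return rtt_s + size_bits / bandwidth_bps;
}

/* Throughput = TransferSize / TransferTime, in bits per second. */
double effective_throughput_bps(double rtt_s, double size_bits,
                                double bandwidth_bps)
{
    return size_bits / transfer_time_s(rtt_s, size_bits, bandwidth_bps);
}
```

On the 1 Mbps link the 100 ms RTT barely matters (about 0.99 Mbps effective), whereas on the 1 Gbps link the same RTT limits the transfer to about 74 Mbps, far below the link rate.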

19

Bandwidth Requirements

• Request/reply type applications can use as much bandwidth as is available

• Message stream type applications often require a certain bandwidth: not more not less

• Average required bandwidth does not always help to design a network

– Upper bound on a burst (peak-rate transfer)

• Queuing delay introduces variation in latency (jitter)

20

Wrap-up

• Established a comprehensive set of requirements for network design

• Networks evolve to accommodate changes in underlying technologies and user demands

• However, hardware and user expectations are moving targets …

21

Network Architecture

• The challenge is to fill the gap between hardware capabilities and application expectations, and to do so while delivering “good” performance

• Designers cope with this complex task by developing a network architecture as a guideline

– Layering, protocols, standards

22

Layering

• Use abstractions to hide complexity

• Abstractions naturally lead to layering

• Each layer provides some functionality

[Figure: Layer stack, top to bottom: application programs; process-to-process channels; host-to-host connectivity; hardware.]

23

Layering

• Alternative abstractions at each layer

• Manageable network components

• Modify layers independently

[Figure: Layer stack, top to bottom: application programs; request/reply channel alongside message stream channel; host-to-host connectivity; hardware.]

24

Protocols

• Building blocks of a network architecture

• Each protocol object has two different interfaces

– service interface: operations on this protocol

– peer-to-peer interface: messages exchanged with peer

• Term “protocol” is overloaded

– specification of peer-to-peer interface

– module that implements this interface

– peer modules are interoperable if both accurately follow the specifications

25

Protocol Interfaces

[Figure: On Host 1 and on Host 2, a high-level object uses a protocol through the service interface; the two protocol peers talk to each other through the peer-to-peer interface.]

26

Protocol Graph – Network Architecture

• Collection of protocols and their dependencies

– most peer-to-peer communication is indirect

– peer-to-peer is direct only at hardware level

[Figure: Protocol graph on Host 1 and Host 2: file, digital library and video applications sit above RRP and MSP, which both sit above HHP.]

RRP: Request Reply Protocol

MSP: Message Stream Protocol

HHP: Host-to-Host Protocol

1

CS716


By Dr. Amir Qayyum

Lecture No. 3

Protocol Machinery

• Multiplexing and Demultiplexing (demux key)

• Encapsulation (header/body) in peer-to-peer interfaces

– indirect communication (except at hardware level)

– each protocol adds a header

– part of header includes demultiplexing field (e.g., pass up to request/reply or to message stream?)
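Encapsulation and demultiplexing can be sketched in a few lines. Everything here is hypothetical: the one-byte header layout, the key values `DEMUX_RRP` and `DEMUX_MSP`, and the function names are assumptions made for illustration, since the lecture's HHP, RRP and MSP are abstract protocols without a defined wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { DEMUX_RRP = 1, DEMUX_MSP = 2 };  /* hypothetical demux key values */
#define HHP_HDR_LEN 1                   /* assumed: header = one demux byte */

/* Sender side: the lower protocol prepends its header to the body handed
 * down from above. Returns the frame length, or 0 if out is too small. */
size_t hhp_encapsulate(unsigned char demux_key,
                       const unsigned char *body, size_t body_len,
                       unsigned char *out, size_t out_cap)
{
    assert(body != NULL && out != NULL);
    if (out_cap < body_len + HHP_HDR_LEN)
        return 0;
    out[0] = demux_key;                       /* header */
    memcpy(out + HHP_HDR_LEN, body, body_len); /* body, untouched */
    return body_len + HHP_HDR_LEN;
}

/* Receiver side: strip the header and report which higher-level protocol
 * should get the body. Returns the demux key, or -1 for a short frame. */
int hhp_demux(const unsigned char *frame, size_t frame_len,
              const unsigned char **body, size_t *body_len)
{
    assert(frame != NULL && body != NULL && body_len != NULL);
    if (frame_len < HHP_HDR_LEN)
        return -1;
    *body = frame + HHP_HDR_LEN;
    *body_len = frame_len - HHP_HDR_LEN;
    return frame[0];
}
```

The receiver never inspects the body; it only reads the demux key to decide whether to pass the message up to the request/reply or the message stream protocol.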

4

Encapsulation

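[Figure: On Host 1 the application program hands Data to RRP, which prepends an RRP header; HHP then prepends an HHP header, so the message travels as HHP | RRP | Data. On Host 2 the headers are removed in reverse order before the Data reaches the application program.]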

5

Message Transmission Using Layers

6

Standard Architectures

• Open Systems Interconnection (OSI) Architecture

– International Organization for Standardization (ISO)

– International Telecommunication Union (ITU), formerly CCITT

– “X dot” series: X.25, X.400, X.500

– Primarily a reference model

7

OSI Architecture

[Figure: OSI architecture. Each end host implements all seven layers; the one or more nodes within the network implement only the network, data link and physical layers. The figure also marks which layers run at user level and which in the OS kernel.]

Application: application

Presentation: data formatting

Session: connection management

Transport: process-to-process communication channel

Network: host-to-host packet delivery

Data link: framing of data bits

Physical: transmission of raw bits

8

Internet Architecture

• TCP/IP Architecture

– Developed with ARPANET and NSFNET

– Internet Engineering Task Force (IETF)

• Culture: implement, then standardize

• OSI culture: standardize, then implement

– Became popular with release of Berkeley Software Distribution (BSD) Unix; i.e. free software

– Standard suggestions traditionally debated publicly through “Request For Comments” (RFCs)

9


• Implementation and design done together

• Hourglass Design (bottleneck is IP)

• Application vs Application Protocol (FTP, HTTP)

[Figure: Internet protocol graph: FTP, HTTP, NV and TFTP above TCP and UDP, above IP, above the networks NET1, NET2, …, NETn.]

10


• Layering is not very strict

[Figure: Internet layers, top to bottom: Application; TCP and UDP; IP; Network.]

11

Network Models

12

How Layers Fit Together in Practice

13

Networking in the Internet Age

14

Protocol Acronyms

• (T)FTP – (Trivial) File Transfer Protocol

• HTTP – Hypertext Transfer Protocol

• NV – Network Video

• SMTP – Simple Mail Transfer Protocol

• NTP – Network Time Protocol

• TCP – Transmission Control Protocol

• UDP – User Datagram Protocol

• IP – Internet Protocol

• FDDI – Fiber Distributed Data Interface

• ATM – Asynchronous Transfer Mode

15

Elements of a Protocol Implementation

• Outline

– Service Interface

– Process Model

– Common Subroutines

– Example Protocol

16

Network Software

• Major factors for runaway success of the Internet:

– most functionalities provided by software running on general-purpose computers

• new services can be added readily with just a small matter of programming

• Understanding how to implement network software is essential to understand computer networks

17

Network Application Programming Interface (API)

• Interface that the OS provides to its networking subsystem

– most network protocols are implemented in software

– all systems implement network protocols as part of the OS

– each OS is free to define its own network API

– applications can be ported from one OS to another if APIs are similar

• *IF* application program does not interact with other parts of the OS other than the network (file system, fork processes, display …)

18

Protocols and API

• Protocols provide a certain set of services

• API provides a syntax by which those services can be invoked

• Implementation is responsible for mapping API syntax onto protocol services

19

Socket API

• Use sockets as “abstract endpoints” of communication

• Issues

– Creating & identifying sockets

– Sending & receiving data

• Mechanisms

– UNIX system calls and library routines

[Figure: A process attached to a socket.]

20

Socket API

• Creating a socket

int socket(int domain, int type, int protocol)

• domain (family) = AF_INET, PF_UNIX, AF_OSI

• type = SOCK_STREAM, SOCK_DGRAM

• protocol = TCP, UDP, UNSPEC

• return value is a handle for the newly created socket

21

Sockets (cont)

• Passive Open (on server)

int bind(int socket, struct sockaddr *addr, int addr_len)

int listen(int socket, int backlog)

int accept(int socket, struct sockaddr *addr, int *addr_len)

• Active Open (on client)

int connect(int socket, struct sockaddr *addr, int addr_len)

22

Sockets (cont)

• Sending Messages

int send(int socket, char *msg, int mlen, int flags)

• Receiving Messages

int recv(int socket, char *buf, int blen, int flags)

23

Protocol-to-Protocol Interface

• A protocol interacts with a lower level protocol like an application interacts with the underlying network

• Why not use the available network APIs for the PPI ?

– Inefficiencies built into the socket interface

• application programmers tolerate them to simplify their task

– inefficiency at one level

• protocol implementers do not tolerate them

– inefficiencies at several layers of protocols

24

Protocol-to-Protocol Interface Issues

• Configure multiple layers

– static versus extensible

• Process Model

– avoid context switches

• Buffer Model

– avoid data copies

25

Process Model

[Figure: (a) Process-per-Protocol, where adjacent layers interact through inter-process communication; (b) Process-per-Message, where adjacent layers interact through procedure calls.]

26

Process Model (cont.)

[Figure: TCP above IP. TCP calls send(IP, message) to pass a message down; IP calls deliver(TCP, message) to pass a message up.]

int send(Protocol llp, Msg *message)

int deliver(Protocol hlp, Msg *message)

27

Buffer Model

[Figure: A buffer copy occurs between the application process and the topmost protocol on both send() and deliver().]

28

Message Library

• Add header

m = “abcdefg”

bcopy(“xyz”, hdr, 3);

msgAddHdr(m, hdr, 3);

m = “xyzabcdefg”

• Strip header

m = “abcdefg”

hdr = msgStripHdr(m, 3);

m = “defg”, hdr = “abc”

29

Message Library (cont)

• Fragment message

m = “abcdefg”

msgFragment(m, new, 3);

m = “defg”, new = “abc”

• Reassemble messages

m1 = “abcd”, m2 = “efg”

msgReassemble(new, m1, m2)

new = “abcdefg”
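The four message operations can be imitated with a toy buffer that reserves room in front of the data, so that adding or stripping a header only moves an offset and never copies the body. This is not the actual message library behind the slides: the `Msg` layout, the `msgInit` helper and the fixed 64-byte capacity are assumptions made for illustration, and only the operation names come from the slide.

```c
#include <assert.h>
#include <string.h>

#define MSG_CAP 64                  /* assumed fixed capacity */

typedef struct {
    char buf[MSG_CAP];
    int start;                      /* offset of the first valid byte */
    int len;                        /* number of valid bytes */
} Msg;

/* Hypothetical helper: fill a message, leaving half the buffer free in
 * front of the data as room for headers. */
void msgInit(Msg *m, const char *data, int len)
{
    assert(len >= 0 && len <= MSG_CAP / 2);
    m->start = MSG_CAP / 2;
    m->len = len;
    memcpy(m->buf + m->start, data, (size_t)len);
}

/* Prepend len bytes of hdr; the body is not moved. */
void msgAddHdr(Msg *m, const char *hdr, int len)
{
    assert(len >= 0 && len <= m->start);
    m->start -= len;
    m->len += len;
    memcpy(m->buf + m->start, hdr, (size_t)len);
}

/* Remove the first len bytes and return a pointer to them. */
char *msgStripHdr(Msg *m, int len)
{
    assert(len >= 0 && len <= m->len);
    char *hdr = m->buf + m->start;
    m->start += len;
    m->len -= len;
    return hdr;
}

/* Move the first len bytes of m into frag; m keeps the remainder. */
void msgFragment(Msg *m, Msg *frag, int len)
{
    assert(len >= 0 && len <= m->len);
    msgInit(frag, m->buf + m->start, len);
    m->start += len;
    m->len -= len;
}

/* whole = m1 followed by m2. */
void msgReassemble(Msg *whole, const Msg *m1, const Msg *m2)
{
    assert(m1->len + m2->len <= MSG_CAP / 2);
    msgInit(whole, m1->buf + m1->start, m1->len);
    memcpy(whole->buf + whole->start + whole->len,
           m2->buf + m2->start, (size_t)m2->len);
    whole->len += m2->len;
}
```

Keeping header room in front of the data is one way to meet the buffer model's goal of avoiding data copies as a message moves down the protocol stack.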

30

Network Programming

• Things to learn

– Internet protocols (IP, TCP, UDP, …)

– Sockets API (Application Programming Interface)

• Why IP and sockets

– IP (Internet Protocol) is standard

• allows a common name space across most of Internet

• reduces number of translations, which incur overhead

– Sockets: reasonably simple and elegant Unix interface (most servers run Unix)

31

OSI Model: Internet Protocols

application layers: FTP, HTTP, NV, TFTP

transport: TCP, UDP

network: IP

data link, physical: FDDI, Ethernet, ATM, …

32

• learn to use Internet for communication (with focus on implementation of networking concepts)

• learn to build network from ground up

[Figure: The physical, data link, network and transport layers, annotated with the two course tracks: programming, and principles and concepts.]

33

Socket Programming

• Reading: Stevens 2nd edition, Chapters 1-6

• Sockets API: a transport layer service interface

– introduced with 4.2BSD in 1983

– implemented as library and/or system calls

– similar interfaces to TCP and UDP

– can also serve as interface to IP (for super-user); known as “raw sockets”

– Linux also provides interface to MAC layer (for super-user); known as “data-link sockets”

34

Outline

• Client-server model

• TCP connections

• UDP services

• Addresses and data

• Sockets API

• Example of usage

35

Client-Server Model

• Asymmetric relationship

• Server/daemon

– well-known name

– waits for contact

– processes requests, sends replies

• Client

– initiates contact

– waits for response

[Figure: One server contacted by several clients.]

36

Client-Server Model

• Bidirectional communication channel

• Service models

– sequential: server processes only one client’s requests at a time

– concurrent: server processes multiple clients’ requests simultaneously

– hybrid: server maintains multiple connections, but processes requests sequentially

• Server and client categories not disjoint

– server can be client of another server

– server as client of its own client (peer-to-peer architecture)

37

Review Lecture 3

• Protocol Machinery - Encapsulation

• Standard architectures: OSI, Internet

• Network models and layering

• Elements of protocol implementation

• Network software

• Network API – Socket API – PPI

• Process and buffer model

• Programming – Client server model

1

CS716


By Dr. Amir Qayyum

Lecture No. 4

3

TCP Connections

• Transmission Control Protocol, at OSI transport layer

• Recall: each protocol provides service interface

4

Aspects of TCP Service

• Transfers a stream of bytes (interpreted by application)

• Connection-oriented

– set up connection before communicating

– tear down connection when done

• In-order delivery of data: if A sends M1 followed by M2 to B, B never receives M2 before M1

5


• Reliable

– data delivered at most once

– exactly once if no catastrophic failures

• Flow control

– prevents senders from wasting bandwidth

– reduces global congestion problems

6


• Full-duplex: send or receive data at any time

• 16-bit port space allows multiple connections on a single host

7

TCP Connections

• TCP connection setup via 3-way handshake

– J and K are sequence numbers for messages

client → server: SYN J

server → client: SYN K, ACK J+1

client → server: ACK K+1

Hmmm … RTT is important!

8

TCP Connections

• TCP connection teardown (4 steps) (either client or server can initiate connection teardown)

client → server: FIN J (client: active close)

server → client: ACK J+1 (server: passive close)

server → client: FIN K (server closes connection)

client → server: ACK K+1

Hmmm … Latency matters!

9

UDP Services

• User Datagram Protocol, at OSI transport layer

• Thin layer over IP

10

UDP - Aspects of Services

• Unit of transfer is a datagram (variable length packet)

• Unreliable, drops packets silently

• No ordering guarantees

• No flow control

• 16-bit port space (distinct from TCP ports) allows multiple recipients on a single host

11

Addresses and Data

• Internet domain names: human readable

– mnemonic

– variable length

• e.g., mail.yahoo.com, www.vu.edu.pk (FQDN)

• IP addresses: easily handled by routers/computers

– fixed length

– tied (loosely) to geography

• e.g., 128.93.0.4 or 212.0.0.1

12

Endianness

• Machines on Internet have different endianness

• Little-endian (Intel, DEC): least significant byte of word stored in lowest memory address

• Big-endian (Sun, SGI, HP): most significant byte...

13

Endianness

• Network byte order is big-endian

• Use of network byte order

– imperative for some data (e.g., IP addresses)

– good form for all binary data (e.g., application-specific)

– ASCII/Unicode are acceptable alternatives

14

Endianness

• 16/32 bit conversion (for platform independence)

int m, n;        // int32
short int s, t;  // int16

m = ntohl(n);    // net-to-host long (32-bit) translation
s = ntohs(t);    // net-to-host short (16-bit) translation
n = htonl(m);    // host-to-net long (32-bit) translation
t = htons(s);    // host-to-net short (16-bit) translation
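As a usage sketch for these conversions, the helpers below write a 16-bit and a 32-bit value into a byte buffer in network byte order and read them back. The function names and the six-byte layout are made up for illustration; `htons`, `htonl`, `ntohs` and `ntohl` are the standard calls from `<arpa/inet.h>`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical wire layout: 2-byte port followed by 4-byte value, both in
 * network (big-endian) byte order. Returns the number of bytes written. */
size_t pack_port_and_value(uint16_t port, uint32_t value, unsigned char *out)
{
    assert(out != NULL);
    uint16_t net_port = htons(port);     /* host-to-net, 16 bits */
    uint32_t net_value = htonl(value);   /* host-to-net, 32 bits */
    memcpy(out, &net_port, sizeof net_port);
    memcpy(out + sizeof net_port, &net_value, sizeof net_value);
    return sizeof net_port + sizeof net_value;
}

/* Reverse of pack_port_and_value: convert back to host byte order. */
void unpack_port_and_value(const unsigned char *in,
                           uint16_t *port, uint32_t *value)
{
    uint16_t net_port;
    uint32_t net_value;
    assert(in != NULL && port != NULL && value != NULL);
    memcpy(&net_port, in, sizeof net_port);
    memcpy(&net_value, in + sizeof net_port, sizeof net_value);
    *port = ntohs(net_port);
    *value = ntohl(net_value);
}
```

Whatever the host's endianness, the most significant byte ends up first in the buffer, so a little-endian and a big-endian machine see the same bytes on the wire.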

15

Socket Address Structures

• Socket address structures (all fields in network byte order except sin_family)

IP address

struct in_addr {
    in_addr_t s_addr;         /* 32-bit IP address */
};

TCP or UDP address

struct sockaddr_in {
    short sin_family;         /* e.g., AF_INET */
    ushort sin_port;          /* TCP / UDP port */
    struct in_addr sin_addr;  /* IP address */
};

16

Address Conversion

• All binary values used and returned by these functions are network byte ordered

struct hostent* gethostbyname (const char* hostname);

translates English host name to IP address (uses DNS)

struct hostent* gethostbyaddr (const char* addr, size_t len, int family);

translates IP address to English host name (not secure)

int gethostname (char* name, size_t namelen);

reads host’s name (use with gethostbyname to find local IP)

17

Address Conversion

in_addr_t inet_addr (const char* strptr);

translate dotted-decimal notation to IP address; returns –1 on failure, thus cannot handle broadcast value “255.255.255.255”

int inet_aton (const char* strptr, struct in_addr* inaddr);

translate dotted-decimal notation to IP address; returns 1 on success, 0 on failure

char* inet_ntoa (struct in_addr inaddr);

translate IP address to ASCII dotted-decimal notation (e.g., “128.32.36.37”); not thread-safe
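A short sketch of these calls in use: it parses a dotted-decimal string with `inet_aton` and formats it back with `inet_ntoa`. The wrapper name `roundtrip_dotted_decimal` is illustrative, and the `_DEFAULT_SOURCE` line is an assumption about glibc, where it exposes `inet_aton` under strict compiler modes.

```c
#define _DEFAULT_SOURCE             /* glibc: make inet_aton visible */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Parse dotted-decimal text into a binary address, then format it back.
 * Returns 1 on success, 0 if text is not a valid address. */
int roundtrip_dotted_decimal(const char *text, char *out, size_t out_len)
{
    struct in_addr addr;

    assert(text != NULL && out != NULL && out_len > 0);
    if (inet_aton(text, &addr) == 0)    /* 0 means "could not parse" */
        return 0;
    /* inet_ntoa returns a pointer to a static buffer (hence not
     * thread-safe), so copy the result out immediately. */
    strncpy(out, inet_ntoa(addr), out_len - 1);
    out[out_len - 1] = '\0';
    return 1;
}
```

Unlike `inet_addr`, whose error value has the same bit pattern as 255.255.255.255, `inet_aton` reports failure through its return value and so parses the broadcast address without ambiguity.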

18

Sockets API

• Basic Unix concepts

• Creation and setup

• Establishing a connection (TCP only)

• Sending and receiving data

• Tearing down a connection (TCP only)

• Advanced sockets

19

Basic UNIX Concepts – I/O

• Per-process table of I/O channels

• Table entries can describe files, sockets, devices, pipes, etc.

• Unifies I/O interface

• Table entry/index into table called “file descriptor”

20

Basic UNIX Concepts

• Error model

• “standardization” of return value

– 0 on success, -1 on failure

– NULL on failure for routines returning pointers

• errno variable

21

Client-Server Connection

client: my-machine

server: mail.yahoo.com

server: “I am mail.yahoo.com, port b”

server: “I accept connections”

client: “Talk to mail.yahoo.com, port b”

server: “I will talk to my-machine, port a”

Resulting TCP connection identified by (my-machine:a, mail.yahoo.com:b)

22

Client-Server Connection

client: my-machine; server: mail.yahoo.com

server: 1. socket()

server: 2. bind(): “I am mail.yahoo.com, port b”

server: 3. listen(): “I accept connections”

client: 4. socket()

client: 5. connect(): “Talk to mail.yahoo.com, port b”

server: 6. accept(): “I will talk to my-machine, port a”

both: 7. send() / sendto()

both: 8. recv() / recvfrom()

both: 9. close() / shutdown()

23

Socket Creation and Setup

• int socket (int family, int type, int protocol);

Create a socket. Returns file descriptor or -1.

• int bind (int sockfd, struct sockaddr* myaddr, int addrlen);

Bind a socket to a local IP address and port number.

• int listen (int sockfd, int backlog);

Put socket into passive state (wait for connections rather than initiate a connection).

24

Creating Sockets - socket()

int socket (int family, int type, int protocol);

Create a socket. Returns file descriptor or -1. Also sets errno on failure.

family: address family (namespace) or protocol family

– AF_INET for IPv4

– other possibilities: AF_INET6 (IPv6), AF_UNIX or AF_LOCAL (Unix socket), AF_OSI, AF_ROUTE (routing)

type: style of communication

– SOCK_STREAM for TCP (with AF_INET)

– SOCK_DGRAM for UDP (with AF_INET)

protocol: protocol within family

– Usually already defined by domain & type, typically 0 (default)

25

Naming and Identifying Sockets - bind()

int bind (int sockfd, struct sockaddr* myaddr, int addrlen);

Bind a socket to a local IP address and port number. Returns 0 on success, -1 and sets errno on failure.

sockfd: socket file descriptor (returned from socket)

myaddr: includes IP address and port number

– IP address: set by kernel if value passed is INADDR_ANY, else set by caller

– port number: set by kernel if value passed is 0, else set by caller

addrlen: length of address structure = sizeof (struct sockaddr_in)
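The socket() and bind() steps can be tried out with the sketch below. It passes port 0 so that the kernel picks the port, then reads the choice back with `getsockname`. The function name `bind_ephemeral_tcp` is illustrative, and the loopback address is used instead of INADDR_ANY only to keep the example local to one machine.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Create a TCP socket bound to the loopback address with a kernel-chosen
 * port. Returns the socket descriptor (caller closes it) or -1 on failure;
 * on success *port_out holds the assigned port in host byte order. */
int bind_ephemeral_tcp(unsigned short *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd;

    assert(port_out != NULL);
    fd = socket(AF_INET, SOCK_STREAM, 0);   /* protocol 0 = default (TCP) */
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                /* 0 = let the kernel choose */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);        /* stored in network order */
    return fd;
}
```

A server with a well-known port would pass that port number to `htons` instead of 0.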

26

TCP and UDP Port Namespaces

• Allocated and assigned by the Internet Assigned Numbers Authority (IANA)

– see RFC 1700

– ftp://ftp.isi.edu/in-notes/iana/assignments/port-numbers

• 1-512 standard services (see /etc/services); super-user only

• 513-1023 registered and controlled, also used for identity verification; super-user only

• 1024-49151 registered services/ephemeral ports

• 49152-65535 private/ephemeral ports

27

Waiting for Connections - listen()

int listen (int sockfd, int backlog);

Put socket into passive state (wait for connections rather than initiate a connection). Returns 0 on success, -1 and sets errno on failure.

sockfd : socket file descriptor (returned from socket)
backlog : bound on length of un-accept()ed connection queue (connection backlog); kernel will cap, thus better to set high
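The three setup calls, socket(), bind() and listen(), can be chained as in the sketch below. The helper names make_listener and bound_port are hypothetical, and binding to the loopback address is only an assumption for a local experiment; a real server would usually pass INADDR_ANY.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket, bind it to the loopback
 * address, and put it into the passive (listening) state.
 * Passing port 0 asks the kernel to choose a free port.
 * Returns the listening descriptor, or -1 on failure. */
int make_listener(uint16_t port, int backlog)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* or INADDR_ANY */
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Report the port the kernel actually assigned (useful after port 0). */
int bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

A caller would then loop on accept() with the returned descriptor.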

28

Contact the Peer - connect()

int connect (int sockfd, struct sockaddr* servaddr, int addrlen);

Connect to another socket. Returns 0 on success, -1 and sets errno on failure.

sockfd : socket file descriptor (returned from socket)
servaddr : IP address and port number of server
addrlen : length of address structure = sizeof (struct sockaddr_in)

Can use with UDP to restrict incoming datagrams and to obtain asynchronous errors

29

Welcome a Connection - accept()

int accept (int sockfd, struct sockaddr* cliaddr, int* addrlen);

Accept a new connection (first one of the queue of pending connections). Returns file descriptor or -1. Also sets errno.

sockfd : socket file descriptor (returned from socket)
cliaddr : IP address and port number of client (returned from call)
addrlen : length of address structure = pointer to int set to sizeof (struct sockaddr_in)
– addrlen is a value-result argument: the caller passes the size of the address structure, the kernel returns the size of the client’s address (the number of bytes written)

30

Sending and Receiving data

31

Send the Data - write()

int write (int sockfd, char* buf, size_t nbytes);

Write data to a stream (TCP) or “connected” datagram (UDP) socket. Returns number of bytes written or -1. Also sets errno on failure.

sockfd : socket file descriptor (returned from socket)
buf : data buffer
nbytes : number of bytes to try to write

• some reasons for failure or partial writes:
– process received interrupt or signal
– kernel resources unavailable (e.g., buffers)

int send (int sockfd, char* buf, size_t nbytes , int flags);
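Because write() may return after sending only part of the buffer, callers usually wrap it in a loop. The sketch below is one common pattern; write_all is a hypothetical helper name, not part of the sockets API.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all nbytes are sent.
 * Returns nbytes on success, or -1 if write() fails with a real error. */
ssize_t write_all(int fd, const char *buf, size_t nbytes)
{
    size_t sent = 0;
    while (sent < nbytes) {
        ssize_t n = write(fd, buf + sent, nbytes - sent);
        if (n < 0) {
            if (errno == EINTR)   /* interrupted by a signal: retry */
                continue;
            return -1;            /* any other error: give up */
        }
        sent += (size_t)n;        /* partial write: advance and loop */
    }
    return (ssize_t)sent;
}
```

The same loop structure applies to read() when a fixed number of bytes is expected.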

32

Receive the Data - read()

int read (int sockfd, char* buf, size_t nbytes);

Read data from a stream (TCP) or “connected” datagram (UDP) socket. Returns number of bytes read or -1. Also sets errno on failure. Returns 0 if socket closed.

sockfd : socket file descriptor (returned from socket)
buf : data buffer
nbytes : number of bytes to try to read

int recv (int sockfd, char* buf, size_t nbytes , int flags);

33

Send Data to Someone - sendto()

int sendto (int sockfd, char* buf, size_t nbytes, int flags, struct sockaddr* destaddr, int addrlen);

Send a datagram to another UDP socket. Returns number of bytes written or -1. Also sets errno on failure.

sockfd : socket file descriptor (returned from socket)
buf : data buffer
nbytes : number of bytes to try to send
flags : see man page for details; typically use 0
destaddr : IP address and port number of destination socket
addrlen : length of address structure = sizeof (struct sockaddr_in)

34

Receive Data from Someone - recvfrom()

int recvfrom (int sockfd, char* buf, size_t nbytes, int flags, struct sockaddr* srcaddr, int* addrlen);

Read a datagram from a UDP socket. Returns number of bytes read (0 is valid) or -1. Also sets errno on failure.

sockfd : socket file descriptor (returned from socket)
buf : data buffer
nbytes : number of bytes to try to read
flags : see man page for details; typically use 0
srcaddr : IP address and port number of sending socket (returned from call)
addrlen : length of address structure = pointer to int set to sizeof (struct sockaddr_in)

35

Tearing Down a Connection

36

Good Bye - close()

int close (int sockfd);

Closes a socket and deletes descriptor from system tables. Returns 0 on success, -1 and sets errno on failure.

sockfd : socket file descriptor (returned from socket )

• Closes communication on socket in both directions. All data sent before close are delivered to other side (although this aspect can be overridden).

• After close() , sockfd is not valid for reading or writing.

37

Close in My Way - shutdown()

int shutdown (int sockfd, int howto);

Force termination of communication across a socket in one or both directions. Returns 0 on success, -1 and sets errno on failure.

sockfd : socket file descriptor (returned from socket)
howto :
– SHUT_RD to stop reading
– SHUT_WR to stop writing
– SHUT_RDWR to stop both

• shutdown() overrides the usual rules regarding duplicated sockets, in which TCP teardown does not occur until all copies have closed the socket.

38

Advanced Sockets

• Managing multiple connections
– fork()/exec(): multiple server processes
– pthread_create(): multi-threaded server process
– (no calls): event-based server process

• Detecting data arrival
– select() and poll() functions

• Synchronous vs. asynchronous connections

• Other socket options

39

Example of Use

• Taken from Beej’s Guide to Network Programming (see the course web page)

• Client-server example using TCP

• For each client
– Server forks new process to handle connection
– Sends “Hello, world”

40

TCP Connection

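[Figure: TCP connection establishment (three-way handshake) between client and server. Server calls: socket, bind, listen, accept. Client calls: socket, connect. Messages: SYN J from the client (connection added to incomplete queue); SYN K, ACK J+1 from the server (connect completes); ACK K+1 from the client (connection moved to complete queue).]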

41

TCP Connection

[Figure: TCP call sequence. Server: socket, bind, listen, accept, read, write, close. Client: socket, connect, write, read, close.]

42

UDP Connection

[Figure: UDP call sequence. Server: socket, bind, recvfrom, sendto. Client: socket, sendto, recvfrom, close.]

43

Food for Thought

Framing messages on a byte stream … ?

• Problem
– pass logical messages using a TCP connection
– read() may return partial or multiple messages
– how can receiver identify the end of a message?

• Try to come up with two or three methods

• Hints
– string storage in C and Pascal
– format strings with printf()

1

CS716


By Dr. Amir Qayyum

Lecture No. 5

3

The Big Picture

You are

here

Midterm exam

(estimated)

4

What We Know

• Networks are
– Experiencing explosive growth
– Providing wide range of services

• It is attributed to:
– General purpose nature of computer networks
– Ability to add new functionality with software
– High performance computers are now affordable

5

and We Know …

• Connecting mainframes over long-distance telephone lines has turned into a big business!

• Lots of competing players– Computing industry

– Telephone carriers

– Service providers, operators, …

• Global, ubiquitous, heterogeneous networking ?– Issues of connectivity, service levels, performance,

…

6

What We Have Learned

• Carefully identify what we expect from a network

• Cost-effective connectivity– Accomplished through nested interconnection

of nodes and links

– Provides process-to-process communication services

– Should offer high performance using the metrics like latency and throughput

• This results in a packet-switched network

7

What is Our Approach

• A layered architecture as a guideline for design

• Protocols are central objects– Provides services to higher-level protocols

– Make a message exchange meaningful with peers

• Implement protocols in software– Define interfaces to invoke services

– Socket interface between applications and protocols

– “Similar” interface within the network subsystem

8

What Next ?

Start with the simplest possible network

Two nodes connected directly through some suitable medium

9

Point-to-Point Links

Reading: Peterson and Davie, Ch. 2

Outline
• Hardware building blocks
• Encoding
• Framing
• Error Detection
• Reliable transmission
– Sliding Window Algorithm

10

Direct Link Issues in the OSI and Hardware/Software Contexts

[Figure: OSI layers (application, presentation, session, transport, network, data link, physical) set against their implementation: user-level software, kernel software (device drivers), and hardware (network adapter). Direct link issues shown: reliability; framing, error detection, MAC; encoding.]

11

Hardware Building Blocks

• Nodes
– Hosts: general-purpose computers
– Switches: typically special-purpose hardware
– Routers (connecting networks): varies

• Links
– Copper wire with electronic signaling
– Glass fiber with optical signaling
– Wireless with electromagnetic (radio, infrared, microwave) signaling

12

Nodes – A Workstation Architecture

[Figure: workstation architecture: CPU (processor), cache, memory, memory bus, I/O bus, and network adaptor (to network). Annotations: finite memory (implies limited buffer space); device driver managing network adaptor, which is using system’s I/O bus; memory access much slower than CPU speed.]

13

Links

• Physical media– twisted pair cable

– coaxial cable

– optical fiber

– space

• Media is used to propagate signals

• Signals are electromagnetic waves of certain frequency, traveling at speed of light

14

Electromagnetic Spectrum

[Figure: electromagnetic spectrum on a frequency axis f (Hz), running from radio through microwave, infrared, UV and X ray to gamma ray, with the bands used by coax, AM, FM, TV, terrestrial microwave, satellite, and fiber optics marked.]

Wavelength = speed / frequency = (2 x 10^8 m/s) / (300 Hz) ≈ 667 x 10^3 meters (about 667 km)

15

Signals Over a Link

• Signal is modulated for transmission
– varying frequency/amplitude/phase to receive distinguishable signals

• Binary data (0s and 1s) is encoded in a signal
– make it understandable by the receiving host

16

Bits Over a Link

• Bit streams may be transmitted both ways at a time on a point-to-point link– full-duplex

• Sometimes two nodes must alternate link usage– half duplex

17

Which Link to Use ?

• Cables– same room / building / site

Cable                Typical Bandwidths   Distances
Cat-5 twisted pair   10-100 Mbps          100 m
Thin-net coax        10-100 Mbps          200 m
Thick-net coax       10-100 Mbps          500 m
Multimode fiber      100 Mbps             2 km
Single-mode fiber    100-2400 Mbps        40 km

[Figure: cable cross-sections. Coax: copper core, insulation, braided conductor. Twisted pair. Fiber: glass core, glass cladding, plastic jacket.]

18

Leased Lines

• Across city / country

• Dedicated link from the telephone company

• Appears, but may not be a single link !!!

Service   Bandwidth (bps)
DS1/T1    1.5M
DS3       44.7M
STS-1     51.8M
STS-3     155M
STS-12    622M
...
STS-48    2.5G

19

Last-mile Links

• Most economical

• Home to network service provider

• To take benefit of an existing network

Service   Bandwidth (bps)
POTS      28.8 - 56 K
ISDN      64 - 128 K
xDSL      16 K - 55.2 M
CATV      20 - 40 M

20

ADSL(Asymmetric Digital Subscriber Line)

• Connects the subscriber to the central office via the local loop

• Bandwidth depends on length of local loop

[Figure: ADSL between the central office and the subscriber premises over a local loop of 2.74 – 5.48 km: 1.554 – 8.448 Mbps downstream, 16 – 640 Kbps upstream.]

21

VDSL(Very high data rate DSL)

• Connects the subscriber to the optical network that reaches the neighborhood

• Runs over short distances

• Symmetric

[Figure: central office connected by STS-N over fiber to the neighborhood optical network unit, which reaches the subscriber premises by VDSL at 12.96 – 55.2 Mbps over 1000 – 4500 feet of copper.]

22

CATV

• Uses existing cable TV (CATV) infrastructure– reaches 95% of households in U.S.

• Single CATV channel has bandwidth of 6 MHz

• Can be used in asymmetric way

• Currently achieves on a single channel:– 40 Mbps downstream (100 Mbps theoretical capacity)

– 20 Mbps upstream

• Multiple access on shared channel (IEEE 802.14)

23

Optical Communication

• Higher bandwidths

• Superior attenuation properties

• Immune from electromagnetic interference

• No cross-talk between fibers

• Thin, lightweight and cheap (the fiber, not the optical-electrical interfaces)

24

Wireless Links

• Satellite links

• Provide a grid of medium and low orbit satellites
– Geosynchronous satellite 600-1000 Mbps
– Low Earth Orbit (LEO) array ~400 Mbps

• Targeted at voice communication modems

• Teledesic supports 1440 16 kbps satellite-to-earth channels (~2 Mbps); 155.5 Mbps intersatellite channels

25

Wireless Links

• Radio and infra-red frequency links

• 11 Mbps rates, 2.4 GHz band, distances of 50-150 meters
– 5.2 GHz band, > 55 Mbps: HIPERLAN-1, IEEE 802.11a

• Bluetooth piconets: radio links (2.4 GHz band), 1 Mbps, 10 meters

26

Encoding

27


• Reading: Peterson and Davie, Ch. 2

• Hardware building blocks• Encoding• Framing• Error Detection• Reliable transmission

– Sliding Window Algorithm

28

Encoding

• Signals propagate over a physical medium– modulate electromagnetic waves

– e.g., vary voltage

• Encode binary data onto signals that propagate

Signalling component

Signal

Bits

Node NodeAdaptor Adaptor

29

Encoding

• Problems with signal transmission– Attenuation: signal power absorbed by medium

– Dispersion: a discrete signal spreads in space

– Noise: random background “signals”

modulator demodulatora string

of signals

Digital data (a string of symbols)

Digital data (a string of symbols)

30

Advantages of Digital Transmission over Analog

• Reasonably low-error rates over arbitrary distances– Calculate/measure effects of transmission

problems

– Periodically interpret and regenerate signal

• Simpler for multiplexing distinct data types (audio, video, e-mail, etc.)

31

Advantages of Digital Transmission over Analog

• Examples of modulators-demodulators (modems)

• Electronic Industries Association (EIA) standard RS-232(-C)

• International Telecommunications Union (ITU) standard V.32 9.6 kbps modem

32

RS-232(-C)

• Communication between computer and modem

• Uses two voltage levels (+15V, -15V), a binary voltage encoding

• Data rate limited to 19.2 kbps (RS-232-C); raised in later standards

33

RS-232(-C)

• Characteristics

• Serial: one signaling wire, one bit at a time

• Asynchronous: line can be idle, clock generated from data

• Character-based: send data in 7- or 8-bit characters

34

RS-232 Timing Diagram

+15

-15

volt

age

Idle start 1 0 0 1 1 0 0 stop idle

time

35

RS-232

• One bit per clock

• Voltage never returns to 0V (0V is a dead / disconnected line)

• -15V is both idle and “1”; the sender initiates transmission by pushing to +15V for one clock (start bit)

36

RS-232

• Minimum delay between character transmissions: idle for one clock at –15V (stop bit)

• One character leads to 2+ voltage transitions

• Total of 9 bits for 7 bits of data (78% efficient)

• Start and stop bits also provide framing

37

Binary Voltage Encoding

• NRZ (non-return to zero)

• NRZI (NRZ inverted)

• Manchester (used by IEEE 802.3, 10 Mbps Ethernet)

• 4B/5B (8B/10B) in Fast Ethernet

38

Non-Return to Zero (NRZ)

• Encode binary data onto signals
– e.g., 0 as low signal and 1 as high signal
– voltage does not return to zero between bits

• known as Non-Return to Zero (NRZ)

Bits

NRZ

0 0 1 0 1 1 1 1 0 1 0 0 0 0 1 0

39

Problem: Consecutive 1s or 0s

• Low signal (0) may be interpreted as no signal
• High signal (1) leads to baseline wander
• Unable to recover clock
– sender’s and receiver’s clocks have to be precisely synchronized
– receiver resynchronizes on each signal transition
– clock drift in long periods without transition

sender’s clock

receiver’s clock

40

Alternative Encodings

• Non-Return to Zero Inverted (NRZI)

• Make a transition from current signal (switch voltage level) to encode/transmit a “one”

• Stay at current signal (maintain voltage level) to encode/transmit a “zero”

• Solves the problem of consecutive ones, but not of consecutive zeros (the problem shifts to 0s)

41

Alternative Encodings

• Manchester (in IEEE 802.3 – 10 Mbps Ethernet)

• Split cycle into two parts
– Send high--low for “1”, low--high for “0”
– Transmit XOR of NRZ encoded data and the clock

• Only 50% efficient (1/2 bit per transition)

42

Different Encoding Schemes

Bits

NRZ

Clock

Manchester

NRZI

0 0 1 0 1 1 1 1 0 1 0 0 0 0 1 0
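The three encodings compared in this figure can be reproduced with the sketch below. It represents signal levels as the characters 'H' and 'L' and assumes the NRZI line starts low and the clock goes low then high within each bit; both are assumptions made for illustration.

```c
#include <string.h>

/* Each function reads a string of '0'/'1' bits and writes signal levels
 * ('H' = high, 'L' = low) into out, which the caller must size large
 * enough (strlen(bits)+1 for NRZ/NRZI, 2*strlen(bits)+1 for Manchester). */

void nrz(const char *bits, char *out)
{
    size_t i;
    for (i = 0; bits[i]; i++)
        out[i] = (bits[i] == '1') ? 'H' : 'L';
    out[i] = '\0';
}

void nrzi(const char *bits, char *out)
{
    char level = 'L';            /* assumed initial line level */
    size_t i;
    for (i = 0; bits[i]; i++) {
        if (bits[i] == '1')      /* a 1 toggles the level */
            level = (level == 'L') ? 'H' : 'L';
        out[i] = level;          /* a 0 keeps it */
    }
    out[i] = '\0';
}

void manchester(const char *bits, char *out)
{
    size_t i;
    for (i = 0; bits[i]; i++) {
        int b = (bits[i] == '1');
        /* NRZ value XOR clock, with the clock low then high in each bit */
        out[2 * i]     = (b ^ 0) ? 'H' : 'L';
        out[2 * i + 1] = (b ^ 1) ? 'H' : 'L';
    }
    out[2 * i] = '\0';
}
```

Manchester emits two half-bit levels per data bit, which is where its 50% efficiency comes from.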

43

4B/5B Encoding

• Every 4 consecutive bits of data encoded in a 5-bit code (symbol)
– 4-bit pattern is “translated” to a 5-bit pattern (not addition)

• 5-bit codes selected to have no more than one leading 0 and no more than two trailing 0s
– 00xxx (8 symbols) and xx000 (4 symbols) are illegal
– 5 free symbols (non-data)

• Thus, never gets more than three consecutive 0s
• Resulting 5-bit codes are transmitted using NRZI
• Achieves 80% efficiency
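A table-driven sketch of the 4B/5B translation follows. The table is the standard set of 4B/5B data symbols; the function names encode_byte and max_zero_run are hypothetical, and sending the high nibble first is an assumption.

```c
#include <stdint.h>

/* 4B/5B data symbols, indexed by the 4-bit data value. */
static const uint8_t code_4b5b[16] = {
    0x1E, /* 0000 -> 11110 */ 0x09, /* 0001 -> 01001 */
    0x14, /* 0010 -> 10100 */ 0x15, /* 0011 -> 10101 */
    0x0A, /* 0100 -> 01010 */ 0x0B, /* 0101 -> 01011 */
    0x0E, /* 0110 -> 01110 */ 0x0F, /* 0111 -> 01111 */
    0x12, /* 1000 -> 10010 */ 0x13, /* 1001 -> 10011 */
    0x16, /* 1010 -> 10110 */ 0x17, /* 1011 -> 10111 */
    0x1A, /* 1100 -> 11010 */ 0x1B, /* 1101 -> 11011 */
    0x1C, /* 1110 -> 11100 */ 0x1D  /* 1111 -> 11101 */
};

/* Encode one byte as two 5-bit symbols (10 bits), high nibble first. */
uint16_t encode_byte(uint8_t b)
{
    return (uint16_t)((code_4b5b[b >> 4] << 5) | code_4b5b[b & 0x0F]);
}

/* Longest run of consecutive 0s in the low nbits of bits. */
int max_zero_run(uint32_t bits, int nbits)
{
    int best = 0, run = 0;
    for (int i = nbits - 1; i >= 0; i--) {
        if ((bits >> i) & 1u)
            run = 0;
        else if (++run > best)
            best = run;
    }
    return best;
}
```

Checking max_zero_run over every byte value confirms the "never more than three consecutive 0s" property for this table.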

44

Binary Voltage Encoding

• Problem: wide frequency range required, implying
– Significant dispersion
– Uneven attenuation

• Prefer to use narrow frequency band (carrier frequency)

• Types of modulation
– Amplitude (AM)
– Frequency (FM)
– Phase / phase shift
– Combination of these (e.g. QAM)

45

Amplitude Modulation

idle idle 1 idle idle 0 idle idle

time

46

Frequency Modulation

idle idle 1 idle idle 0 idle

time

47

Phase Modulation

idle idle 1 idle idle 0 idle idle

time

48

Phase Shift in Carrier Frequency

108 degrees difference in phasecollapse for 108 degrees shift

49

Review Lecture 5

• Simplest possible network – 2 nodes connected directly

• Building blocks – nodes and links
• Nodes – workstation architecture
• Links – several types, optical, wireless
• Encoding – binary data into signals, RS 232
• Binary voltage encoding – NRZ, NRZI, Manchester, 4B/5B
• Modulation schemes

1

CS716


By Dr. Amir Qayyum

Lecture No. 6

3

ITU’s V.32 - 9.6 kbps

• Communication between modems

• Analog phone line

• Uses a combination of amplitude and phase modulation
– known as Quadrature Amplitude Modulation (QAM)

• Sends one of 16 signals each clock cycle
– transmits at 2400 baud, i.e., 2,400 symbols per second

4

Constellation Pattern for V.32 QAM

For a given symbol:
1. perform phase shift
2. change to new amplitude

[Figure: constellation diagram, with angles of 45° and 15° marked.]

• Points in constellation diagram
– chosen to maximize error detection
– process called trellis coding

5

Quadrature Amplitude Modulation

• Same algorithm as phase modulation

• Can also change signal amplitude

• 2-dimensional representation
– angle is phase shift
– radial distance is new amplitude

• Each symbol contains log2 16 = 4 bits
– data rate is thus 4 x 2400 = 9600 bps

[Figure: 16-symbol example (V.32), with angles of 45° and 15° marked.]

6

Generalizing the Examples

• What limits baud rate?

• What data rate can a channel sustain?

• How is data rate related to bandwidth?

• How does noise affect these bounds?

• What else can limit maximum data rate?

7

Bit Rate and Baud Rate

• Bit rate is bits per second

• Baud rate is “symbols” per second

• If each symbol contains 4 bits then data rate is 4 times the baud rate
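The relation between the two rates is simple arithmetic, sketched below with hypothetical function names and the assumption that the number of distinct symbols is a power of 2.

```c
/* Number of bits carried by one symbol when the modulation
 * distinguishes n_symbols signals (n_symbols assumed a power of 2). */
unsigned bits_per_symbol(unsigned n_symbols)
{
    unsigned bits = 0;
    while (n_symbols > 1) {
        n_symbols >>= 1;
        bits++;
    }
    return bits;
}

/* Bit rate (bps) = baud rate (symbols/s) x bits per symbol. */
unsigned long bit_rate(unsigned long baud, unsigned n_symbols)
{
    return baud * bits_per_symbol(n_symbols);
}
```

For the V.32 figures quoted earlier, bit_rate(2400, 16) gives 9600 bps.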

8

What Limits Baud Rate ?

• Baud rates are typically limited by electrical signaling properties

• No matter how small the voltage or how short the wire, changing voltages takes time

• Electronics are slow as compared to optics

9

What data rate can a channel sustain ? How is data rate related to bandwidth ?

• Transmitting N distinct signals over a noiseless channel with bandwidth B, max. data rate can be 2B log2 N

• This observation is a form of Nyquist’s Sampling Theorem
– We can reconstruct any waveform with no frequency component above some frequency “F” using only samples taken at frequency 2F

10

What else (besides noise) can limit maximum data rate ?

• Transitions between symbols introduce high frequency components into the transmitted signal

• Such components cannot be recovered (by Nyquist’s Theorem), and some information is lost

• Examples:
– Phase modulation uses only a single frequency (with different phases) for each symbol, but the transitions can require very high frequencies
– Binary voltage encodings (0 Hz within symbols)
– Eye diagrams show voltage traces for all transitions

11

How does Noise Affect these Bounds ?

• In-band (not high-frequency) noise blurs the symbols, reducing the number of symbols that can be reliably distinguished

• Shannon extended Nyquist’s work to

channels with additive white Gaussian noise (a good model for thermal noise)

• From Shannon’s Theorem :

Max. channel capacity C = B log2 (1+S/N)
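As a worked instance of Shannon's formula, assume a voice-grade line with bandwidth B = 3000 Hz and a signal-to-noise ratio of 30 dB, i.e. S/N = 1000. These figures are illustrative assumptions, not values taken from the slides.

```latex
\[
C = B \log_2\!\left(1 + \frac{S}{N}\right)
  = 3000 \times \log_2(1 + 1000)
  \approx 3000 \times 9.97
  \approx 29.9\ \text{kbps}
\]
```

Under these assumptions, no encoding can push such a line much beyond 30 kbps.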

12

Summary of Encoding

• Problems: attenuation, dispersion, noise

• Digital transmission allows periodic regeneration

• Variety of binary voltage encodings
– High frequency components limit to short range
– More voltage levels provide higher data rate

• Carrier frequency and modulation
– Amplitude, frequency, phase, and combination (QAM)

• Nyquist (noiseless) and Shannon (noisy) limits on data rates

13

Framing

14





15

Framing

• Breaks continuous stream/sequence of bits into frames and demarcates units of transfer
• Typically implemented by network adaptor
– Adaptor fetches/deposits frames out of/into host memory

[Figure: Node A and Node B connected through their adaptors; bits flow between the adaptors, frames between the nodes.]

16

Advantages of Framing

• Synchronization recovery
– consider continuous stream of unframed bytes
– recall RS-232 start and stop bits

• Multiplexing of link
– multiple hosts on shared medium
– simplifies multiplexing of logical channels

• Efficient error detection
– frame serves as unit of detection (valid or invalid)
– error detection overhead scales as log N

17

Problem … ?

Recognizing exactly the boundaries of a frame

Must determine the first and last bit of a frame

18

Approaches

• Organized by end of frame detection method

• Approaches to framing
– sentinel (marker, like C strings)
– length-based (like Pascal strings)
– clock-based

19

Approaches

• Other aspects of a particular approach
– bit- or byte-oriented
– fixed- or variable-length
– data-dependent or data-independent length

20

Framing with Sentinels

• End of frame: special byte or bit pattern

• Choice of end of frame marker
– valid data byte or bit sequence e.g. 01111110
– physical signal not used by valid data symbol

Frame format (field widths in bits): Beginning sequence (8) | Header (16) | Body | CRC (16) | Ending sequence (8)

21

Sentinel Based Approach

• Problem: special pattern appears in the payload
• Solution: bit stuffing
– sender: insert 0 after five consecutive 1s
– receiver: delete 0 that follows five consecutive 1s

Node BNode A

x 0 1 1 1 1 1

x 1 1 1 1 1 0

0 x 0 1 1 1 1 1 0

x 1 1 1 1 1 0
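The sender and receiver rules for bit stuffing can be sketched as below. Bits are held as '0'/'1' characters for readability, and the function names are hypothetical; a real adaptor does this in hardware on the bit stream.

```c
#include <string.h>

/* Sender side: copy '0'/'1' characters from in to out, inserting a '0'
 * after every run of five consecutive '1's. out must hold at least
 * strlen(in) + strlen(in)/5 + 1 characters. Returns the output length. */
size_t bit_stuff(const char *in, char *out)
{
    size_t n = 0;
    int ones = 0;
    for (; *in; in++) {
        out[n++] = *in;
        ones = (*in == '1') ? ones + 1 : 0;
        if (ones == 5) {
            out[n++] = '0';   /* stuffed bit */
            ones = 0;
        }
    }
    out[n] = '\0';
    return n;
}

/* Receiver side: drop the bit that follows five consecutive '1's.
 * Assumes the input is payload only (flags already stripped), so that
 * bit is always a stuffed '0'. */
size_t bit_unstuff(const char *in, char *out)
{
    size_t n = 0;
    int ones = 0;
    for (; *in; in++) {
        if (ones == 5) {      /* this is the stuffed bit: skip it */
            ones = 0;
            continue;
        }
        out[n++] = *in;
        ones = (*in == '1') ? ones + 1 : 0;
    }
    out[n] = '\0';
    return n;
}
```

Stuffing the flag pattern 01111110 when it appears as data yields 011111010, so the payload never contains six consecutive 1s.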

22

Sentinel Based Approach

• Problem: equal size frames are not possible– frame length is data-dependent

• Sentinel based framing examples– High-Level Data Link Control (HDLC)

protocol

– Point-to-Point Protocol (PPP)

– ARPANET IMP-IMP protocol

– IEEE 802.4 (token bus)

23

Sentinels: HDLC

• Developed by IBM, standardized by ISO

• Bit-oriented, variable-length, data-dependent

• Special bit pattern 01111110 marks end of frame

• Insert 0 after pattern 011111 in data (bit stuffing)

24

Sentinels: HDLC

• At receiver end, if the frame received is:
– 0111110
• bit stuffed, therefore receive only 011111
• error in end of frame marker, lose two frames
– 01111110: end of frame
– 01111111: error, lose one or two frames

25

Sentinels: PPP

• Byte-oriented, variable-length, data-dependent
• Special flag 01111110 for start-of-text
– address and control fields use default values (0xFF / 0x03)
– protocol field used for demultiplexing (IP, LCP, …)
– LCP (Link Control Protocol) sends control messages
• establishes link between two peers
• negotiates payload and checksum size

• Insert 0 after pattern 011111 in data (bit stuffing)

Frame format: flag | address | control | protocol | payload | checksum | flag

26

Sentinels: ARPANET IMP-IMP

• Interface Message Processors (IMPs): packet-switching nodes in the original ARPANET

• Byte-oriented, variable-length, data-dependent
• Special bytes (aid to understand frame format below)
– DLE: data link escape
– STX/ETX: start and end of text

• DLE byte in data sent as two DLEs (byte stuffing)

Frame format: SYN | SYN | DLE | STX | header | body | DLE | ETX
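Byte stuffing with DLE can be sketched as follows. The byte values are the ASCII control codes; the function names are hypothetical, and the SYN bytes, header, and CRC of the real frame are left out to keep the example short.

```c
#include <stddef.h>

enum { DLE = 0x10, STX = 0x02, ETX = 0x03 };  /* ASCII control codes */

/* Wrap body in DLE STX ... DLE ETX, doubling any DLE in the body.
 * out must hold at least 2*len + 4 bytes. Returns the frame length. */
size_t dle_frame(const unsigned char *body, size_t len, unsigned char *out)
{
    size_t n = 0;
    out[n++] = DLE;
    out[n++] = STX;
    for (size_t i = 0; i < len; i++) {
        if (body[i] == DLE)
            out[n++] = DLE;      /* stuffed escape byte */
        out[n++] = body[i];
    }
    out[n++] = DLE;
    out[n++] = ETX;
    return n;
}

/* Recover the body from one complete frame. Returns the body length,
 * or (size_t)-1 if the frame is malformed. */
size_t dle_unframe(const unsigned char *frame, size_t len, unsigned char *out)
{
    if (len < 4 || frame[0] != DLE || frame[1] != STX)
        return (size_t)-1;
    size_t n = 0;
    for (size_t i = 2; i + 1 < len; i++) {
        if (frame[i] != DLE) {
            out[n++] = frame[i];
        } else if (frame[i + 1] == DLE) {
            out[n++] = DLE;      /* DLE DLE -> one data DLE */
            i++;
        } else if (frame[i + 1] == ETX && i + 2 == len) {
            return n;            /* DLE ETX closes the frame */
        } else {
            return (size_t)-1;
        }
    }
    return (size_t)-1;           /* ran out of bytes before DLE ETX */
}
```

Because every data DLE is doubled, a lone DLE followed by ETX can only be the end-of-frame marker.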

27

Sentinels: IEEE 802.4 (token bus)

• Alternative to Ethernet (802.3) with fairer arbitration
• End of frame marked by encoding violation, i.e., physical signal not used by valid data symbol
• Recall Manchester encoding
– low-high means “0”, high-low means “1”, low-low and high-high are invalid
• Byte-oriented, variable-length, data-independent
• Technique also applicable to bit-oriented framing
• Another example: Fiber Distributed Data Interface (FDDI) uses 4B/5B

28

Length-based Framing

• Include payload length in header
• e.g., DDCMP (byte-oriented, variable-length)
• e.g. RS-232 (bit-oriented, implicit fixed length)
• Problem: count field corrupted
• Solution: catch when CRC fails

DDCMP frame format (field widths in bits): SYN (8) | SYN (8) | Class (8) | Length (14) | Header (42) | Body | CRC (16)

29

Clock-based Framing

• Continuous stream of fixed-length frames
– each frame is 125µs long (all STS formats) (why?)

• Clocks must remain synchronized
• e.g., SONET: Synchronous Optical Network
– dominant standard for long distance transmission
– multiplexing of low-speed links onto one high-speed link
– byte-interleaved multiplexing
– payload bytes are scrambled (data XOR 127-bit pattern)
– STS-n (STS-1 = 51.84 Mbps)

30

SONET Frame Format (STS-1)

[Figure: STS-1 frame laid out as 9 rows by 90 columns; the leading columns of each row are overhead, the rest payload.]

31

Clock-based Framing

• Problem: how to recover frame synchronization
– 2-byte synchronization pattern starts each frame (unlikely to occur in data)
– wait until pattern appears in same place repeatedly

32

Clock-based Framing

• Problem: how to maintain clock synchronization
– NRZ encoding, data scrambled (XOR’d) with 127-bit pattern
– creates transitions
– also reduces chance of finding false sync. pattern

33

SONET Frame Merging

• STS-1 merged bytewise round-robin into STS-3– unmerged (single-source) format called STS-3c

STS-1Hdr

STS-1Hdr

STS-1Hdr

STS-3cHdr

34

SONET Frame Merging

• Problem: simultaneous synchronization of many distributed clocks– not too difficult to synchronize clocks

such that first byte of all incoming flows arrives just before sending first 3 bytes of outgoing flow (buffering ? delays ?)

35

Clock-based Framing

• Problem: simultaneous synchronization of many distributed clocks

• Solution: payload frame floats within clock frame, part of overhead specifies first byte of payload

[Figure: a payload frame of 9 rows by 87 columns floating across the boundary between Frame 0 and Frame 1.]

36

Error Detection

37





38

Error Detection

• Why do we need it ?
– To avoid retransmission of whole packet or message

• What to do if error detected ?
– Discard, and request a new copy of the frame:
• explicitly or implicitly
– Try to correct error, if possible

39

Error Detection

• Validates correctness of each frame

• Errors checked at many levels

• Demodulation of signals into symbols (analog)

• Bit error detection/correction (digital) our main focus– Within network adapter (CRC check)

– Within IP layer (IP checksum)

– Possibly within application as well

40

Error Detection

• Analog errors
– Example of signal distortion
– Discuss to illustrate input to digital level

• Hamming distance
– Parity and voting
– Concept and usefulness
– Hamming codes

• Error bits or error bursts
• Digital error detection techniques: two-dimensional parity, checksum, CRC

41

Analog Errors – Signal Distortion

• Consider RS-232 encoding of character ‘Q’

• Assume idle wire (-15V) before and after signal

• Calculate frequency distribution of signal A(f) using a Fourier transform

• Apply low-pass filter (drop high frequency components)

• Calculate signal using inverse Fourier transform

42

RS-232 Encoding of “Q”


+15

-15

volt

age

time

43

Frequency Distribution of ‘Q’

Encoding

[Figure: frequency distribution of the ‘Q’ encoding; horizontal axis: frequency (multiples of baud rate), 0 to 5; vertical axis: voltage-time, 0 to 5.]

44

Limited-Frequency Signal Response(bandwidth = baud rate)


+15

-15

volt

age

45

Limited-Frequency Signal Response(bandwidth = baud rate/2)


+15

-15

volt

age

46

Review Lecture 6

• Bit rate and baud rate

• Nyquist and Shannon theorem

• Framing: demarcates units of transfer

• Advantages, problem: boundary

• End of frame detection approaches

• Sentinel, length-based, clock-based, bit stuffing

• Error detection: avoid retransmission, discard

1

CS716


By Dr. Amir Qayyum

Lecture No. 7

3

Error Detection and Correction

• Possible binary voltage encoding symbol

• Neighborhoods and erasure region

+15

-15

volt

age

0

1

? (erasure)

• Possible QAM symbol

• Neighborhoods in green

• All other space results in erasure

Input to digital level: valid symbols or erasures

4

Error Detection: How ?

• How to detect error ?
– Add redundant information to a frame to determine errors

• Transmit two complete copies of data
– n redundant bits for n-bit message
– Errors at the same position in both copies go undetected

5

Error Detection: How ?

• We want only k redundant bits for an n-bit message, where k << n
– In Ethernet, 32-bit CRC for 12,000 bits (1500 bytes)

• k bits are derived from the original message

• Both the sender and receiver know the algorithm

6

1-bit Error Detection with Parity

• Every code has even number of 1’s

[Figure: 3-bit codes 000 to 111 at the corners of a cube, and 2-bit codes 00 to 11 at the corners of a square, with the valid codes marked. Parity encoding: gray dots are invalid and indicate errors.]

If only 1 bit flips, it can be detected
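A minimal sketch of even parity over a 7-bit value follows. The function names are hypothetical, and placing the parity bit in the least significant position is an assumption.

```c
#include <stdint.h>

/* Even-parity bit for a 7-bit value: 1 if the data has an odd number
 * of 1s, so that data plus parity always has an even number of 1s. */
unsigned parity_bit(uint8_t data7)
{
    unsigned p = 0;
    for (data7 &= 0x7F; data7; data7 >>= 1)
        p ^= data7 & 1u;
    return p;
}

/* Append the parity bit as the least significant bit of a code word. */
uint8_t add_parity(uint8_t data7)
{
    return (uint8_t)(((data7 & 0x7F) << 1) | parity_bit(data7));
}

/* A received code word is valid if its 1s still add up to an even count. */
int parity_ok(uint8_t code8)
{
    unsigned p = 0;
    for (; code8; code8 >>= 1)
        p ^= code8 & 1u;
    return p == 0;
}
```

Flipping any single bit makes parity_ok fail, while flipping two bits restores an even count and goes undetected.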

7

1-bit Error Correction with Voting

• Every code is copied three times

[Figure: 3-bit codes at the corners of a cube; the valid codes are 000 and 111. Gray dots correct to “1”, blue dots correct to “0”.]

- If only 1 bit flips, it can be corrected

- Even with 2 erasures, bit can be recovered

8

2-bit Erasure Correction With Voting

• Every code is copied three times

Remaining bit in a 2-erasure plane is

not ambiguous

000 100

110010

011

001 101

111

0??

?0?

??0

Cannot correct 1-error and 1-erasure

9

Hamming Distance (1950 Paper)

• Minimum number of bit flips between code words
– 2 flips for parity
– 3 flips for voting

• n-bit error detection
– No code word changed into another code word
– Requires Hamming distance of n+1

10

Hamming Distance (1950 Paper)

• n-bit error correction
– n-bit neighborhood: all code words within n bit flips
– No overlap between n-bit neighborhoods
– Requires Hamming distance of 2n+1
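The distance of a code can be checked by brute force, as in the sketch below (function names are hypothetical). A code with minimum distance d detects d-1 bit errors and corrects (d-1)/2 of them, which matches the n+1 and 2n+1 rules above.

```c
#include <stdint.h>

/* Number of bit positions in which two code words differ. */
unsigned hamming_distance(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;
    unsigned d = 0;
    while (diff) {
        diff &= diff - 1;   /* clear the lowest set bit */
        d++;
    }
    return d;
}

/* Minimum distance over all pairs in a code of n >= 2 words. */
unsigned min_distance(const uint32_t *code, unsigned n)
{
    unsigned best = 32;
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = i + 1; j < n; j++) {
            unsigned d = hamming_distance(code[i], code[j]);
            if (d < best)
                best = d;
        }
    return best;
}
```

For the 3-bit parity code {000, 011, 101, 110} this gives 2, and for the voting code {000, 111} it gives 3.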

11

Hamming Codes (1950 Paper)

• Construction for 1-bit error-correcting codes

• Minimal number of check bits required

• Construction
– Number the bits from 1 upward
– Powers of 2 are check bits
– All others are data bits
– Check bit j is XOR of all other bits k such that (j AND k) = j
– Example: 4 bits of data, 3 check bits
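The construction for 4 data bits and 3 check bits can be sketched as below. Function names are hypothetical, and the order in which data bits are placed in positions 3, 5, 6, 7 is an assumption; the decoder uses the fact that the XOR of the positions of all 1 bits is zero for a valid code word.

```c
#include <stdint.h>

/* Hamming(7,4): bit i of the returned word is position i (1..7);
 * bit 0 is unused. Positions 1, 2, 4 are check bits; 3, 5, 6, 7 carry
 * the four data bits (this assignment of data bits is an assumption). */
uint8_t hamming74_encode(uint8_t data4)
{
    static const int data_pos[4] = {3, 5, 6, 7};
    uint8_t word = 0;
    for (int i = 0; i < 4; i++)
        if ((data4 >> i) & 1)
            word |= (uint8_t)(1u << data_pos[i]);

    for (int j = 1; j <= 4; j <<= 1) {       /* check bits 1, 2, 4 */
        unsigned p = 0;
        for (int k = 1; k <= 7; k++)
            if (k != j && (j & k) == j)
                p ^= (word >> k) & 1u;
        word |= (uint8_t)(p << j);
    }
    return word;
}

/* Correct up to one flipped bit and return the 4 data bits. The
 * syndrome is the XOR of the positions of all 1 bits: 0 for a valid
 * code word, otherwise the position of the flipped bit. */
uint8_t hamming74_decode(uint8_t word)
{
    static const int data_pos[4] = {3, 5, 6, 7};
    unsigned syndrome = 0;
    for (unsigned k = 1; k <= 7; k++)
        if ((word >> k) & 1u)
            syndrome ^= k;
    if (syndrome)
        word ^= (uint8_t)(1u << syndrome);

    uint8_t data4 = 0;
    for (int i = 0; i < 4; i++)
        if ((word >> data_pos[i]) & 1)
            data4 |= (uint8_t)(1u << i);
    return data4;
}
```

Any single flipped bit, whether a data bit or a check bit, is located by the syndrome and flipped back before the data is extracted.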

12

Error Bits or Error Burst

• Common model of errors
– Probability of error per bit
– Error in each bit independent of others
– Value of incorrect bit independent of others

• Burst model
– Probability of back-to-back bit errors
– Error probability dependent on adjacent bits
– Value of errors may have structure

• Why assume bursts?
– Appropriate for some media (e.g., Radio)
– Faster signaling rate enhances such phenomena

13

Digital Error Detection Techniques

• Two-dimensional parity
– Detects up to 3-bit errors
– Good for burst errors

• Internet checksum (used as backup to CRC)
– Simple addition
– Simple in software

• Cyclic redundancy check (CRC)
– Powerful mathematics
– Tricky in software, simple in hardware
– Used in network adapter

14

Two-Dimensional Parity

• Adding one extra bit to a 7-bit code to balance 1s

• Extra parity byte for the entire frame

• Catches all 1, 2 and 3 bit errors and most 4-bit errors

• 14 redundant bits for a 42-bit message, in the example

Data      Parity bits
1011110   1
1101001   0
0101001   1
1011111   0
0110100   1
0001110   1

Parity byte
1111011   0
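The example above can be recomputed with the sketch below. The function names are hypothetical; each row is the 7-bit value followed by its even-parity bit, and the parity byte is the column-wise XOR of all rows.

```c
#include <stddef.h>
#include <stdint.h>

/* Add an even-parity bit to a 7-bit value as the least significant bit. */
static uint8_t with_row_parity(uint8_t data7)
{
    uint8_t p = 0;
    for (uint8_t v = data7 & 0x7F; v; v >>= 1)
        p ^= v & 1;
    return (uint8_t)(((data7 & 0x7F) << 1) | p);
}

/* Fill rows[0..n-1] with data plus row parity and return the parity
 * byte: the column-wise XOR (even parity) of all rows. */
uint8_t two_d_parity(const uint8_t *data7, size_t n, uint8_t *rows)
{
    uint8_t parity_byte = 0;
    for (size_t i = 0; i < n; i++) {
        rows[i] = with_row_parity(data7[i]);
        parity_byte ^= rows[i];
    }
    return parity_byte;
}
```

With the six data values from the slide (0x5E, 0x69, 0x29, 0x5F, 0x34, 0x0E) the function returns 0xF6, i.e. 1111011 followed by 0.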

15

Two-Dimensional Parity

0 1 0 0 0 1 1 1

0 1 1 0 0 0 1 1

0 1 1 0 1 1 1 1

0 1 1 0 0 0 0 0

1 0 0 1 0 0 1 1

0 1 1 0 1 1 0 0

1 1 0 1 0 1 0 0

0 1

1

0 1

1

16

Internet Checksum Algorithm

• Not used at the link level but provides same sort of functionality as CRC and parity

• Idea:
– Add up all words (16-bit integers) that are transmitted
– Transmit the result (checksum) of that sum
– Receiver performs the same calculation on received data and compares the result with the received checksum
– If the results do not match, an error is detected

• 16 redundant bits for a message of any length
• Weak protection, accepted as a last line of defense

17

Internet Checksum Algorithm

View message as a sequence of 16-bit integers; sum using 16-bit ones-complement arithmetic; take ones-complement of the result.

u_short cksum(u_short *buf, int count)
{
    register u_long sum = 0;

    while (count--) {
        sum += *buf++;
        if (sum & 0xFFFF0000) {
            /* carry occurred, so wrap around */
            sum &= 0xFFFF;
            sum++;
        }
    }
    return ~(sum & 0xFFFF);
}

18

Cyclic Redundancy Check

Theory
• Based on finite-field (binary-valued) arithmetic
• Bit string represented as polynomial
• Coefficients are binary-valued
• Divide bit string polynomial by generator polynomial to generate CRC

Practice
• Bitwise XOR’s

19


• Add k bits of redundant data to an n-bit message
– Want k << n
– e.g., k = 32 and n = 12,000 (1500 bytes)

• Represent n-bit message as n-1 degree polynomial
– e.g., MSG=10011010 as M(x) = x^7 + x^4 + x^3 + x^1
– Sender and receiver exchange polynomials

• Let k be the degree of some agreed-upon divisor/generator polynomial
– e.g., C(x) = x^3 + x^2 + 1

20


• Transmit polynomial P(x) that is evenly divisible by C(x)
– Shift left k bits, i.e., M(x)·x^k
– Add remainder of M(x)·x^k / C(x) into M(x)·x^k
• Receiver receives polynomial P(x) + E(x)
– E(x) = 0 implies no errors
• Receiver divides (P(x) + E(x)) by C(x); remainder will be zero ONLY if:
– E(x) was zero (no error), or
– E(x) is exactly divisible by C(x)

21

CRC Example - Sender

• C(x) = x^3 + x^2 + 1 = 1101 generator
• M(x) = x^8 + x^6 + x^5 + x^4 + 1 = 101110001 message

1101 ) 101110001000
       1101
        1101
        1101
           00001000
               1101
                101   Remainder

22

CRC Example - Receiver

• C(x) = x^3 + x^2 + 1 = 1101 generator
• P(x) = x^11 + x^9 + x^8 + x^7 + x^3 + x^2 + 1 = 101110001101 transmitted message

1101 ) 101110001101
       1101
        1101
        1101
           00001101
               1101
                  0   Correct!

23

CRC Example - Receiver
• C(x) = x^3 + x^2 + 1 = 1101 generator
• P(x) = x^11 + x^9 + x^8 + x^7 + x^3 + x^2 + 1 = 101110001101 transmitted message
• Received 101101001101: two bits are flipped

1101 ) 101101001101
       1101
        1100
        1101
           1100
           1101
              1110
              1101
                111   Incorrect!

24

CRC Example - Receiver
• C(x) = x^3 + x^2 + 1 = 1101 generator
• P(x) = x^11 + x^9 + x^8 + x^7 + x^3 + x^2 + 1 = 101110001101 transmitted message
• Received 101101011101: three bits are flipped, yet the error goes undetected

1101 ) 101101011101
       1101
        1100
        1101
           1101
           1101
               1101
               1101
                  0   Incorrectly correct!!!
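The long divisions in these examples can be checked with a few lines of code. The sketch below does bitwise modulo-2 division on values that fit in 32 bits; the function names mod2_remainder and crc_append are illustrative assumptions, not a standard API.

```c
#include <stdint.h>

/* Remainder of dividing an nbits-long bit string by a generator of
   degree k, using modulo-2 (XOR) long division. */
uint32_t mod2_remainder(uint32_t bits, int nbits, uint32_t gen, int k)
{
    for (int i = nbits - 1; i >= k; i--) {
        if (bits & (1u << i))
            bits ^= gen << (i - k);   /* "subtract" the aligned generator */
    }
    return bits;                      /* only the low k bits can remain */
}

/* Sender side: shift the message left by k bits and add the remainder. */
uint32_t crc_append(uint32_t msg, int nbits, uint32_t gen, int k)
{
    uint32_t shifted = msg << k;
    return shifted | mod2_remainder(shifted, nbits + k, gen, k);
}
```

With generator 1101 and message 101110001, crc_append produces 101110001101; the receiver's check is mod2_remainder over the 12 received bits, which is 0 for the intact frame, 111 for the two-bit error, and 0 again for the three-bit error shown above.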

25

Selecting C(x) So That It Does Not Divide E(x)

A well-chosen C(x) detects:
• All single-bit errors, as long as the x^k and x^0 terms have non-zero coefficients
• All double-bit errors, as long as C(x) contains a factor with at least three terms
• Any odd number of errors, as long as C(x) contains the factor (x + 1)

26

Selecting C(x) So That It Does Not Divide E(x)

• Any ‘burst’ error (i.e., sequence of consecutive error bits) for which the length of the burst is less than k bits
• Most burst errors longer than k bits can also be detected
• See Table 2.5 on page 96 for common C(x)

27

Error Detection or Correction ?

• Detection implies discarding the message and waiting for retransmission
– Uses bandwidth

– Introduces latency

28

Error Detection or Correction ?

• Error correction requires sending more redundant bits all the time: forward error-correcting code (FEC)

• Error correction is useful when:
– Errors are quite probable (wireless links)

– Retransmission cost is too high (latency in satellite link, multicast)

29

Topics Already Covered

• Elements of networks: nodes and links

• Building a packet abstraction on a point-to-point link– Transmission methods and challenges

– Limiting factors on data rates

– Defining units of communication data

– Detecting transmission errors

• Next: simulating an error-free channel


Lecture No. 8

3

Reliable Transmission

• Higher level of abstraction (transport layer vs. data link layer)

[Figure: client my-machine sends "my computer's name is my-machine" to server mail.yahoo.com; in one panel the server sees "computer's is my-machine", in the other "my-machine is my computer's name"]

5


• Error-correcting codes are not advanced enough to handle the range of bit and burst errors
– Corrupt frames generally must be discarded
– A reliable link-level protocol must recover from discarded frames
• Goals for reliable transmission
– Make channel appear reliable

– Maintain packet order (usually)

– Impose low overhead / allow full use of link

6


• Reliability accomplished using acknowledgments and timeouts
– ACK is a small control frame confirming reception of an earlier frame
– If no ACK arrives, sender retransmits after a timeout

7


• Automatic Repeat reQuest (ARQ) algorithms
– Stop-and-wait
– Concurrent logical channels
– Sliding window
  • Go-back-N, or selective repeat

• Alternative: forward error correction (FEC)

8

Automatic Repeat reQuest

• Acknowledgement (ACK)
– Receiver tells sender when frame received
– Cumulative ACK (used by TCP): have received specified frame and all previous
– Selective ACK (SACK): specifies set of frames received
– Negative ACK (NACK or NAK): receiver refuses to accept frame now, e.g., when out of buffer space

9

Automatic Repeat reQuest

• Timeout: sender decides that frame was lost and tries again

• ARQ also called Positive Acknowledgement with Retransmission (PAR)

10

Stop-and-Wait
• Send a single frame
• Wait for ACK or timeout
– If ACK received, continue with next frame
– If timeout occurred, send again (and wait)
  • Frame lost in transit; or corrupted and discarded

[Figure: sender and receiver timeline: Frame 0, ACK 0, Frame 1, ACK 1]

11

Acknowledgments and Timeouts

[Figure: four sender/receiver timelines: (a) ACK received before the timeout; (b) frame lost, retransmitted after the timeout; (c) ACK lost, frame retransmitted after the timeout; (d) timeout fires too soon, frame retransmitted although the ACK is on its way]

12

Stop-and-Wait

• If receiver receives a frame correctly, but sender receives the ACK after timeout …
– Sender resends the frame; how does the receiver know whether it is the same frame or the next frame?

13

Stop-and-Wait

• Requires frame identification
– Duplicate frame ?

– Duplicate ACK ?

– 1 bit is enough (if physical network maintains order)

• sender tracks frame ID to send

• receiver tracks next frame ID expected
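A minimal sketch of the receiver side with a 1-bit frame ID, assuming the link preserves order as stated above. The type sw_receiver and the function sw_receive are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

typedef struct {
    unsigned expect;   /* next 1-bit frame ID expected (0 or 1) */
} sw_receiver;

/* Handles an arriving frame with ID `seq`. Always sets *ack to the ID
   being acknowledged; returns true only if the frame is new and should
   be delivered, false if it is a duplicate of the previous frame. */
bool sw_receive(sw_receiver *r, unsigned seq, unsigned *ack)
{
    *ack = seq;                 /* re-ACK duplicates so the sender can move on */
    if (seq == r->expect) {
        r->expect ^= 1u;        /* flip between 0 and 1 */
        return true;
    }
    return false;
}
```

If the ACK for frame 0 arrives late and the sender retransmits frame 0, the receiver sees ID 0 while expecting 1, acknowledges it again, and does not deliver it twice.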

14

Stop-and-Wait State Diagram

[State diagram: four states (send: 0, expect: 0), (send: 1, expect: 0), (send: 0, expect: 1), (send: 1, expect: 1); transitions labeled receive frame 0, receive frame 1, receive ACK 0, receive ACK 1, and receive frame 0 / receive ACK 1]

15

Stop-and-Wait
• Frames delivered reliably and in order
• Is that enough?
– No, we need performance, too.
• Problem: keeping the pipe full
• Example
– 1.5 Mbps link x 45 ms RTT = 67.5 Kb (~8 KB)
– 1 KB frames imply 182 Kbps (1/8th link utilization)
– Want the sender to transmit 8 frames before waiting for ACK
– Throughput remains 182 Kbps regardless of the link bandwidth!
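The numbers in this example follow from two one-line formulas. The sketch below uses the same simplification as the slide (one frame per RTT, transmission time ignored); the function names are illustrative.

```c
/* Bandwidth x delay product: bits needed to keep the pipe full. */
double pipe_capacity_bits(double bandwidth_bps, double rtt_s)
{
    return bandwidth_bps * rtt_s;
}

/* Stop-and-wait sends one frame per round trip, so the throughput
   does not depend on the link bandwidth in this simplified model. */
double stop_and_wait_bps(double frame_bits, double rtt_s)
{
    return frame_bits / rtt_s;
}
```

For a 1.5 Mbps link with a 45 ms RTT, pipe_capacity_bits gives 67,500 bits, and a 1 KB (8192-bit) frame per RTT gives roughly 182 Kbps.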

16

Concurrent Logical Channels
• Multiplex several logical channels over a single point-to-point physical link (include channel ID in header)
• Use stop-and-wait for each logical channel
• Maintain three bits of state for each logical channel:
– Boolean saying whether channel is currently busy

– Sequence number for frames sent on this channel

– Next sequence number to expect on this channel

• ARPANET IMP-IMP supported 8 logical channels over each ground link (16 over each

17

Concurrent Logical Channels

• Header for each frame includes a 3-bit channel number and a 1-bit sequence number
– Same number of bits (4) as the sliding window requires to support up to 8 outstanding frames on the link

18

Concurrent Logical Channels

• Characteristics
– Separates reliability from flow control and frame order
– Each channel limited by stop-and-wait bandwidth
– Aggregate bandwidth uses full physical link
– Supports multiple communicating processes
– Can use more than one channel per process
  • But no frame ordering between channels

19

Approaches for Reliable Transmission …

• Stop-and-wait
– Provides reliable, in-order delivery
– Sacrifices performance
• Multiple logical channels
– Provides reliable delivery at full link bandwidth

– Sacrifices packet ordering

• Sliding window: meets all three goals

20

Sliding Window
• Allow sender to transmit multiple frames before receiving an ACK, thereby keeping the pipe full
• Upper bound on outstanding un-ACKed frames
• Also used at the transport layer (by TCP)

[Figure: sender/receiver timeline with several frames and ACKs in flight at once]

21

Sliding Window Concepts
• consider ordered stream of data
– broken into frames
– stop-and-wait
  • window of one frame
  • slides along stream over time
• sliding window algorithms generalize this notion
– multiple-frame send window
– multiple-frame receive window

22

Sliding Window Concepts

• send window
– fixed length, containing numbered frames
– starts at earliest unacknowledged frame
– only frames in window sent over network

[Figure legend: green = sent and acknowledged; red = sent (or can be sent) but not acknowledged; blue = available, but not within send window]

23

Sliding Window Concepts

• receive window
– fixed length (unrelated to send window)
– starts at earliest unreceived frame
– only frames in window are buffered

[Figure legend: green = received and delivered; red = received and buffered; blue = received and discarded]

24

Sliding Window - Sender
• Assign sequence number to each frame (SeqNum)
• Maintain three state variables:
– send window size (SWS)
– last acknowledgment received (LAR)
– last frame sent (LFS)
• Maintain invariant: LFS - LAR ≤ SWS
• Advance LAR when ACK arrives
• Buffer up to SWS frames and associate timeouts

[Figure: timeline of frames 11 to 20 with LAR = 13 and LFS = 18, so LFS - LAR ≤ SWS]
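The sender-side bookkeeping can be sketched as follows. For simplicity this assumes sequence numbers that never wrap around and leaves out frame buffering and timers; the struct and function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t sws;   /* send window size */
    uint32_t lar;   /* last acknowledgment received */
    uint32_t lfs;   /* last frame sent */
} sw_sender;

/* True if one more frame may be sent while keeping LFS - LAR <= SWS. */
bool sender_can_send(const sw_sender *s)
{
    return s->lfs - s->lar < s->sws;
}

/* Records transmission of the next frame; returns its sequence number. */
uint32_t sender_send(sw_sender *s)
{
    return ++s->lfs;
}

/* Advances LAR on a cumulative ACK that falls inside the window. */
void sender_on_ack(sw_sender *s, uint32_t ack)
{
    if (ack > s->lar && ack <= s->lfs)
        s->lar = ack;
}
```

With SWS = 5, LAR = 13 and LFS = 18 the window is full; an ACK for frame 15 slides it forward and allows frames 19 and 20 to be sent.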

25

Sliding Window - Receiver
• Maintain three state variables
– receive window size (RWS)
– largest frame acceptable (LFA)
– next frame expected (NFE)
• Maintain invariant: LFA - NFE + 1 ≤ RWS
• Frame SeqNum arrives:
– if NFE ≤ SeqNum ≤ LFA, accept
– if SeqNum < NFE or SeqNum > LFA, discard
• Send cumulative ACKs

[Figure: timeline of frames 11 to 20 with NFE = 13 and LFA = 17, so LFA - NFE + 1 ≤ RWS]
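A sketch of the receiver's accept/discard test and cumulative ACK, again assuming sequence numbers that do not wrap. The window size RWS = 5 and all names are illustrative; frame payloads are omitted and only arrival flags are kept.

```c
#include <stdbool.h>
#include <stdint.h>

#define RWS 5u   /* receive window size (illustrative value) */

typedef struct {
    uint32_t nfe;            /* next frame expected */
    bool     buffered[RWS];  /* buffered[i]: frame nfe + i has arrived */
} sw_rwindow;

/* Handles an arriving frame. Returns false if the frame lies outside
   the window and is discarded. *ack receives the cumulative ACK,
   i.e. the last frame received in order (nfe - 1). */
bool rwindow_on_frame(sw_rwindow *w, uint32_t seq, uint32_t *ack)
{
    uint32_t lfa = w->nfe + RWS - 1;          /* largest frame acceptable */
    bool accepted = (seq >= w->nfe && seq <= lfa);

    if (accepted) {
        w->buffered[seq - w->nfe] = true;
        /* Deliver in-order frames and slide the window forward. */
        while (w->buffered[0]) {
            for (uint32_t i = 0; i + 1 < RWS; i++)
                w->buffered[i] = w->buffered[i + 1];
            w->buffered[RWS - 1] = false;
            w->nfe++;
        }
    }
    *ack = w->nfe - 1;
    return accepted;
}
```

Starting from NFE = 13, frame 15 is buffered but the ACK stays at 12; once 13 and 14 arrive, the window slides to NFE = 16 and the ACK jumps to 15.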

26

Sliding Window Issues
• When a timeout occurs, data in transit decreases
– Pipe is no longer full when packet losses occur
– The problem gets worse the longer it takes to detect a packet loss
• Early detection of packet losses improves performance:
– Negative Acknowledgements (NACKs)
– Duplicate Acknowledgements
– Selective Acknowledgements (SACKs)
• Adds complexity but helps keep the pipe full

27

Sliding Window Classification

• Stop-and-wait: SWS=1, RWS=1

• Go-back-N: SWS=N, RWS=1

• Selective repeat: SWS=N, RWS=M (usually M = N)

28

Sliding Window: Go-back-N

• Go-back-N (SWS=N, RWS=1)

• Receiver only buffers 1 frame

• If frame lost, sender may need to resend N frames
– i.e., sender goes back N frames

• Variations
– How long is the frame timeout?

– Does receiver send NACK for out-of-sequence frame?

29

Sliding Window: Selective Repeat

• Selective repeat (SWS=N, RWS=M)
• Receiver buffers M frames
• If frame lost, sender must resend only
– frames lost within receive window
– frames past end of receive window
• Variations
– How long is the frame timeout?
– Use cumulative or per-frame ACK?
– Does protocol adapt timeouts?
– Does protocol adapt SWS and/or RWS?

30

Sequence Number Space

• SeqNum field is finite; sequence numbers wrap around

• Sequence number space must be larger than number of outstanding frames (SWS)

• SWS <= MaxSeqNum-1 is not sufficient
– suppose 3-bit SeqNum field (0..7); SWS=RWS=7
– sender transmits frames 0..6, which arrive successfully (receiver window advances)
– ACKs are lost; sender retransmits 0..6
– receiver expecting 7, 0..5, but receives second incarnation of 0..5, assuming them as 8th to 13th frame

31

Required Sequence Number Space ?

• Assume SWS=RWS (simplest, and typical)
– Sender transmits full SWS
– Two extreme cases at receiver
  • None received (waiting for 0…SWS-1)
  • All received (waiting for SWS…2*SWS-1)
• All possible packets must have unique SeqNum
• SWS < (MaxSeqNum+1)/2 or SWS+RWS < MaxSeqNum+1 is the correct rule
• Intuitively, SeqNum “slides” between two halves of sequence number space
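The rule can be written as a one-line check. In this sketch max_seq_num is taken to be the number of distinct sequence numbers (8 for a 3-bit field), which matches the SWS = 7 example above; the function name is illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the window sizes satisfy SWS + RWS < MaxSeqNum + 1, where
   max_seq_num is the number of distinct sequence numbers. */
bool window_sizes_safe(uint32_t sws, uint32_t rws, uint32_t max_seq_num)
{
    return sws + rws < max_seq_num + 1;
}
```

For the 3-bit field, SWS = RWS = 7 fails the check, while SWS = RWS = 4 passes.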

32

Sliding Window Assumptions

• Sliding window protocol leads to in-order delivery of all frames, with the following assumptions
– Frames can be delayed an arbitrary but finite amount of time
– Frames can be lost
– Frames can arrive with detectable errors
– Frames cannot arrive with undetectable errors
– Frames arrive in the order sent (possibly with errors)

• Are these assumptions adequate?

33

Sliding Window Correctness

• Need one more assumption:
– Any given frame is received without errors after a finite number of retransmissions
• Proof in two steps
– establish correctness assuming infinite sequence number space
– show that finite sequence number space does not affect the result as long as it has at least 2 x max(SWS, RWS) possible numbers

34

Separation of Concerns

• Carefully distinguish different functions rolled together in one mechanism
– Is each function necessary here?
– Is each function supported in the most effective way?
• Sliding window combines
– Reliable delivery
– Ordered delivery
– Flow control

• Is it the right thing to do at link level ?


Lecture No. 9


What Next ?

• Arbitrating access to a shared medium

• After that: network adapters and example protocols

8

Shared Access Networks

Outline
• Bus (Ethernet)

• Token ring (FDDI)

• Wireless (802.11)

• Network Adapter

9

Shared Access Media

• Arbitrating access to a shared medium
– Multiple hosts on a single link

10

Multiple Access Media

• Multiple senders on some media
– Buses (Ethernet, including links in switched form)

– Radio, satellite

– Token rings…

11

Multiple Access Media

• Need method to moderate access
– Fair arbitration
– Good performance

12

Shared Media

• Communication needs vary
– Over time

– Between hosts

• Network is not fully utilized

13

Shared Media

• Recall methods for multiplexing
– Frequency-division multiplexing (FDM, separate bands)

– Time-division multiplexing (TDM, synchronous time slots)

– Statistical TDM (STDM, time slots on demand)

• STDM most appropriate with stated assumptions

14

Shared Media: Problems
• Problem: demands can conflict, e.g., two hosts send simultaneously
– STDM does not address this problem (centralized)

– Solution is a medium access control (MAC) algorithm

15

Shared Media: Solutions
• Three solutions (out of many)

– Carrier sense multiple access with collision detection (CSMA / CD)

• Send only if medium is idle

• Stop sending immediately if collision detected

– Token ring/FDDI pass a token around a ring; only token holder sends

– Radio / wireless (IEEE 802.11)

16

Ethernet

17

History of Ethernet

• Developed by Xerox PARC in mid-1970s
• Roots in Aloha packet-radio network
• Standardized by Xerox / DEC / Intel in 1978
• Similar to IEEE 802.3 standard
• IEEE 802.3u standard defines Fast Ethernet (100 Mbps)
• New switched Ethernet now popular

18

Ethernet Topologies

• Bus— all nodes connected to a wire

• Star— all nodes connected to a central repeater

• Combinations thereof

. . .

19

Ethernet Adaptor

• Segment of up to 500 m
• Nodes tap into segments
• Taps must be at least 2.5 m apart
• Transceiver performs carrier sensing
• Transceiver transmits and receives signals
• Protocol is implemented in the adaptor

[Figure: host with adaptor, connected through a transceiver to the Ethernet cable]

20

Ethernet – Alternative Technologies

• Can be constructed from a thinner cable (10Base2) rather than the thick 50-ohm coax cable (10Base5)

• Newer technology uses 10BaseT (twisted pair)
– Several point-to-point segments coming out of a multiway repeater called a “hub”

[Figure: hosts attached to two interconnected hubs]

21

Ethernet Components

10Base5 (ThickNet)

Controller (Ethernet Card)

Vampire Tap

Transceiver

Bus Topology

22

Ethernet Components

10Base2 (ThinNet)


BNC T-junction

Transceiver

Bus Topology

23

Ethernet Components

10BaseT (Twisted Pair)


Hub

Star Topology

24

Ethernet – Multiple Segments

• Repeaters forward the broadcast signal on all outgoing segments (10Base5)

• Maximum of 4 repeaters (2500m), 1024 hosts

[Figure: several segments with hosts, joined by repeaters]

25

Ethernet Packet Frame

• Preamble allows the receiver to synchronize with signal

• Frame must contain at least 46 bytes of data so that collisions can be detected

• 802.3 standard substitutes a length field for the type field
– Type field (demux key) is then the first thing in the data portion
– A device can accept both frame formats: a value greater than 1500 indicates a type

Frame layout (bits): Preamble (64) | Dest addr (48) | Src addr (48) | Type (16) | Body | CRC (32)

26

Ethernet Address

• Addresses
– Unique, 48-bit unicast address assigned to each adapter
– Example: 8:0:e4:b1:2

– Broadcast: all 1s

– Multicast: first bit is 1

– Promiscuous mode

• Problem remains: A distributed algorithm that provides fair access

27

Ethernet MAC – CSMA/CD

• Multiple access
– Nodes send and receive frames over a shared link
• Carrier sense
– Nodes can distinguish between an idle and busy link
• Collision detection
– A node listens as it transmits to detect collision

28

CSMA/CD MAC Algorithm

• If line is idle (no carrier sensed)
– Send immediately

– Upper bound message size of ~1500 bytes

– Must wait 9.6µs between back-to-back frames

29


• If line is busy (carrier sensed) …
– Wait until the line becomes idle and then transmit immediately
– Called 1-persistent (special case of p-persistent)
• If collision detected
– Stop sending data and send a jam signal
– Try again later

30

Collision Detection

How to ensure that my-machine knows about the collision?

[Figure: my-machine starts transmission at time 0; the signal is almost at your-machine at time T, when your-machine starts its own transmission; the two signals collide]

31

Constraints on Collision Detection

• In our example, consider
– my-machine’s message reaches your-machine at T

– your-machine’s message reaches my-machine at 2T

• Thus, my-machine must still be transmitting at 2T

32


• Specifics of IEEE 802.3
– Bounds 2T to 51.2 microseconds
– Packet must be at least 64B long
• Jam after the collision, for 32 bits, then stop transmitting frame (runt frame of 96 bits)
– Ensures that all hosts notice collision

33

Review Lecture 9

• Shared access networks

• Shared media: issues

• Ethernet Topologies, technologies

• Segments

• Frame format, Addresses

• MAC protocol: CSMA/CD


Lecture No. 10


Ethernet Min. Frame Size

• RTT on a maximally configured Ethernet of 2500 m, with 4 repeaters, is about 51.2 μs
– 2500 m / (2 x 10^8 m/s) = 12.5 μs one way
– 2 x 12.5 μs = 25 μs, plus repeater delays
• 51.2 μs at 10 Mbps corresponds to 512 bits (64 bytes)
• Therefore, the minimum frame length for Ethernet is 64 bytes (14-byte header + 46 bytes of data + 4-byte CRC)
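The arithmetic behind the 64-byte minimum can be sketched in two helper functions. The names are illustrative, and the 2 x 10^8 m/s propagation speed is the value used on the slide.

```c
/* One-way propagation delay in seconds over a cable of length_m metres. */
double one_way_delay_s(double length_m, double speed_mps)
{
    return length_m / speed_mps;
}

/* Number of bits a sender puts on the wire during one round-trip time;
   a frame must be at least this long for the sender to still be
   transmitting when a collision report can arrive. */
double min_frame_bits(double rate_bps, double rtt_s)
{
    return rate_bps * rtt_s;
}
```

one_way_delay_s(2500, 2e8) is about 12.5 microseconds, and min_frame_bits(10e6, 51.2e-6) is about 512 bits, i.e. 64 bytes.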

10

Collision Detection

my-machine notices collision before transmission ends at 2T

[Figure: my-machine starts transmission at time 0; the signal is almost at your-machine at time T, when your-machine starts its own transmission; the collision reaches my-machine by 2T]

11

Retry After the Collision
• How long should a host wait to retry after a collision?
– Binary exponential backoff
  • maximum backoff doubles with each failure (exponential)
  • after N failures, pick an N-bit number
  • 2^N discrete possibilities from 0 to maximum

12

Retry After the Collision

• Delay and try again algorithm
– 1st time: 0 or 51.2 us
– 2nd time: 0, 51.2, 102.4, or 153.6 us
– 3rd time: k x 51.2 us, for randomly selected k = 0..7
– nth time: k x 51.2 us, for randomly selected k = 0..2^n - 1
– Give up after several tries (usually 16)
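A small sketch of the delay selection described above. To keep it deterministic, the random value is supplied by the caller; the function names and the SLOT_TIME_US constant are illustrative, and no cap on the exponent is modeled beyond what the slide states.

```c
#include <stdint.h>

#define SLOT_TIME_US 51.2   /* slot time in microseconds, from the slide */

/* Number of slot choices after the n-th collision: k = 0 .. 2^n - 1. */
uint32_t backoff_choices(unsigned n)
{
    return 1u << n;
}

/* Maps a caller-supplied random value onto k and returns the delay
   in microseconds before the next attempt. */
double backoff_delay_us(unsigned n, uint32_t random_value)
{
    uint32_t k = random_value % backoff_choices(n);
    return k * SLOT_TIME_US;
}
```

After the first collision there are 2 choices (0 or 51.2 us); after the second there are 4 (0 to 153.6 us).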

13

Binary Exponential Backoff

• Why use fixed time slots?

• How long should the slots be?

[Figure: time axis starting at the time of the last collision, marked 0, Ts, 2Ts, 3Ts, showing the choices after 1 collision and the choices after 2 collisions]

14


• Ts is 51.2 microseconds for IEEE 802.3

• Consider that k hosts collide
– each picks a random number from 0 to 2^N - 1

– if minimum value is unique

• all other hosts see busy line

• remember that Ethernet RTT < 51.2 microseconds

15


• Consider that k hosts collide
– if minimum value is not unique

• hosts in minimum value slot collide again

• following slot idle

• consider next smallest backoff value

16

Ethernet Frame Reception

• Sender handles all access control
• Receiver simply pulls frames from network
• Ethernet controller/card
– sees all frames

– selectively passes frames to host processor

17

Ethernet Frame Reception

• Acceptable frames
– addressed to the host, or
– addressed to a multicast address to which the host belongs, or
– addressed to the broadcast address, or
– anything, if in promiscuous mode (required for packet sniffing and tcpdump)

18

Ethernet Collision Detection
• Bus topology Ethernets
– Transceiver handles
  • Carrier detection
  • Collision detection
  • Jamming after a collision
– Transceiver sees voltage sum
  • Outgoing signal + Incoming signal
– Looks for voltages impossible for local signal alone
  • Attenuation can prevent detection
  • Limits segment length

19

Ethernet Collision Detection

• Hub topology Ethernets
– Controller/card handles carrier detection
– Hub handles
  • Collision detection
  • Jamming after a collision
– Detect transmission activity on each line
– If more than 1 line (host) active
  • Assert collision to all lines
  • Continue until no lines active

– Attenuation is less critical

20

Experience With Ethernet

• Number of hosts limited to 200 in practice, standard allows 1024

• Range much shorter than 2.5 km limit in standard

• Round-trip time is typically 5 or 10 μs, not 50μs

21

Experience With Ethernet

• Higher-level flow control (i.e., TCP) limits load
– 30% maximum load recommended

– Large packets recommended

• Star topologies easier to administer than Bus
– Failure isolation, node addition w/o complex topology

• (Backwards) Compatibility more important than elegance or raw performance (1 > 0)

22

Token Ring

23

Token Ring Overview

• Token Ring network “was” a candidate to replace Ethernet; used in some MAN backbones
– 16 Mbps IEEE 802.5 (based on earlier 4 Mbps IBM ring)

– 100Mbps Fiber Distributed Data Interface (FDDI)

24

Token Ring Rationale

• Why emulate a shared medium with point-to-point links?

• Why a shared medium?
– convenient broadcast capabilities

– switches are costly

• Why emulation?
– simpler MAC algorithm: only have 2 wires

– fairer access arbitration

– fully digital (802.3 collision detection requires analog)

25

IBM Token Ring – IEEE 802.5

• Data flows in a particular direction
– a node receives frames from its upstream neighbor

– a node forwards frames to its downstream neighbor

26

IBM Token Ring – IEEE 802.5

• Ring is viewed as a single shared medium
– each node is allowed to transmit according to some distributed algorithm for medium access

– all nodes see all frames; destination saves a copy of frame as it flows past

• The term “token” indicates the way the access to shared channel is managed

27

Token Ring State Diagram

[State diagram: two states, "No token (pass frames along)" and "Token (insert new frames)"; transitions "get token" and "release token"]

28

Token in a Token Ring

• Token is a special bit pattern that rotates around the ring
– A node must capture the token before transmitting
– A node releases the token after it is done transmitting
  • Immediate release: token follows last frame (FDDI)

• Delayed release – after last frame returns to sender

29

Token Release

(a) immediate release (b) delayed release

30

Token in a Token Ring

• Remove your frame when it comes back around
– Transmit another frame or re-insert the token
• Stations get round-robin service as the token circulates around the ring

31

Review Lecture 10

• CSMA/CD MAC algorithm

• Constraints on collision detection

• Min frame size – RTT 51.2us

• Retries after collision

• Binary exp backoff: Min value unique

• Frame reception: acceptable frames

• Bus and Hub topology: CD handle

• Exp: nodes, dist, RTT, 30% load, large frames

32

Review Lecture 10

• Token Ring

• MAN backbone

• IBM, 802.5 token ring, FDDI

• Rationale: emulate shared med p2p links

• Directional data flow

• Shared med: see all frames, distr MAC algo

• Token – special bit pattern

• Capture and release – round robin service


Lecture No. 11


Physical Properties

• A node is connected into the ring using a relay
– Relay is open as long as the station is alive (a)
– Relay closes and bypasses the station if it is dead (b)

[Figure: host attached through a relay between the link from the previous host and the link to the next host: (a) relay open, host in the ring; (b) relay closed, host bypassed]

6

Physical Properties

• Several relays are packed into a single box: multistation access unit (MSAU)
– Stations can be added/removed by plugging/unplugging
– IBM token ring requires MSAU but 802.5 does not

7

[Figure: MSAU connecting four hosts, with one link from the previous MSAU and one link to the next MSAU]

8

Physical Properties

• Data rate can be 4 Mbps or 16 Mbps
• Encoding of bits uses differential Manchester

• Ring may have up to 250 (802.5) or 260 (IBM) nodes

• Physical medium is twisted pair (IBM Token Ring)

9

Token Ring MAC

• Network adaptor contains receiver, transmitter and some storage of bits between them

• Token circulates if no station has anything to send
– Ring must have enough capacity to store the entire token
– At least 24 stations with 1-bit storage each for the 24-bit token (if propagation delay is negligible)
– A ring with too little storage is avoided by designating a monitor, which can add delay

10

Token Ring MAC

• Any station that has data to send can seize the token

• In 802.5, seizing the token simply means modifying 1 bit in the second byte of the token

• First two bytes of modified token become preamble for the next frame

11

Token Ring MAC

• The station is then allowed to send one or more frames

• Destination address may be a multicast or a broadcast address

• Intended recipient copies the frame into a buffer

12

Token Ring MAC

• Sender is responsible for removing these frames

• Frame longer than the capacity of ring is drained by the sender while still transmitting its later part

13

Frame Format

• “Illegal” Manchester codes in the start and end delimiters
• Frame priority and reservation bits in access control byte
• Demux key in frame control byte
• A and C bits for reliable delivery, in status byte

Frame layout (bits): Start delimiter (8) | Access control (8) | Frame control (8) | Dest addr (48) | Src addr (48) | Body (variable) | CRC (32) | End delimiter (8) | Frame status (8)

14

Token Ring MAC Issue

• A node captures the token and sends as much data as it has …

– A node could keep the token for arbitrarily long time, and other nodes are not given the turn

– Favors nodes with large data over nodes having a small message to send

• How much data is a node allowed to transmit each time it captures the token?

15

Timed Token Algorithm

• Token Holding Time (THT)
– Upper limit on how long a station can hold the token
– Before putting each frame on the ring, a node checks that transmitting it would not exceed the THT
– Long THT achieves better utilization with few senders
– Short THT helps when multiple nodes have data to send

16


• Token Rotation Time (TRT)
– How long it takes the token to traverse the ring
– TRT ≤ ActiveNodes x THT + RingLatency

17

Reliable Delivery

• The A and C bits in the frame trailer support reliable delivery

• Both bits are initially set to 0

• Destination sets the A bit if it sees the frame and sets the C bit if it copies the frame into its adaptor

18

Reliable Delivery

• If sender receives bit A=0, it assumes destination is absent (not functioning)

• If A bit is set but not bit C, it means destination is unable to accept the frame
– Sender retransmits this frame later

19

Token Ring Packet Priorities

• Token contains 3-bit priority field

• Token priority changes over time by using reservation bits in the frame header

20

Token Ring Packet Priorities

• A station wanting to send a priority n packet can set the reservation bits to n, if they are not already set to a greater value
– it captures the token when the current sender releases it with priority set to n

• Strict priority scheme: no lower-priority packets get sent when higher-priority packets are waiting

21

Token Maintenance

• Token rings have a designated monitor node

• Any station can become the monitor according to a well defined procedure

• Monitor is elected when the ring is first connected, or when the current monitor fails

22

Token Maintenance

• Monitor periodically announces its presence

• Claim token sent by a station seeing no monitor
– if the sender receives back the claim token, it becomes monitor
– if another station is also contending for monitor, some rule defines the monitor

23

Monitor’s Role

• May add additional delay into the ring

• Assures that a valid token is present in the ring, because the token may be lost:
– No token when initializing the ring
– Bit error corrupts token pattern
– Node holding the token crashes

24

Monitor’s Role

• Maintains a timer to detect the missing token
– Timer = NumStations x THT + RingLatency
– When the timer expires, it creates a new token
• Checks for corrupted frames
– Frames with a checksum error or invalid format may otherwise circulate forever
– Monitor drains them off the ring

25

Monitor’s Role

• Checks for orphaned frames
– sending station died before removing the frame
– “Monitor” bit is used to detect orphaned frames
– Initially 0, and set to 1 when a frame passes the monitor
– If the monitor sees a frame with this bit already set, it drains it off

26

Review Lecture 11

• Token Ring phy properties: MSAU, speed,#

• MAC, storage, circulation, seize+ modify

• Send 1+ frame, recipient cpy, remove frame

• Frame format, resv, priority, A, C bits

• MAC issues, THT, TRT

• Reliable delivery, A, C

• Frame priorities, resv

• Maintenance, monitor, claim, role


Lecture No. 12

3

Fiber Distributed Data Interface
• Similar to 802.5/IBM token rings but runs on fiber
• Consists of a dual ring: two independent rings that transmit data in opposite directions at 100 Mbps
• Tolerates a single link break or node failure (self-healing ring)

[Figure: dual ring (a) in normal operation and (b) after a failure]

4

FDDI - Concentrator
• Allows nodes to attach using a single cable: single attachment station (SAS)
• Dual connected nodes still exist: dual attachment station (DAS)
• Concentrator attaches several SASs to dual ring
– Uses optical bypass to isolate failed SAS

[Figure: concentrator (DAS) on the dual ring between its upstream and downstream neighbors (DAS), with four SASs attached]

5

FDDI – Physical Properties

• Variable size buffer (9 - 80 bits) between input and output interfaces (10 ns bit time)
– Not required to fill buffer before starting transmission

• Maximum 500 stations, maximum 2 km distance between any pair of stations

6

FDDI – Physical Properties

• Total 200 km fiber: dual nature implies 100 km cable connecting all stations

• Physical media other than fiber are possible: coax or twisted pair cable

• Uses 4B/5B encoding

7


• Token Holding Time (THT)– Upper limit on how long a station can

hold the token

– configured to some suitable value

• Token Rotation Time (TRT)– How long it takes the token to traverse

the ring (time since a host released the token)

–TRT <= ActiveNodes x THT + RingLatency
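
As a quick worked example of the bound above, the following sketch evaluates it for made-up numbers. The function name and the sample values (10 active nodes, 10 ms THT, 2 ms ring latency) are assumptions chosen only for illustration.

```python
def max_trt(active_nodes: int, tht_ms: float, ring_latency_ms: float) -> float:
    """Upper bound on token rotation time: ActiveNodes x THT + RingLatency."""
    return active_nodes * tht_ms + ring_latency_ms


# Hypothetical ring: 10 active nodes, THT = 10 ms, ring latency = 2 ms.
print(max_trt(10, 10.0, 2.0))  # worst-case TRT in milliseconds
```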

8


• Target Token Rotation Time (TTRT)–“agreed-upon” or negotiated

upper bound on TRT

9

MAC Algorithm

• Each node measures TRT between successive token arrivals

• If measured-TRT > TTRT–Token is late

–Can not send data

10

MAC Algorithm

• If measured-TRT < TTRT– Token is early so OK to send

– Send data for remaining time until either

• No more data to send

• THT >= (TTRT – measured TRT)
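
The early/late token rule from the two MAC Algorithm slides can be written as a small function. This is a minimal sketch: the function name is hypothetical, and treating a token that arrives exactly on time (measured TRT equal to TTRT) as "no budget" is an assumption, since the slides only cover the strictly greater and strictly smaller cases.

```python
def send_budget(measured_trt: float, ttrt: float) -> float:
    """Time a node may hold the token to send data on this token arrival."""
    if measured_trt >= ttrt:
        return 0.0  # token is late: the node cannot send data
    # Token is early: the node may send until THT reaches TTRT - measured TRT.
    return ttrt - measured_trt
```

For example, with TTRT = 10 ms a node that measures a TRT of 6 ms may send for up to 4 ms, while a node that measures 12 ms must pass the token on.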

11

FDDI MAC Issue

• If a node has lots of data, it holds the token for the maximum allowed time

• When a downstream neighbor gets the token, its measured TRT >= TTRT–It cannot transmit its frame!

12

FDDI MAC Issue

• What if the downstream neighbor has some urgent data to send ?

13

FDDI Traffic Classes

• Synchronous traffic–Latency sensitive–Gets higher priority–Can always send data

14

FDDI Traffic Classes

• Asynchronous traffic–Sensitive to throughput rather

than delay–Lower priority–Can send only if token is early

• May cause the time to exceed by one FDDI frame

15

Bounded Priority Traffic

• If a node has large amount of synchronous data– It will send regardless of measured TRT

– TTRT will become meaningless !!!

• Therefore, total synchronous data during one token rotation is bounded by TTRT

16

Bounded Priority Traffic

• Worst case: 2 x TTRT between seeing the token
– One TTRT is consumed first by asynchronous data

– Another TTRT is then consumed by synchronous data

• Back-to-back 2xTTRT rotations not possible

17

Token Maintenance

• Monitoring for the lost token–No token when initializing ring

–Bit errors corrupt token pattern

–Node holding the token crashes

18

Token Maintenance

• Monitoring for a valid token–Should periodically see valid

transmission (frame or token)

–Max. gap = ring latency + max frame <= 2.5ms

• Set 2.5ms timer; start negotiations if it fires

19

Token Maintenance• The procedure when a node

– Joins the ring (startup)– Suspects a failure

• Claim frame is used in order to– Generate a new Token– Agree on TTRT (so that an

application can meet its timing constraints)

• A node can send a claim frame without holding the token

20

Token Maintenance Procedure

• A node sends a claim frame including its TTRT bid

• When a node receives a claim frame, it compares the bid in the frame with its own bid
– If its own bid is higher, it updates its TTRT to the bid in the frame and forwards the frame

– If its bid is lower, it replaces with its own claim frame

– If bids are equal, higher address node wins

21

Token Maintenance Procedure

• If a node’s claim frame

returns back to it, it knows:–Its bid was the lowest–Everyone knows TTRT–It can now insert new token
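
The outcome of the claim procedure can be sketched as follows. This is a simplified model, not the FDDI state machine: every node is assumed to issue one claim, a claim is assumed to survive only if it beats the bid of every other node it passes, and the `(address, bid)` tuples are hypothetical.

```python
def beats(a, b):
    """True if claim a = (address, ttrt_bid) wins over claim b."""
    (addr_a, bid_a), (addr_b, bid_b) = a, b
    if bid_a != bid_b:
        return bid_a < bid_b  # lower TTRT bid wins
    return addr_a > addr_b    # equal bids: higher address wins


def run_claim(ring):
    """Return the (address, bid) whose claim frame makes it all the way around."""
    n = len(ring)
    for start in range(n):
        claim = ring[start]
        # The claim survives only if no other node on the ring replaces it.
        if all(beats(claim, ring[(start + k) % n]) for k in range(1, n)):
            return claim
    return None


ring = [(1, 8.0), (2, 4.0), (3, 4.0), (4, 6.0)]
print(run_claim(ring))  # node 3 wins: lowest bid, higher address on the tie
```

The winner's bid becomes the TTRT that every node has recorded, and the winner inserts the new token.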

22

Frame Format

• 4B/5B control symbols for start and end of frame

• Control Field
– 1st bit: asynchronous (0) versus synchronous (1) data
– 2nd bit: 16-bit (0) versus 48-bit (1) addresses
– Last 6 bits: demux key (includes reserved patterns for token and claim frame)

• Status Field
– From receiver back to sender; error in frame
– Recognized address; accepted frame (flow control)

Fields (width in bits), in transmission order:
Start of frame (8) | Control (8) | Dest addr (48) | Src addr (48) | Body (variable) | CRC (32) | End of frame (8) | Status (24)

23

Feedback

• Error detection– Host attaches “error” marker to frame

– Sender detects error marker, resends later

• Flow control– Host attaches “my address but did not

copy”

– Sender detects problem, resends later (backs off)

24

Wireless LANs

25

Wireless LANs

• IEEE 802.11 standard– Designed for use in a small area (offices,

campuses)

• Bandwidth: 1, 2 or 11 Mbps
– Up to 54 Mbps in newer 802.11a standard

• Targets three physical media– Two spread spectrum radio (2.4GHz freq)

– One diffused infrared (10m range, 850 nm band)

26

Spread Spectrum

• Spread signal over wider freq band– Uses more frequency spectrum than

strictly necessary

• Originally designed to thwart jamming/ interference– Pseudo-random sequence, signal looks like

a noise

• Introduce pseudo-random component into signal

27

Spread Spectrum

• Sender and receiver share– Pseudorandom number generator and

the seed

• Frequency Hopping– Transmit over pseudo-random

sequence of frequencies

– 802.11 uses 79 x 1MHz-wide frequency bands

28

Direct Sequence Spread Spectrum

• For each bit, send XOR of the bit and n random bits

• Random sequence is known to sender and receiver

• n random bits are called n-bit chipping code

29

Direct Sequence Spread Spectrum

• 802.11 defines an 11-bit chipping code (83 MHz band)

Random sequence: 0100101101011001

Data stream: 1010

XOR of the two: 1011101110101001
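
The slide's example can be reproduced with a short sketch. Note that the example uses 4 chips per data bit (16 chips for 4 bits), not the 11-bit code that 802.11 defines. The function names are hypothetical, and the majority-vote decoder is an assumption about how a receiver might tolerate a few flipped chips.

```python
def dsss_encode(data_bits: str, chip_sequence: str, chips_per_bit: int) -> str:
    """XOR each data bit with its own group of chips_per_bit pseudo-random chips."""
    out = []
    for i, bit in enumerate(data_bits):
        group = chip_sequence[i * chips_per_bit:(i + 1) * chips_per_bit]
        out.extend(str(int(bit) ^ int(chip)) for chip in group)
    return "".join(out)


def dsss_decode(received: str, chip_sequence: str, chips_per_bit: int) -> str:
    """XOR with the shared chip sequence, then majority-vote within each group."""
    bits = []
    for i in range(0, len(received), chips_per_bit):
        pairs = zip(received[i:i + chips_per_bit], chip_sequence[i:i + chips_per_bit])
        ones = sum(int(r) ^ int(c) for r, c in pairs)
        bits.append("1" if 2 * ones > chips_per_bit else "0")
    return "".join(bits)


encoded = dsss_encode("1010", "0100101101011001", 4)
print(encoded)  # 1011101110101001, matching the slide
print(dsss_decode(encoded, "0100101101011001", 4))  # 1010
```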


11


2

Lecture No. 13

3

802.11 MAC: CSMA/CA

• Similar to Ethernet …

– Defer the transmission until the link becomes idle

– Back off if a collision occurs

• Is it sufficient ?

• Not all nodes are always within reach of (able to hear) each other

4

Hidden and Exposed Nodes

• Hidden nodes
– Sender thinks it's OK to send when it's not (false +ve)
– A-C and B-D are hidden nodes in the figure below

• Exposed nodes
– Sender does not send when it's OK to send (false –ve)
– B and C are exposed nodes in the figure below

A B C D
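
The hidden-node pairs named above can be derived mechanically. The sketch below assumes that the four nodes sit in a row and that each one hears only its immediate neighbors (the figure itself did not survive extraction, so this reachability map is an assumption); `hidden_pairs` is a hypothetical helper.

```python
from itertools import combinations


def hidden_pairs(hears):
    """Pairs that cannot hear each other but share a neighbor they can both reach."""
    pairs = []
    for a, b in combinations(sorted(hears), 2):
        if b not in hears[a] and hears[a] & hears[b]:
            pairs.append((a, b))
    return pairs


# Assumed reachability: A - B - C - D, each node hears only adjacent nodes.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
print(hidden_pairs(chain))  # [('A', 'C'), ('B', 'D')]
```

A and C are hidden from each other because both can reach B, so their frames can collide at B without either sender noticing.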

5

Multiple Access with Collision Avoidance (MACA)

• Sender transmits RequestToSend(RTS) frame– contains intended time to hold the

medium

• Receiver replies with ClearToSend (CTS) frame

6

Multiple Access with Collision Avoidance (MACA)

• Neighbors of the receiver …

–hear CTS: keep quiet for the intended duration (or till the ACK is heard)

• Neighbors of only the sender–hear RTS but not CTS: OK to

transmit

7

MACA for Wireless (MACAW)

• Receiver sends an ACK when it receives a frame
–Signal for neighbors to start their transmission

–Its neighbors are silent until they hear ACK

8

MACA for Wireless (MACAW)

• Collision detection–No active collision detection

–Known only if CTS or ACK is not received

–Binary exponential back off (BEB) is used in case of collision, like in Ethernet

9

802.11 - Supporting Mobility

• Suitable for an ad hoc configuration of nodes
–May or may not be able to communicate with all other nodes
–Depends upon how far apart the nodes are

10

802.11 - Supporting Mobility

• Nodes are free to move around–Set of directly reachable

nodes may change over time (partial connectivity)

–Topology changes over time

11

802.11 - Distribution System

• Nodes roam freely but operate within a structure– Tethered by wired network

infrastructure (Ethernet ?)

– Each Access Point (AP) services nodes in some region

– Each mobile node associates itself with an AP

12

802.11 - Distribution System

[Figure: access points AP-1, AP-2 and AP-3 attached to the distribution system, serving mobile nodes A through H]

13

Managing Connectivity/Roaming

• How wireless nodes select Access Point ?

• Scanning (active search for an AP)– node sends Probe frame

– all AP’s within reach reply with Probe Response frame

– node selects one AP; sends it Associate Request frame

– AP replies with Association Response

– new AP informs old AP via wired backbone

14

Managing Mobility

• Scanning is used only when node has no AP

• Consider
– Node moves out of range of its AP and initiates scanning for new AP
– Scanning takes 2+ round-trip times
– No service until new AP found

15

Managing Mobility

• However–Node was probably in range of

new AP before leaving range of old AP

–Scanning too expensive to use continuously

• How is handoff between APs managed as users move?

16

Managing Connectivity

• Active scanning: when a node joins or moves

• Passive scanning: AP periodically sends Beacon frame, advertising its capabilities

[Figure: the distribution system with AP-1, AP-2 and AP-3 and mobile nodes A through H; node C is drawn at two positions as it moves]

17

Managing Mobility

• Solution to handoff: AP’s send

periodic beacon frames

• Node may switch AP in response to beacon– if signal strength better from new AP

– using geographical information

– any other reason …

18

Managing Mobility

• 802.11 frames contain four addresses–actual sender and receiver

(source and dest.)

–AP for sender and receiver (routing in backbone)

19

Frame Format

• Control field contains three subfields:– 6-bit Type field (data, RTS, CTS, scanning);

– 1-bit ToDS; and

– 1-bit FromDS

• A single frame contains up to 2312 bytes of data

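Fields (width in bits), in transmission order:
Control (16) | Duration (16) | Addr1 (48) | Addr2 (48) | Addr3 (48) | SeqCtrl (16) | Addr4 (48) | Payload (0–18,496) | CRC (32)

Address fields in the two examples from the figure:
ToDS=0, FromDS=0: Addr1 = C, Addr2 = A
ToDS=1, FromDS=1: Addr1 = E, Addr2 = AP-3, Addr3 = AP-1, Addr4 = A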

20

Network Adaptors

21

Now you know …

• How direct link networks function

• You know what the design issues are

• Most of the issues of direct link networks are handled by the network adaptors

• Let's have a quick look at the design issues of these pieces of hardware

22

Overview

• Also called network interface card (NIC)

• Components (high-level overview)

• Options for use– Data motion

– Event notification

• Potential performance bottlenecks

• Programming device drivers

23

Typical Workstation Architecture

[Figure: CPU with cache, memory on the memory bus, and a network adaptor on the I/O bus that connects to the network. The network adaptor is typically where data link functionality is implemented.]

24

Components of a Network Adaptor

• Bus interface communicates with a specific host
– Bus defines protocol for CPU-adaptor communication

• Link interface speaks correct protocol on network
– Implemented by a chip set, in software or on FPGA

• Buffering between different speed bus and link

[Figure: a network adaptor sitting between the host I/O bus and the network, containing a bus interface and a link interface]

25

Host Perspective

• Adaptor is ultimately programmed by CPU

• Adaptor exports a Control Status Register (CSR)

• CSR is readable and writable from the CPU at some memory address

26

Host Perspective

• CSR write: CPU instructs Adaptor (e.g., transmit)

• CSR read: Adaptor informs CPU (e.g., receive error)

• Host CPU either polls the CSR or is interrupted by the adaptor
– Interrupt handler routine disables interrupts and is short

27

Data Motion Options for Network Adaptor Use

• Transfer frames between adaptor and host memory

• Programmed input/output (PIO)
– Processor itself manages each access (loads/stores)
– Faster than DMA for small amounts of data

28

Data Motion Options for Network Adaptor Use

• Direct memory access (DMA)
– Host gives the adaptor buffer descriptor lists for read/write
– Processor is not involved: free to do other things
– Can be faster than memory copy through CPU
– Start-up cost

29

Data Motion - DMA

[Figure: a buffer descriptor list with entries of 100, 1400, 1500, 1500 and 1500 bytes, each pointing to a memory buffer]

30

Gather WriteScatter Read

DMA: Scatter Read, Gather Write

Adaptor memory

Host memory

Host memory

Adaptor memory
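
Scatter read and gather write can be illustrated with plain byte strings. This is a toy model, not driver code: the function names are hypothetical, and the descriptor list is reduced to a list of buffer sizes (100, 1400 and 1500 bytes, as in the DMA slide).

```python
def scatter_read(frame: bytes, buffer_sizes):
    """Spread an incoming frame across the host buffers named by the descriptors."""
    buffers, offset = [], 0
    for size in buffer_sizes:
        if offset >= len(frame):
            break  # the whole frame has been placed; later buffers stay unused
        buffers.append(frame[offset:offset + size])
        offset += size
    if offset < len(frame):
        raise ValueError("descriptor list too small for this frame")
    return buffers


def gather_write(buffers) -> bytes:
    """Collect separate host buffers into one outgoing frame."""
    return b"".join(buffers)


frame = bytes(i % 256 for i in range(1500))
parts = scatter_read(frame, [100, 1400, 1500])
print([len(p) for p in parts])       # [100, 1400]
print(gather_write(parts) == frame)  # True
```

Splitting a frame this way lets a protocol stack keep, for example, a header in one small buffer and the payload in another without an extra copy.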

31

Data Motion - PIO

• Frames are buffered at adaptor until CPU transfers them

• Adaptors have 128-512 KB of expensive dual-port memory

Host

Adaptor

CPU

Memory

Memory

32

Data Motion

[Figure: the workstation architecture (CPU, cache, memory, I/O bus, network adaptor) with the data movement path using PIO and the data movement path using DMA marked]

33

Network Adaptor: Event Notification

• Hardware interrupts–Processor free to do other things

–Events delivered “immediately”

–State (register) save/restore expensive

–Context switches more expensive

34

Network Adaptor: Event Notification

• Event polling–Processor must periodically

check

–Events wait until next check

–No extra state changes

35

Device Drivers

• Operating system routines anchoring protocol stack to network hardware

• Initialize device, transmit frames, field interrupts

• Code contains device specific details– Difficult to read but simple in logic

36

Device Drivers

• Translates host’s message

into format expected by the device

• Sets CSR (e.g., to instruct device to transmit)

37

Programming Device Drivers

• Sample device driver in P&D sec 2.9.3

• Better device driver(s) in Linux source

• Key features– Memory-mapped control registers

– Interrupt-driven

– Handler code must execute quickly

– Logically concurrent with other processes

38

Performance Bottlenecks

• Link capacity

• Processor computing power

• I/O bus bandwidth–Overhead involved in each bus

transfer

39

Performance Bottlenecks

• Memory bus bandwidth–Memory hierarchy with cache

levels

–Memory accesses result in multiple memory copies in different buffers

40

Memory Bandwidth on an Alpha

[Figure: memory hierarchy of an Alpha workstation (CPU, L1 cache, L2 cache, crossbar, main memory, I/O bus), annotated with bandwidths of 16 Gbps, 4.48 Gbps, 956 Mbps and 800 Mbps]

41

Review Lecture 13

• 802.11 MAC CSMA/CA

• Hidden and exposed nodes

• MACA, RTS/CTS

• MACAW, ACK, BEB

• Mobility support, partial connectivity

• DS, AP, scanning, active passive

• Frame format

42

Review Lecture 13

• Network adaptors, Workstation arch

• Components of NIC, Bus + link interface, buffering, CSR

• Data movement, PIO, DMA, Scatter read, gather write

• Event notification, interrupt, polling

• Device drivers

• Performance bottlenecks

11


2

Lecture No. 14

3

What we know …

• Elements of networks: nodes and links

• Building a packet abstraction on a link

• Transmission, and units of communication data

• Detecting transmission errors

• Simulating an error-free, reliable channel

– Sliding window mechanism

• Arbitrating access to a shared medium• Design issues of direct link networks

– Functionality of network adaptors

4

What Next …?

• Moving on from direct to indirect networks

• Introducing switches which provide indirect connectivity

5

Switching and Forwarding

Outline
Store-and-Forward Switches
Bridges and Extended LANs
Cell Switching
Segmentation and Reassembly

6

Why Switching ?

• Motivation:– Why not just one direct link network ?

• Basic approach:– How can we extend the direct link

abstraction (provide illusion of one physical network) ?

• Challenges:– What problems must we address ?

7

Why Switching ?

• Examples:– Where are these issues addressed in real

networks ?

• Details of the switch:– What are the goals in design / how are

they addressed ?

• Heterogeneity– switching allows multiple physical netwrk

– but assume one switching strategy

8

After Switching – Are We Done ?

• Scale– direct link networks: O(100) hosts

– packet-switched networks: O(100,000) hosts

– Internet: O(2^(year-1974)) hosts

• Beyond the basics– quality of service

– congestion and performance analysis

– network trends and their importance

9

Connecting Large Networks

• Assert: want to use one direct link network

• Limitations of directly connected networks:– Limited Scale - number of hosts that can be

attached

• 1024 in Ethernet; only 2 in point-to-point link

– Limited geographical area that can be covered

• 2500 m in Ethernet; Point-to-point links also limited

10

Connecting Large Networks

• Alternative: provide illusion of one physical network

• Solution: Indirect connectivity by using switches– Packet switches in computer networks

control frame flow

– Multiple direct link networks, transparent to application

11

Packet Switches

• A multi-input multi-output device

• Local star topology

• Performance independent of connectivity
– (e.g. adding new host) if switch is designed with enough aggregate capacity

• Maximum degree < physical network limit

12

Build Network from Stars

• Switches (or stars) to build networks that do not behave like in a star topology

13

Forwarding

• Packets arrive at one of the several inputs and have to be forwarded / switched to one of the available outputs– Connectionless and connection-oriented approach to

determine the correct output

Which way should it go ?

First challenge: forwarding

14

Routing

• Forwarding requires information

Second challenge:

routing

How to maintain forwarding information ?

15

Contention and Congestion

• If arrival rate for a certain output is greater than the output capacity, then contention occurs

• If the arrival rate of packets is high enough to cause buffer overflow, then congestion occurs

Who goes first ?

Any one is dropped ?

16

Challenges for Packet Switching

• Efficient forwarding– Switch with several output ports– Decide which output port to use

• Routing in dynamic network– Need information for forwarding– Construct and maintain the

information

17

Challenges for Packet Switching

• Handling contention–Multiple packets destined for

one output port–Decide which packet goes first–Decide what to do with others

18

Outline

• Switches and layered perspective

• Efficient forwarding

• Asynchronous transfer mode (ATM) example

• Switch fabrics and contention

19

Network Layers and Switches

One or more nodes

within the network

User level

OS kernel

host

switchswitch

between different physical layers

transport

network

data link

physical

session

presentation

application

network

data link

physical

20

Scalable Networks

• Switch
– forwards packets from input port to output port
– port selected based on address in packet header

• Advantages
– cover large geographic area (tolerate latency)
– support large numbers of hosts (scalable bandwidth)

[Figure: a switch with T3 and STS-1 links on its input ports and on its output ports]

21

Packet Forwarding Analogy

• Process of going from one place to another

• Focus on decision process at intersections

• Path splits, how do you decide which way to go?

• How do you navigate at intersections? 3 scenarios:– from your office to home

– from home to a friend’s house (with directions)

– from Airport to the Hotel (without directions)

22

Packet Switching / Forwarding

• Forwarding: the task of selecting an appropriate output port for a packet

• Goals– Require limited information (both

packet and switch)

– Admit efficient implementation

23

Packet Switching / Forwarding

• Three approaches– Datagram or connectionless

approach

– Virtual circuit or connection-oriented approach

– Source routing

• Important notion: unique global address per host

24

Datagram Switching / Forwarding

• Every packet contains enough information– Enables switch to decide how to forward it

• Switch translates global address to output port– Maintains forwarding table for translations

• Each packet forwarded and travels independently

25

Datagram Switching / Forwarding

• No connection setup phase (connectionless model)

• Analogy:– Postal system: each packet contains

complete address for its destination

– Following signs (provided by switches) to reach destination

26

Datagram Switching

• Managing tables in large, complex networks with dynamically changing topologies is a real challenge for the routing protocol

[Figure: Switch 1, Switch 2 and Switch 3, each with ports 0 to 3, interconnecting Hosts A through H]

Forwarding table at switch 1:

Dest    Port#/Interface
A       2
B       1
C       3
D       0
E       1
…       …

27

Datagram Switching

• What happens if the destination is unknown ?

• Network discards packet

– Possibly notifying the sender (“no route to host”)
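
Datagram forwarding at a single switch amounts to a table lookup on the destination address. The sketch below uses the switch 1 entries listed on the earlier slide; the function name and the choice of returning `None` for "discard" are assumptions made for illustration.

```python
# Forwarding table at switch 1, as listed on the slide (destination -> output port).
FORWARDING_TABLE = {"A": 2, "B": 1, "C": 3, "D": 0, "E": 1}


def forward(dest, table=FORWARDING_TABLE):
    """Return the output port for dest, or None if the packet must be discarded."""
    return table.get(dest)


for dest in ["C", "B", "K"]:
    port = forward(dest)
    if port is None:
        print(f"{dest}: no route to host, packet discarded")
    else:
        print(f"{dest}: forward on port {port}")
```

Each packet is looked up independently, so two packets for the same host may take different ports if the table changes between them.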

[Figure: the same three-switch network carrying data packets addressed to B, F, A, C and K; no route is known for the packet addressed to K]

28

Datagram Model

• No round trip time delay waiting for connection setup– Host can send data anywhere, anytime as soon as it

is ready– Source has no way of knowing if the network is

capable of delivering a packet or if the destination host is even up

• Packets are treated independently– Possible to route around link and node failures

dynamically

29

Datagram Model

• Every packet carries the full address of the destination
– Overhead per packet is higher than for the connection-oriented model

– Global address to path translation requires storage

– Might not be possible to deliver packet (dest unknown)

30

Virtual Circuit Switching

• Explicit connection setup (& tear-down) phase from source to destination: connection-oriented model
– Subsequent packets follow established circuit

• Supporting “connections” in network

layer may be useful for service notions

31

Virtual Circuit Switching

• Each switch maintains a VC table (connection state) per-link or per-switch

• Analogy– Phone call: each packet follows an

established path

– Following a known route to reach the destination

32

VC Tables in VC Switching

• VC table contains information for each connection– incoming / outgoing interface (port)

– incoming / outgoing VCI (virtual circuit identifier)

• Permanent (PVC) or switched (signaled) virtual circuit (SVC)

33

VC Tables in VC Switching

• Setup message in signaling process (to create VC table) is forwarded like a datagram

• Acknowledgment of connection setup to upstream neighbors to complete signaling
– Data transfer phase can start after ACK is received

34

Signaling in VC Switching

• Setup message is forwarded from Host A to Host B

• On connection request, each switch creates an entry in VC table with a VCI for the connection

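[Figure: Host A connected through Switch 1, Switch 2 and Switch 3 to Host B; the "setup B" message travels hop by hop from Host A to Host B. VC table entries created during setup (outgoing VCI not yet known):]

In I/F   In VCI   Out I/F
2        5        1
3        9        0
2        7        3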

35

Signaling in VC Switching

• Host B accepts connection from Host A, and sends back an ACK

• In ACK, everyone communicates its choice of VCI to its upstream neighbor

[Figure: the same path from Host A to Host B; ACK 4, ACK 7, ACK 9 and ACK 5 travel back hop by hop from Host B to Host A, and the outgoing VCI column of the VC table entries is filled in with 4, 7 and 9]

36

Data Transfer in VC Switching

• Host A knows that everything is in place all the way to Host B

• In data packets, each node then puts the VCI of its downstream neighbor
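
The per-hop VCI rewriting described above can be sketched with one small table per switch. The entries come from the VC tables shown on the slide; which row belongs to which switch is inferred from the chain of VCIs (5, 9, 7, 4), so that assignment, like the names `VC_TABLES` and `send_over_circuit`, should be read as an assumption.

```python
# (incoming interface, incoming VCI) -> (outgoing interface, outgoing VCI)
VC_TABLES = {
    "Switch 1": {(2, 5): (1, 9)},
    "Switch 2": {(3, 9): (0, 7)},
    "Switch 3": {(2, 7): (3, 4)},
}

# Interface on which the packet enters each switch along the circuit from A to B.
PATH = [("Switch 1", 2), ("Switch 2", 3), ("Switch 3", 2)]


def send_over_circuit(first_vci):
    """Return the VCI carried by the packet after each switch on the path."""
    vci, seen = first_vci, []
    for switch, in_if in PATH:
        _out_if, vci = VC_TABLES[switch][(in_if, vci)]  # swap in the next hop's VCI
        seen.append(vci)
    return seen


print(send_over_circuit(5))  # [9, 7, 4]
```

Host A only needs to know VCI 5; each switch swaps in the value its downstream neighbor chose during signaling, so VCIs need to be unique per link rather than network-wide.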

[Figure: the same path from Host A to Host B; the data packet carries VCI 5, 9, 7 and 4 on successive hops. Completed VC table entries:]

In I/F   In VCI   Out I/F   Out VCI
2        5        1         9
3        9        0         7
2        7        3         4

37

Virtual Circuit Model

• Typically wait full RTT for connection setup before sending first data packet– Can not avoid failures dynamically;

must re-establish connection (old one is torn down to free storage space)

38

Virtual Circuit Model

• Each data packet contains only a small identifier, making the per-packet header overhead small– Global address to path information still

necessary

• Connection setup provides an opportunity to know network conditions and/or reserve resources.– Avoiding congestion but under-utilizing

switch

39

Review Lecture 14

• Direct to indirect nets: switches provide indirect connectivity

• Motivation, approach, challenges

• Heterogeneity, scale, QoS, congestion

• Limits: scale, area; Sol: illusion of 1 net

• Packet switches: MIMO, star

• Challenges: forwarding, routing, contention

• Forwarding: decision process at intersections, selecting appropriate output port

40

Review Lecture 14

• 3 approaches: DG, VC, source

• pkt cont enough info, switch translates addr to out port, pkt forwarded independen, no con setup ph, no RTT delay, dest unknown, full addr overhead, route around failure

• Con setup, pkt follow establi circuit, switch maintain VC table, signaling (in datagram) + ack, wait 1 RTT for data, not avoid failures, small overhead, reserve resources

1


Lecture No. 15

3

Source Routing

• Packet header contains sequence of address/ports on path from source to destination
– One direction per switch: port, next switch; (absolute)

– Switches read, use, and then discard directions

4

Source Routing

• All forwarding/topology information required to switch a packet is provided by source host

• Used in some system area networks (SANs)

• Directions may be rotated instead of discarding
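
The difference between discarding and rotating directions can be shown with a list of output ports. This is a minimal sketch: the port list `[1, 0, 3]` is hypothetical, and reading the next port from the front of the list is an arbitrary choice made for the example.

```python
def forward_strip(header):
    """Use the first direction, then discard it."""
    return header[0], header[1:]


def forward_rotate(header):
    """Use the first direction, then move it to the end of the list."""
    return header[0], header[1:] + header[:1]


header = [1, 0, 3]  # hypothetical output ports for three switches in a row
for _ in range(3):
    port, header = forward_rotate(header)
    print("send on port", port, "- header is now", header)
```

With rotation the header still holds the complete route when the packet arrives, whereas stripping makes the header shrink by one entry at every switch.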

5

Data Transfer in Source Routing

• Analogous to following directions

[Figure: Host A sends a packet through Switch 1, Switch 2 and Switch 3 to Host B; the list of ports carried in the header (shown as rotations of 0, 1 and 3) is rotated at each switch]

6

Source Routing Model

• Source host needs to know the correct and complete topology of the network– Changes must propagate to all hosts

• Packet headers may be large and variable in size: the length is unpredictable

7

Source Routing Model

• Each switch needs to correctly and efficiently manipulate the header information– Rotation or stripping of address

– Pointer to current address

• Can be used in datagram or virtual circuit networks

8

Forwarding Performance

• Assume switch is
– General-purpose workstation
– With DMA support
– Multiple network adapters (NICs)

• Switching process
– Packet arrives on NIC 1
– NIC 1 DMAs packet into memory
– CPU looks at header, decides to send on NIC 2
– NIC 2 DMAs packet into NIC 2 memory

– Packet leaves via NIC 2

9

Implementation and Performance

• Packet arriving at interface 1 has to go on interface 2

• Point of contention for packets: I/O and memory bus

[Figure: CPU and main memory connected over the I/O bus to Interface 1, Interface 2 and Interface 3]

10

Implementation and Performance

• The cost of processing small packets (parsing headers, deciding output port) dominates other restrictions
– Throughput = packets/sec x bits/packet

• Moving data from inputs to outputs in parallel may increase the aggregate throughput

• Potential bottlenecks– I/O bus bandwidth– Memory bus bandwidth– Processor computing power
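
The throughput formula above shows why small packets are the hard case. In the sketch below the forwarding rate of 15,000 packets per second is a made-up figure used only to illustrate the formula, and the function name is hypothetical.

```python
def throughput_bps(packets_per_sec: int, bytes_per_packet: int) -> int:
    """Throughput = packets/sec x bits/packet."""
    return packets_per_sec * bytes_per_packet * 8


RATE = 15_000  # assumed per-packet processing limit of the switch (packets/sec)
for size in (64, 1500):
    print(f"{size}-byte packets: {throughput_bps(RATE, size) / 1e6:.2f} Mbps")
```

At the same packet rate, 64-byte packets give 7.68 Mbps while 1500-byte packets give 180 Mbps, so a switch limited by per-packet processing looks much slower when the traffic consists of small packets.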

11

Bridges and Extended LANs

12

Building Extended LANs

• Traditional LAN
– Shared medium (e.g., Ethernet)
– Cheap, easy to administer
– Supports broadcast traffic

• Problem
– Want to scale LAN concept
  • Larger geographic area (> O(1 km))
  • More hosts (> O(100))
– But retain LAN-like functionality

• Solution: bridges

13

Bridges

• Connect two or more LANs with a bridge
– Transparently extends a LAN over multiple networks
– Accept & forward strategy (in promiscuous mode)
– Level 2 connection (does not add packet header)

[Figure: a bridge joining two LANs; hosts A, B and C are on the LAN at port 1, hosts X, Y and Z are on the LAN at port 2]

14

Bridges vs. Switches

• Switch
– Receive frame on input port
– Translate address to output port
– Forward frame

• Bridge
– Connect shared media
– All ports bidirectional
– Repeat subset of traffic
  • Receive frame on one port
  • Send on all other ports

15

Uses and Limitations of Bridges

• Extend LAN concept

• Limited scalability– To O(1,000) hosts

– Not to global networks

• Not heterogeneous– Some use of address, but

– No translation between frame formats

16

Learning Bridges

• Trivial algorithm– Forward all frames on all (other) LAN’s

– Potentially heavy traffic & processing overhead

• Optimize by using address information– “Learn” which hosts live on which LAN

– Maintain forwarding table

– Only forward when necessary (dest. not on same LAN)

– Reduces bridge workload
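
A learning bridge can be sketched as a table keyed by source address. This is a simplified model under stated assumptions: the class and method names are hypothetical, entries never time out, and broadcast frames are not treated specially.

```python
class LearningBridge:
    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}  # host address -> port on which it was last seen

    def receive(self, src, dst, in_port):
        """Learn from the source address, then return the ports to forward on."""
        self.table[src] = in_port
        out_port = self.table.get(dst)
        if out_port is None:
            return sorted(self.ports - {in_port})  # unknown destination: flood
        if out_port == in_port:
            return []  # destination is on the same LAN: do not forward
        return [out_port]


bridge = LearningBridge([1, 2])
print(bridge.receive("A", "X", in_port=1))  # [2]: X not yet known, flood
print(bridge.receive("X", "A", in_port=2))  # [1]: A was learned on port 1
print(bridge.receive("B", "A", in_port=1))  # []: A and B share the LAN on port 1
```

Because unknown destinations are flooded, the table is purely an optimization: an empty table still delivers every frame, just with more traffic.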

17

Learning Bridges

• Learn table entries based on source address– Timeout entries to allow movement of hosts

• Table is an optimization; need not be complete

• Always forward broadcast frames

• Uses datagram or connectionless forwarding

[Figure: a bridge joining two LANs; hosts A, B and C are on the LAN at port 1, hosts X, Y and Z are on the LAN at port 2]

Host    Port
A       1
B       1
C       1
X       2
Y       2
Z       2

18

Learning Bridges

• Problem– Redundancy (desirable to handle failures, but …)

– Makes extended LAN structure cyclic– Frames may cycle forever

• Solution: spanning tree

[Figure: an extended LAN with loops: bridges B1 through B7 interconnecting LANs A through K]

19

Spanning Tree

• Subset of forwarding possibilities

• All LANs reachable, but

• Acyclic

• Bridges run a distributed algorithm to calculate the spanning tree
– Select which bridges actively forward
– Developed by Radia Perlman of DEC
– Now IEEE 802.1 specification
– Reconfigurable algorithm

20

Spanning Tree Concept

• LAN’s and bridges make a bipartite graph

• Ports are edges connecting LAN’s to bridges

• Spanning tree required– Connect all LAN’s: all vertices of graph are covered

– Can leave out bridges: all edges may not be covered

21

Spanning Tree Algorithm

• Each bridge has a unique, totally-ordered identifier

• Select bridge with lowest ID as root bridge

22


• Each bridge determines
– Direction of shortest path to root (preferred port)
– For each connected LAN, is it the designated bridge?
  • Select bridge on each LAN closest to root as designated bridge
  • Use ID (lowest) to break ties
– Ports connecting LANs to designated bridges are called designated ports

23


• All designated bridges forward frames– On all designated ports

– On preferred port (path leading to root)

[Figure: the extended LAN (bridges B1 through B7, LANs A through K) with the spanning tree marked; the legend identifies LANs, designated bridges, designated ports and preferred ports]

24

Distributed Spanning Tree Algorithm

• Bridges exchange configuration messages– ID for bridge sending the message

– ID for what the sending bridge believes to be root bridge

– Distance (hops) from sending bridge to root bridge

25


• Initially, each bridge believes it is the root
– Sends a configuration message, and checks if any received message is better than the current best message

• Each bridge records current best configuration message for each port
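
One way to make "better" concrete follows from the earlier slides: a lower root ID wins, then a shorter distance to the root, then a lower sender ID. The sketch below encodes that ordering as a tuple comparison; the names `ConfigMsg`, `is_better` and `message_to_send` are hypothetical, and the field order is chosen for comparison rather than to match any wire format.

```python
from typing import NamedTuple


class ConfigMsg(NamedTuple):
    root_id: int    # ID of the bridge believed to be root
    distance: int   # hops from the sending bridge to that root
    sender_id: int  # ID of the bridge sending the message


def is_better(candidate: ConfigMsg, current: ConfigMsg) -> bool:
    """Lower root ID, then shorter distance, then lower sender ID."""
    return candidate < current  # tuples compare field by field, left to right


def message_to_send(bridge_id: int, best: ConfigMsg) -> ConfigMsg:
    """What a bridge advertises after accepting `best`: one hop further from root."""
    return ConfigMsg(best.root_id, best.distance + 1, bridge_id)


own = ConfigMsg(root_id=3, distance=0, sender_id=3)       # B3 starts as its own root
received = ConfigMsg(root_id=1, distance=1, sender_id=2)  # B2 says B1 is root
if is_better(received, own):
    print(message_to_send(3, received))  # B3 now advertises root 1 at distance 2
```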

26


• Bridges forward configuration messages outward from root bridge i.e., on all designated ports

• Bridge assumes it is designated bridge for a LAN until it learns otherwise

[Figure: the extended LAN with bridges B1 through B7 interconnecting LANs A through K]

27

Algorithm Details

• In steady state, only designated bridges forward configuration messages– Outward from root bridge, to all

designated ports

– Until they learn they are not designated bridge

28

Algorithm Details

• In steady state, only root generates configuration messages periodically– Timeout restarts algorithm (claiming “I

am root …”)

• Although the algorithm is reconfigurable, it is not possible to forward frames over alternative paths

29

Broadcast and Multicast

• Forward all broadcast/multicast frames to all preferred and designated ports– Current practice– Lets hosts decide whether or not to

accept frame

30

Broadcast and Multicast

• Alternative: extend learning to handle groups– Learn when no group members

downstream

– Group members periodically identify themselves

– Accomplished by having each group member send a frame to the bridge with group address in source field

31

Limitations of Bridges

• Do not scale– Broadcast does not scale

– Spanning tree algorithm does not scale

• Do not accommodate heterogeneity– Only supports networks with same

address formats

32

Limitations of Bridges

• Caution: beware of transparency– Frame drop because of bridge

congestion

– Large and variable latency between two hosts

– Frames may reorder in extended LANs

1


Lecture No. 16

3

Where we are now …

• Understand different ways to move through network (forwarding)– Read signs at each switch (datagram)

– Follow a known path (virtual circuit)

– Carry instructions (source routing)

• Bridge approach to extending LAN concept

4

Where we are now …

• Next–Example of a real network

(ATM)

–How switches are built and contention within switches

5

ATM (Asynchronous Transfer Mode)

• Defined by ATM Forum (formed in Oct. 1991)– Telephone industry (link providers to

build networks)

– Data network industry

• High speed switching technology: right thing at right place at right time ? ? ?

6


• Common in WANs, can also be used in LANs– Competing technology with Ethernet, but

areas of application only partially overlap

• Connection-oriented packet-switched network– Virtual-circuit routing

• Typically implemented on SONET (other physical layers possible)

7


• Signaling (connection setup) Protocol: Q.2931– Discovering routes and allocating

resources at switches

• ATM address format– E.164 and NSAP (Network Service

Access Point)

– Different from MAC addresses

8

ATM Signaling

• Connection setup called signaling (standard Q.2931)

• Route discovery, resource resv, QoS, ...• Send through network

– Request setup circuit– Send setup frame on setup circuit

• Establish locally– No intermediate switch involvement– Requires pre-established virtual path

9

Cell Switching (ATM)

• Fixed length (53 bytes) frames are called cells
– 5-byte header (including a 1-byte CRC-8) + 48-byte payload

• Standard defines 3 layers (5 sublayers)– Layers interface to physical media and

to higher layers (e.g., encapsulating variable-length frames)

10

Cell Switching (ATM)

• 2-level connection hierarchy–Virtual circuits

–Virtual paths

•Bundles of virtual circuits

•Travel along common route

•Reduces forwarding information

11

Why Hierarchical Connections ?

• Simpler ...

• Setup
– New virtual circuits follow virtual path route

• Forwarding
– Virtual path identifier (VPI) used between switches (smaller forwarding table)
– VCI used for last hop (to host)

• Rerouting around failures
– Change virtual path once vs. 64k virtual circuits

12

Variable-length Frames …

• Small data is sent in a minimum-sized packet
– No need for extraneous padding

• Large file is sent by breaking it into many maximum-sized packets
– Low overhead (header to data bytes ratio), increasing bandwidth efficiency
– Minimize total number of packets sent
• Minimize per-packet processing

13

Drawbacks of Fixed-length Frames

• No optimally good fixed length

• Higher overhead for large frames
– Must be fragmented
– One header per fragment (less efficient)

• Low utilization for small frames
– Must be padded

14

Then Why Fixed-length Frames ?

• Require simpler hardware …
– Facilitates the implementation of hardware switches
– Helpful in building fast, highly scalable switches

• Easier to build hardware doing a simple job: processing known-length frames

• Parallelism in processing stages: lots of switches doing the same thing in parallel
– Analogy: processor instruction pipeline with variable-length stages

15

Why Short Frames ?
• Better behavior of non-preemptive queues

• Reduced granularity of preemption
– High-priority frame may wait for max-size frame
• Long frame (4kB) admits long wait
• Short frame limits wait
– Limits end-to-end jitter, or variance in latency

16

Why Short Frames ?

• Shorter queues
– Switches typically store and forward packets
– Cannot send until full packet arrives
– Short frames (fragmentation) allow first part to be sent while remainder arrives

17

Queuing Behavior Examples

• Consider 4kB vs. 53B frames, 100 Mbps link

• Preemption: high-priority frame arrives just as switch starts sending low-priority frame
– 4kB: wait for 4096 x 8 / 100 = 327.68 microseconds
– 53B: wait for 53 x 8 / 100 = 4.24 microseconds
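The preemption arithmetic can be checked with a short sketch. The function name `transmit_time_us` is only an illustration, not something from the slides; it relies on the fact that bits divided by Mbps gives microseconds.

```python
def transmit_time_us(frame_bytes, link_mbps):
    """Time to put one frame on the wire, in microseconds (bits / Mbps = us)."""
    return frame_bytes * 8 / link_mbps


LINK_MBPS = 100  # link speed used in the example above

for size in (4096, 53):
    wait = transmit_time_us(size, LINK_MBPS)
    print(f"{size:5d}-byte frame: high-priority frame waits up to {wait:.2f} us")
```

The worst-case wait scales linearly with the frame size, which is why the 53-byte cell bounds the wait to a few microseconds on this link.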

18

Queuing Behavior Examples

• Shorter queues: two chunks (or frames) arrive simultaneously at time 0
– 4kB: link is idle until all data arrive at time 327.68 us; 8 kB left to send
– 53B: link nearly fully utilized (waits 4.24 us); at time 327.68 us, roughly 4 kB left to send

19

Why 53-byte Frames ?

• Telephone community wish: carry voice effectively

• Demands that ATM improve latency for audio data
– Voice encoded at 64 kbps: 8-bit samples at 8 kHz
– Need a full cell’s worth of samples before sending the cell
• 1-Kbyte cells -> 125 ms per cell (human detectable)
• 53-byte cells imply 6 ms of data (48-byte payload)
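A minimal sketch of the cell-fill (packetization) delay for 64 kbps voice, assuming one byte per sample at 8000 samples per second as stated above. The helper name `fill_delay_ms` is hypothetical.

```python
VOICE_BYTES_PER_SEC = 8000  # 8-bit samples at 8 kHz = 64 kbps


def fill_delay_ms(payload_bytes):
    """Milliseconds of speech needed to fill one payload of the given size."""
    return payload_bytes * 1000 / VOICE_BYTES_PER_SEC


for payload in (1000, 48):
    print(f"{payload:4d}-byte payload: {fill_delay_ms(payload):.1f} ms to fill")
```

The 6 ms figure comes from the 48-byte payload of a 53-byte cell; the 5 header bytes carry no voice samples.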

20

Why 53-byte Frames ?

– Smaller latency implies no need for echo cancellers
– Audio reconstruction
• Expect low rate of cell loss; can interpolate loss (6 ms)

• Compromise: 48 bytes = (32 (Europe) + 64 (US)) / 2

21

ATM Cell Format
• User-Network Interface (UNI)
– Host-to-switch format
– GFC: Generic Flow Control (still being defined)
– VCI/VPI: Virtual Circuit/Path Identifier
– Type: management, congestion control, AAL5 (later)
– CLP: Cell Loss Priority
– HEC: Header Error Check (CRC-8)

• Network-Network Interface (NNI)
– Switch-to-switch format
– GFC becomes part of VPI field

Field:         GFC  VPI  VCI  Type  CLP  HEC (CRC-8)  payload
Width (bits):  4    8    16   3     1    8            384 (48 bytes)
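The UNI header layout can be illustrated by packing the six fields into 5 bytes. This is a sketch under the standard field widths (GFC 4, VPI 8, VCI 16, Type 3, CLP 1, HEC 8 bits); the function names are hypothetical and the HEC value is passed in rather than computed.

```python
# Field name -> width in bits, in transmission order (UNI format).
UNI_FIELDS = (("gfc", 4), ("vpi", 8), ("vci", 16), ("ptype", 3), ("clp", 1), ("hec", 8))


def pack_uni_header(**values):
    """Pack the UNI header fields (40 bits in total) into 5 bytes."""
    word = 0
    for name, bits in UNI_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
        word = (word << bits) | value
    return word.to_bytes(5, "big")


def unpack_uni_header(header):
    """Split a 5-byte UNI header back into its fields."""
    word = int.from_bytes(header, "big")
    fields = {}
    for name, bits in reversed(UNI_FIELDS):  # peel fields off the low end
        fields[name] = word & ((1 << bits) - 1)
        word >>= bits
    return fields


hdr = pack_uni_header(gfc=0, vpi=5, vci=100, ptype=0, clp=1, hec=0xAB)
print(hdr.hex(), unpack_uni_header(hdr))
```

For the NNI format, the same approach would apply with the GFC bits folded into a 12-bit VPI.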

22

Segmentation and Reassembly

• Variable-length packets passed to ATM are often larger than 48 bytes
– May not fit in the ATM payload

• Fragmentation is required at the source
– High-level message into low-level packets

23

Segmentation and Reassembly

• Destination reassembles the fragments
– Transforms fragments back into the message

• ATM calls this procedure Segmentation and Reassembly (SAR)

24

Segmentation and Reassembly
• ATM Adaptation Layer (AAL)
– Application to ATM cell mapping
– AAL header contains information for reassembly
– AAL1, AAL2 for applications needing guaranteed rate
– AAL3/4 designed for variable-length packet data
– AAL5 is an alternative standard for packet data

[Figure: AAL layered above ATM at each end of the connection]

25

ATM Layers
• ATM Adaptation Layer (AAL)
– Convergence Sublayer (CS) supports different application service models
– Segmentation and Reassembly (SAR) supports variable-length frames

• ATM Layer
– Handles virtual circuits, cell header generation, flow control

• Physical layer
– Transmission Convergence (TC) handles error detection, framing
– Physical medium dependent (PMD) sublayer handles encoding

[Figure: layer stack, top to bottom: AAL (CS, SAR), ATM, PHY (TC, PMD)]

26

AAL 3/4
• Provides information to allow variable-size packets to be sent in fixed-size ATM cells
• Convergence Sublayer Protocol Data Unit (CS-PDU)
– CPI: Common Part Indicator (version field)
– Btag/Etag: beginning and ending tags (same)
– BAsize: hint on reassembly buffer space to allocate
– Length: size of whole PDU

• Segmented into cells: header/trailer + 44-byte data

Field:         CPI  Btag  BAsize  payload  Pad   0  Etag  Length
Width (bits):  8    8     16      < 64 KB  0-24  8  8     16

27

ATM Cell Format for AAL 3/4

• Type (is-start? and is-end? bits)
– BOM (10): beginning of message
– COM (00): continuation of message
– EOM (01): end of message
– SSM (11): single-segment message

• SEQ: sequence number (for cell loss/reordering)
• MID: multiplexing ID (mux onto virtual circuits)
• Length: number of bytes of PDU in this cell

Field:         ATM header  type  seq  MID  payload         length  CRC-10
Width (bits):  40          2     4    10   352 (44 bytes)  6       10

28

Encapsulation and Segmentation for AAL3/4

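[Figure: user data (< 64 KB) is wrapped in a CS-PDU header (4 bytes) and a CS-PDU trailer (4-7 bytes, including padding), then cut into 44-byte pieces (the last piece may hold < 44 bytes plus padding); each piece becomes a cell payload carried between an AAL header and an AAL trailer behind the ATM header]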

29

AAL 3/4 Summary
• Many factors limit the utilization to at most 83% (44/53)
• Only 44 bytes of data in 53 bytes of cell
• CS-PDU encapsulation further reduces the efficiency (header-to-payload ratio)
• Partial filling of last cell also impacts the effective utilization
• Lot of overhead to perform the simple function of segmentation and reassembly …

30

AAL 5
• AAL 5 designed as replacement for AAL3/4

• No PDU information in ATM header
– Only uses 1 bit of ATM header to mark end of PDU
– Does not require additional per-cell headers/trailers
– No additional layer of multiplexing in a single VC

• CS-PDU contains data with only 8-byte trailer
– Uses stronger error detection at PDU level (CRC-32)
– Protection against lost, corrupt and misordered cells is provided by CS-PDU

31

AAL 5 CS-PDU

• CS-PDU Format
– Pad so trailer always falls at the end of ATM cell
– Length: size of PDU (data only)
– CRC-32 (detects missing or misordered cells)

• Cell Format
– End-of-PDU bit in Type field of ATM header

Field:  data     pad         reserved  length   CRC-32
Width:  < 64 KB  0-47 bytes  2 bytes   2 bytes  32 bits
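A small sketch of the AAL 5 padding rule: the pad is chosen so that data + pad + 8-byte trailer is a whole number of 48-byte cell payloads. The function name `aal5_segment` and the 1400-byte example are illustrative assumptions.

```python
CELL_PAYLOAD = 48  # bytes of payload per ATM cell
CELL_SIZE = 53     # payload plus 5-byte ATM header
TRAILER = 8        # AAL 5 CS-PDU trailer: reserved + length + CRC-32


def aal5_segment(data_len):
    """Return (pad_bytes, cell_count) for a CS-PDU carrying data_len bytes."""
    pad = -(data_len + TRAILER) % CELL_PAYLOAD  # 0..47 bytes
    cells = (data_len + pad + TRAILER) // CELL_PAYLOAD
    return pad, cells


for n in (40, 41, 1400):
    pad, cells = aal5_segment(n)
    efficiency = n / (cells * CELL_SIZE)
    print(f"{n:5d} data bytes: pad={pad:2d}, cells={cells:2d}, efficiency={efficiency:.0%}")
```

Note how one extra data byte (40 to 41) forces a second cell that is almost entirely padding, which is the partial-fill cost that small PDUs pay.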

32

Encapsulation and Segmentation for AAL 5

[Figure: user data followed by padding and the CS-PDU trailer is cut into 48-byte pieces; each piece becomes the cell payload behind an ATM header]


Lecture No. 17

3

Virtual Paths with ATM
• Two-level hierarchy of virtual connections: 8-bit VPI and 16-bit VCI
– Switches in the public network use 8-bit VPI
– Corporate sites use full 24-bit address (VPI + VCI)
– Much less connection-state info in switches
– Virtual path: fat pipe with bundle of virtual circuits

[Figure: Network A and Network B connected through a public network]

4

Physical Layers for ATM

• ATM may run over several physical media

• ATM was assumed to run over SONET but both are entirely separable entities

• ATM frame boundaries must be correctly identified
– Successive 53-byte ATM frames in payload
– SONET overhead byte points to the payload
– Another way is to calculate CRC (5th byte of the cell)

5

ATM and LANs

• ATM grew out of the telephone community and was later used for computer communication

• Significant advantage of performance and better scalability of switched over shared media

• No distance limitation in ATM making it a good choice for high-performance LAN backbone

• Point-to-point, long distance Gigabit Ethernet is a competing technology with ATM

6

ATM as a LAN Backbone

• Different from traditional LANs; no native support for broadcast or multicast

[Figure: hosts H1-H7 and Ethernet switches E1-E3 interconnected through an ATM switch; legend: ATM links, Ethernet links, Ethernet switch, ATM switch, ATM-attached host]

7

ATM in a LAN

• How to broadcast to all nodes on an ATM LAN ?
– Without knowing all the addresses
– Without setting up VC to all of them

8

ATM in a LAN
• Two solutions
– Redesign protocols that assume LAN properties different from what ATM can provide (e.g., ATMARP)
– Make ATM behave like shared media, without losing the performance advantage of switched media (e.g., LANE)
• ATM address is different from a unique 48-bit MAC address

9

Shared Ethernet Emulation with LANE

• All hosts think they are on the same Ethernet

[Figure: hosts (H) with LANE / Ethernet adaptor cards attached to an Ethernet switch and an ATM switch]

10

LAN Emulation (LANE) with ATM

• Transparent shared media emulation of ATM

• Adds (not changes) functionality to ATM switches

• Each device needs a global MAC address, as well as an ATM address to establish a VC

11

LAN Emulation (LANE) with ATM

• Devices connect as LAN Emulation Clients (LEC)

• LANE provides Ethernet-like interface to LECs

• Similar solutions for other networks: VPNs on WANs, VLANs on large, switched Ethernets

12

ATM / LANE Protocol Layers

[Figure: protocol stacks for Host, Switch, Host. Each host runs higher-layer protocols (IP, ARP, ...) over an Ethernet-like interface provided by Signalling + LANE, over AAL5, ATM, and PHY; the switch runs ATM over PHY on each side]

13

Clients and Servers in LANE

• LAN Emulation Client (LEC)
– Host, bridge, router or switch

• LAN Emulation Server (LES)
– Maintains clients’ MAC and ATM addresses
– Maintains ATM address of BUS

14


• LAN Emulation Configuration Server (LECS)
– High-level network management when LEC starts up
– Reachable by preset VC (recall well-known server port numbers)
– Maintains mapping of ATM address to LANE type

15


• Broadcast and Unknown Server (BUS)
– Emulates broadcast and multicast; critical to LANE
– Uses point-to-multipoint VC with all clients

• Servers physically located in one or more devices

[Figure: hosts H1 and H2 on an ATM network with the LECS, LES, and BUS; legend: point-to-point VC, point-to-multipoint VC]

16

LANE Registration

1. Client contacts LECS on predefined VC, and sends ATM address to it
2. LECS returns LAN type, MTU and ATM address of LES
3. Client signals connection to LES, and registers MAC and ATM addresses with LES
4. LES returns ATM address of BUS
5. Client signals connection to BUS
6. BUS adds client to point-to-multipoint VC

[Figure: ATM network with LECS, LES, BUS and hosts H1, H2, H3]

17

LANE Circuit Setup

1. Client (H1) knows destination MAC address of receiver (H2)

2. Client (H1) sends 1st packet to BUS

3. BUS sends address resolution request to LES

4. LES returns ATM address to client (H1)

5. Client (H1) signals connection to H2 for subsequent packets

[Figure: ATM network with LECS, LES, BUS and hosts H1, H2, H3]

Switches: The Intersections

19

The Intersections

Design intersection to accommodate traffic flows

[Figure: road map of intersections: Rawalpindi Saddar, Airport, Pir Wadhai, Islamabad Zero Point, Rawal Dam, Faizabad (Faizabad Flyover), Ayub Park]

20

Contention in Switches
• Some packets destined for same output
– One goes first
– Others delayed or dropped

• Delaying packets requires buffering
– Finite capacity, some packets must still drop
– At inputs
• Increases/adds false contention
• Sometimes necessary
– At outputs
– Can also exert “backpressure”

21

Output Buffering

[Figure: airline check-in analogy for a 1x6 switch with output buffering: standard check-in lines and a customer service line; you are trying to check in, Mr. X is writing a complaint letter, and Mr. A is waiting to claim a refund of Rs. 100]

22

Input Buffering: Head-of-line Blocking

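[Figure: the same 1x6 switch analogy with input buffering: you are trying to check in, but are held up in line with Mr. X (writing a complaint letter) and Mr. A (waiting to claim a refund of Rs. 100) for customer service, while check-in agents are standing by]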

23

Backpressure

[Figure: the same 1x6 switch analogy with backpressure: customer service says “no more, please”]

• Propagation delay requires that the switch exert backpressure before its buffer is full; backpressure is thus used in networks with small propagation delay

24

Backpressure

• Propagation delay requires that switch 2 exert backpressure at high-water mark rather than when buffer completely full

• It is thus typically only used in networks with small propagation delays (e.g., switch fabrics)

Switch 1 Switch 2

“no more, please”

25

Switching Hardware

• Multi-input multi-output device, getting packets from inputs to the outputs as fast as possible

• Performance of a switch is limited by I/O bus bandwidth (each packet traverses the bus twice)
– A 1 Gbps I/O bus can support ten T3 (45 Mbps) links, three STS-3 (155 Mbps) links, and not even one STS-12 (622 Mbps) link

• Success or failure of a new protocol depends on whether it takes advantage of the switch’s capabilities

26

Switching Fabric
• Special-purpose (switching) hardware

• General problem
– Connect N inputs to M outputs (NxM switch)
– Often N=M (bidirectional links)

• Design goals
– High throughput: want aggregate close to MIN (sum of inputs, sum of outputs)
– Avoid contention (fabric faster than ports)
– Good scalability: linear size/cost growth in N/M

27

Switching Fabric and Ports

[Figure: four input ports connected through the switch fabric to four output ports, with an “avoid contention here” marker]

28

Switch: Fabric and Ports
Fabric has a job to deliver packets to the right output

[Figure: four input ports connected to four output ports by the switch fabric (with small internal buffering)]

29

Ports and Fabric

• Ports deal with the complexity of the real world
– Virtual circuit management is handled in ports
– Determine output port using forwarding tables

• Input port is the first performance bottleneck
– Header processing and handing packet to fabric

30

Ports and Fabric

• Buffering is required at ports
– Buffer management has profound impact on performance
– Internal (in fabric) or output buffering is normally used

• Fabric: simply move packets from inputs to outputs

31

Design Goals - Throughput

• An n x m switch can provide max ideal throughput of S = S1 + S2 + ……… + Sn

– Only possible if traffic at inputs is evenly distributed across all outputs

– Sustained throughput higher than link speed of output is not possible

32

Design Goals - Throughput
• Variable size packets affect performance
– Some operations have constant overhead per packet
– Switch performs differently for different sizes of packets
– Packet per second (pps) rate is also important

• Most switches are subject to internal contention
– Determine performance under different traffic loads

33

Design Goals - Throughput

• Traffic models are important to throughput
– Arrival time, output port, packet length
– Extremely difficult to achieve accurate models
– Traffic modeling very successful in telephony

• Designers now expect high range of throughputs
– In order to handle a steady stream of 64-byte packets, a 40 Gbps switch needs a rate of 78M pps !!!
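The 78M pps figure follows from dividing the aggregate bit rate by the packet size in bits. A quick sketch (the function name is illustrative, and framing overhead between packets is ignored as an assumption):

```python
def packets_per_second(aggregate_bps, packet_bytes):
    """Packet rate needed to keep the given bit rate busy with fixed-size packets."""
    return aggregate_bps / (packet_bytes * 8)


for size in (64, 1500):
    rate = packets_per_second(40e9, size)
    print(f"{size:5d}-byte packets at 40 Gbps: {rate / 1e6:.1f} M pps")
```

The same switch needs far fewer header lookups per second for large packets, which is why small-packet pps is the stress case.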

34

Design Goals - Scalability

• Cost of hardware rises fast with increasing number of ports n
– Adding ports increases hardware & design complexity
– Scalability in terms of rate of increase in cost

• Design complexity determines maximum switch size
– Switch designs run into problems at some maximum number of inputs and outputs

35

Switch Performance
• Avoid contention with buffering
– Use output buffering when possible
– Apply backpressure through fabric
– Input buffering with “peeking” (non-FIFO semantics) to reduce head-of-line blocking problems
– Drop packets if input buffer overflows

• Good scalability
– O(N) ports
– Port design complexity O(N) gives O(N^2) for switch
– Port design complexity O(1) gives O(N) for switch

36

Crossbar (“Perfect”) Switch

• Problem: hardware scales as O(N^2)

37

Knockout Switch: Pick L from N

• Problem: what if more than L arrive

[Figure: knockout concentrator with numbered inputs and outputs; legend: 2x2 random selector, delay unit, 8-to-4 concentrator]

38

Shared Memory Switch

[Figure: inputs feed a mux (write control) into buffer memory, which is read out through a demux (read control) to the outputs]

39

Self-Routing Fabrics
• Use source routing on “network” within switch

• Input port attaches output port number as header

• Fabric routes packet based on output port

• Types
– Banyan network
– Batcher-Banyan network
– Sunshine switch

40

Banyan Network

• No contention if inputs are sorted and unique
• Stages examine the output port number bit by bit, from MSB to LSB
– Sends 0 bit up
– Sends 1 bit down

41

Banyan Network

• Sends 0 bit up, 1 bit down

[Figure: packets addressed to outputs 001, 011, 110, and 111 routed through the Banyan stages, from MSB to LSB]

42

Batcher (Merge Sort) Network

Routing packets through a Batcher network

• Batcher-Banyan Network
– Attach the two back-to-back
– Arbitrary unique permutations routed without contention

[Figure: Batcher network with Sort, Merge, and Merge stages putting packets for outputs 1, 3, 6, and 7 into sorted order]

43

Batcher-Banyan Network

[Figure: Batcher-Banyan network; legend: one element type sends 1 bit up and 0 bit down, the other sends 0 bit up and 1 bit down]

44

Sunshine Switch

• Like a Knockout switch, except

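• Recirculates overflow packets, i.e., when more than L arrive in one cycle

[Figure: n inputs plus k recirculated packets enter a Batcher network (n + k wide), followed by a trap (marks overflow packets), a selector, and l Banyans feeding the n outputs; k overflow packets return to the input through a delay]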


Lecture No. 18

3

Internet Protocol (IP)
• Network protocol for the Internet
• Operates on all hosts and routers (routers connect distinct networks into the Internet)

[Figure: protocol graph with FTP, HTTP, NV, TFTP, … above TCP and UDP, above IP, above FDDI, Ethernet, ATM, …]

4

Internetwork

• Concatenation of networks

• Protocol stack

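[Figure: internetwork of Network 1 (Ethernet), Network 2 (point-to-point), Network 3 (FDDI), and Network 4 (Ethernet), joined by routers R1, R2, R3 and serving hosts H1-H8. Protocol stacks along the path from H1 to H8: H1 (TCP, IP, ETH), R1 (IP over ETH and PPP), R2 (IP over PPP and FDDI), R3 (IP over FDDI and ETH), H8 (TCP, IP, ETH)]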

5

IP Addresses

• Examples
– 18.10.5.22: host in class A network (MIT)
– 130.126.143.254: host in class B network (UIUC)
– 192.12.70.111: host in class C network

• More recent classes
– Multicast (class D): starts with 1110
– Future expansions (class E): starts with 1111

Class  Leading bits  Network                   Host
A      0             7 bits (126 nets)         24 bits (16 million hosts)
B      1 0           14 bits (16k nets)        16 bits (64K hosts)
C      1 1 0         21 bits (2 million nets)  8 bits (256)
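The class of an address is determined by its leading bits, which is equivalent to a range check on the first octet. A minimal sketch (the function name `address_class` is illustrative):

```python
def address_class(dotted):
    """Classify an IPv4 address by the leading bits of its first octet."""
    first = int(dotted.split(".")[0])
    if first < 128:   # leading bit 0
        return "A"
    if first < 192:   # leading bits 10
        return "B"
    if first < 224:   # leading bits 110
        return "C"
    if first < 240:   # leading bits 1110 (multicast)
        return "D"
    return "E"        # leading bits 1111 (future expansion)


for addr in ("18.10.5.22", "130.126.143.254", "192.12.70.111"):
    print(addr, "-> class", address_class(addr))
```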

6

Datagram Format

• 4-bit version (4 for IPv4, 6 for IPv6)

• 4-bit header length (in words, minimum of 5)

• 8-bit type of service (TOS); more or less unused

• 16-bit datagram length (in bytes)

• 8-bit protocol (e.g., TCP=6 or UDP=17)

Bit:  0         4         8                  16      19                 31
      Version | HLen    | TOS              | Length
      Ident                                | Flags | Offset
      TTL               | Protocol         | Checksum
      SourceAddr
      DestinationAddr
      Options (variable)                            | Pad (variable)
      Data

7

Datagram Format

• Fragmentation support
– 16-bit packet ID (identifies packet fragments)
– 3-bit flags; one bit tells whether more fragments follow (it is cleared on the last fragment)
– 13-bit fragment offset into packet (in 8-byte words)

• 8-bit “time-to-live” (TTL); a hop count until forced destruction of packet



8

Datagram Format
• 16-bit IP checksum on header
• 32-bit source IP address
• 32-bit destination IP address
• Options and padding (variable length)
– Source-based routing (typically disabled)
– Record route
– Several others
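The fixed 20-byte part of the datagram header can be decoded with Python's `struct` module. This is a sketch only: the sample header is hypothetical, its checksum is left at 0 rather than computed, and options are ignored.

```python
import socket
import struct

# Version/HLen, TOS, Length, Ident, Flags/Offset, TTL, Protocol, Checksum, Src, Dst
IPV4_FIXED_HEADER = "!BBHHHBBH4s4s"  # 20 bytes, network byte order


def parse_ipv4_header(raw):
    """Decode the fixed 20-byte part of an IPv4 header into a dict."""
    (ver_hlen, tos, length, ident, flags_offset,
     ttl, protocol, checksum, src, dst) = struct.unpack(IPV4_FIXED_HEADER, raw[:20])
    return {
        "version": ver_hlen >> 4,
        "hlen_words": ver_hlen & 0xF,            # header length in 32-bit words
        "tos": tos,
        "length": length,                        # whole datagram, in bytes
        "ident": ident,
        "more_fragments": bool((flags_offset >> 13) & 0b001),
        "offset_bytes": (flags_offset & 0x1FFF) * 8,  # offset is in 8-byte units
        "ttl": ttl,
        "protocol": protocol,                    # e.g. TCP=6, UDP=17
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Hypothetical fragment: 512 data bytes at offset 64 (x 8 bytes), more fragments follow.
sample = struct.pack(IPV4_FIXED_HEADER, (4 << 4) | 5, 0, 532, 0x1234,
                     (0b001 << 13) | 64, 64, 6, 0,
                     socket.inet_aton("18.10.5.22"), socket.inet_aton("192.12.70.111"))
print(parse_ipv4_header(sample))
```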




Lecture No. 19

3

Fragmentation and Reassembly Example

[Figure: a datagram travels H1 -> R1 -> R2 -> R3 -> H8. It is carried as ETH | IP (1400) and then FDDI | IP (1400); on the PPP link it is fragmented into PPP | IP (512), PPP | IP (512), PPP | IP (376), which continue as ETH | IP (512), ETH | IP (512), ETH | IP (376)]

Header fields of the unfragmented datagram and of its three fragments:

Ident  More-fragments flag  Offset  Data bytes
x      0                    0       1400
x      1                    0       512
x      1                    64      512
x      0                    128     376
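The fragment sizes and offsets in this example can be reproduced with a short sketch. It assumes a 20-byte IP header and an MTU of 532 bytes on the PPP link (512 data bytes + header); the MTU value is an assumption inferred from the 512-byte fragments, and `fragment` is a hypothetical helper.

```python
IP_HEADER = 20  # bytes, header without options (assumption)


def fragment(data_len, mtu, header_len=IP_HEADER):
    """Split data_len payload bytes into fragments that fit the given MTU.

    Every fragment except the last carries a multiple of 8 data bytes,
    because the Offset field counts 8-byte units.
    """
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < data_len:
        size = min(max_data, data_len - sent)
        more = 1 if sent + size < data_len else 0
        fragments.append({"offset": sent // 8, "more": more, "data_bytes": size})
        sent += size
    return fragments


for frag in fragment(1400, mtu=532):
    print(frag)
```

Reassembly at the destination would place each fragment at byte position `offset * 8` and stop when the fragment with `more == 0` and all earlier bytes have arrived.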

Datagram Forwarding

4

Network Number Next Hop

1 R3

2 R1

3 Interface 1

4 Interface 0

5

Datagram Forwarding

Network #    Netmask        Next Hop/Port
18.0.0.0     255.0.0.0      1
128.32.0.0   255.255.0.0    2
0.0.0.0      0.0.0.0        3

dest: 18.26.10.0
– mask with 255.0.0.0: matched! send to port 1

dest: 128.16.14.0
– mask with 255.0.0.0: not matched
– mask with 255.255.0.0: not matched
– mask with 0.0.0.0: matched! send to port 3

6

ARP Packet Format

Bit:  0               8               16                            31
      Hardware type = 1               | Protocol Type = 0x0800
      HLen = 48     | PLen = 32       | Operation
      SourceHardwareAddr (bytes 0 – 3)
      SourceHardwareAddr (bytes 4 – 5) | SourceProtocolAddr (bytes 0 – 1)
      SourceProtocolAddr (bytes 2 – 3) | TargetHardwareAddr (bytes 0 – 1)
      TargetHardwareAddr (bytes 2 – 5)
      TargetProtocolAddr (bytes 0 – 3)

7

ARP in ATM

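[Figure: ATM network divided into two logical IP subnets, LIS 10 and LIS 12, with hosts H1 and H2 and router R; interface addresses shown: 10.0.0.1, 10.0.0.2, 12.0.0.3, 12.0.0.5]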

8

Internet Control Message Protocol (ICMP)

• IP companion protocol (not necessary)
• Handles error and control messages

[Figure: the IP protocol graph (FTP, HTTP, NV, TFTP, … over TCP and UDP, over IP, over FDDI, Ethernet, ATM) with ICMP shown alongside IP]

9

ICMP Message
• Sent to the source when a node is unable to process IP datagram successfully
• Error messages
– Destination unreachable (protocol, port, or host)
– Reassembly failed
– IP checksum failed, or invalid header
– TTL exceeded (so datagrams don’t cycle forever)
– Cannot fragment

• Control messages
– Echo (ping) request and reply
– Redirect (from router to source host, to change route)

10

Dynamic Host Configuration Protocol- DHCP

• DHCP server is required to provide configuration information to each host
– Each host retrieves this information on bootup

• DHCP server can be configured manually, or it may allocate addresses on demand
– Addresses are “leased” for some period of time

• Hosts are not preconfigured with the DHCP server’s address; each performs DHCP server discovery
– A broadcast discovery message is sent by the host and a unicast reply is sent by the server

11

DHCP Server

[Figure: the host broadcasts on its own network; a DHCP relay forwards the request by unicast, across other networks, to the DHCP server]


Lecture No. 20

3

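Virtual Private Networks - VPN

[Figure: (a) two separate private networks, Corporation X (sites A, B, C) and Corporation Y (sites K, L, M), each with its own physical links; (b) the same sites sharing physical links, with virtual circuits keeping the two corporations’ traffic separate]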

4

IP Tunneling

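[Figure: IP multicast networks at VU, UET, and MAJU connected by tunnels across an IPv4 network; a packet (IP | Data) gains an extra IP header (IP | IP | Data) inside the tunnel]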

5

IP Tunnel in VPNs

• Virtual point-to-point link between a pair of nodes separated by many networks

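[Figure: Network 1 - R1 - Internetwork - R2 (10.0.0.1) - Network 2. A packet with IP header (destination = 2.x) and IP payload reaches R1, which encapsulates it behind an outer IP header (destination = 10.0.0.1); R2 strips the outer header and forwards the original packet into Network 2]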

6

Routing

• A stranger appears and asks “Airport ?”

• Which way do you point ?

[Figure: road map with Rawalpindi, Airport, Pir Wadhai, Islamabad, Rawal Dam, and Faizabad]

7

Routing Overview

[Figure: example network with nodes A, B, C, D, E, F and weighted links]

8

Bellman-Ford Algorithm
• For every node N
– For each directed pair (B, C)
• Is the path B → … → N → … → C better than B → … → C ?
• Is cost B → N → dest smaller than previously known?

• For N nodes
– Uses an N x N matrix of (distance, successor) values

9

Bellman-Ford Algorithm

• After n iterations, nodes at distance n hops along the shortest path have correct information

[Figure: example network with a source, a destination, and nodes A, B, C, E on weighted links; the (distance, successor) entries start at Infinity and are refined at each iteration, e.g. (1, Dest), (3, B), (8, C), (4, A), (6, E)]

10

Dijkstra’s Algorithm

• From node N
– Start with S = {N} and one-hop paths from N
– Loop n-1 times
• Add closest outside node M to S
• For each node P not in S
– Is the path N ..... M ..... P better than N ..... P ?

11

Dijkstra’s Algorithm

[Figure: Dijkstra’s algorithm example on a weighted graph, stepped through iterations 1 to 15]


Lecture No. 21

Distance Vector Routing Example

Information in routing table of each node: Iteration 1

At node   Distance to reach node
          A  B  C  D  E  F  G
A         0  1  1  x  1  1  x
B         1  0  1  x  x  x  x
C         1  1  0  1  x  x  x
D         x  x  1  0  x  x  1
E         1  x  x  x  0  x  x
F         1  x  x  x  x  0  1
G         x  x  x  1  x  1  0

[Figure: network graph with nodes A, B, C, D, E, F, G]



Iteration 2

At node   Distance to reach node
          A  B  C  D  E  F  G
A         0  1  1  2  1  1  2
B         1  0  1  2  2  2  x
C         1  1  0  1  2  2  2
D         2  2  1  0  x  2  1
E         1  2  2  x  0  2  x
F         1  2  2  2  2  0  1
G         2  x  2  1  x  1  0



Iteration 3

At node   Distance to reach node
          A  B  C  D  E  F  G
A         0  1  1  2  1  1  2
B         1  0  1  2  2  2  3
C         1  1  0  1  2  2  2
D         2  2  1  0  3  2  1
E         1  2  2  3  0  2  3
F         1  2  2  2  2  0  1
G         2  3  2  1  3  1  0
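The three tables can be reproduced with a small synchronous simulation of distance-vector exchange. This is a sketch: the link list is inferred from the distance-1 entries of the first table, every link costs 1 hop, and the names `initial_tables` and `exchange` are illustrative.

```python
INF = float("inf")  # shown as "x" in the tables
NODES = "ABCDEFG"
# Links inferred from the distance-1 entries of the first table (an assumption).
LINKS = [("A", "B"), ("A", "C"), ("A", "E"), ("A", "F"),
         ("B", "C"), ("C", "D"), ("D", "G"), ("F", "G")]


def initial_tables():
    """Each node initially knows only itself (0) and its direct neighbors (1)."""
    neighbors = {n: set() for n in NODES}
    for u, v in LINKS:
        neighbors[u].add(v)
        neighbors[v].add(u)
    dist = {n: {m: 0 if m == n else 1 if m in neighbors[n] else INF for m in NODES}
            for n in NODES}
    return neighbors, dist


def exchange(neighbors, dist):
    """One round: every node hears each neighbor's vector and keeps the best cost."""
    new = {n: dict(dist[n]) for n in NODES}
    for n in NODES:
        for nb in neighbors[n]:
            for dest in NODES:
                new[n][dest] = min(new[n][dest], 1 + dist[nb][dest])
    return new


neighbors, table = initial_tables()
for iteration in (1, 2, 3):
    print(f"Iteration {iteration}")
    for n in NODES:
        row = " ".join("x" if table[n][d] == INF else str(table[n][d]) for d in NODES)
        print(f"  {n}: {row}")
    table = exchange(neighbors, table)
```

After two exchanges no entry changes any more, which matches the observation that information spreads one hop per iteration.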

Distance Vector Routing Table

[Figure: the same network graph with nodes A, B, C, D, E, F, G]

Routing table at node B:

Destination Cost NextHop

A 1 A

C 1 C

D 2 C

E 2 A

F 2 A

G 3 A

Distance Vector Routing: Link Failure
• F detects that link to G has failed
• F sets distance to G to infinity and sends update to A
• A sets distance to G to infinity since it uses F to reach G
• A receives periodic update from C with 2-hop path to G
• A sets distance to G to 3 and sends update to F
• F decides it can reach G in 4 hops via A

[Figure: the same network graph with nodes A, B, C, D, E, F, G]

Count to Infinity Problem
• Link from A to E fails
• A advertises distance of infinity to E, but B and C advertise a distance of 2 to E !
• B decides it can reach E in 3 hops; advertises this to all
• A decides it can reach E in 4 hops; advertises this to all
• C decides that it can reach E in 5 hops …
• We are counting to infinity …

[Figure: the same network graph with nodes A, B, C, D, E, F, G]

Split Horizon

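[Figure: nodes A, B, C in a line plus node D; entries for destination C in the form destination : cost : next hop, e.g. C : 1 : C and C : 2 : B, with split horizon replacing an advertised C : 2 : B by C : ∞ : -]

Split horizon fails for loops of more than 2 nodes !!!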

10

Reliable Flooding

[Figure: reliable flooding of a link-state packet among nodes X, A, C, B, D, shown in four steps (a) to (d)]

11

Route Calculation: Dijkstra’s Shortest Path Algorithm
• Let
– N denote the set of nodes in the graph
– l(i, j) denote the non-negative cost (weight) for edge (i, j)
– s denote this node
– M denote the set of nodes incorporated so far
– C(n) denote the cost of the path from s to node n

M = {s}
for each n in N - {s}
    C(n) = l(s, n)    // calculate cost to each node

while (M != N)
    M = M union {w} such that C(w) is the minimum for all w in (N - M)
    for each n in (N - M)
        C(n) = MIN(C(n), C(w) + l(w, n))

12

Link State Algorithm
1. Initialize confirmed with entry for self (cost = 0)
2. For newly added node (next), select its LSP
3. For each neighbor of next, calculate cost to reach neighbor as the sum of cost from self to next and from next to neighbor
   a. If neighbor is currently in neither confirmed nor tentative, add <neighbor, cost, nexthop> to tentative, where nexthop is the direction to reach next
   b. If neighbor is currently in tentative and cost is less than current cost for neighbor, then replace current entry with <neighbor, cost, nexthop>, where nexthop is the direction to reach next
4. If tentative is empty, stop. Otherwise pick entry from tentative with the lowest cost, move it to confirmed and return to step 2.

Route Calculation

At node D

Step  Confirmed list                           Tentative list
1.    (D,0,-)
2.    (D,0,-)                                  (C,2,C), (B,11,B)
3.    (D,0,-), (C,2,C)                         (B,11,B)
4.    (D,0,-), (C,2,C)                         (B,5,C), (A,12,C)
5.    (D,0,-), (C,2,C), (B,5,C)                (A,12,C)
6.    (D,0,-), (C,2,C), (B,5,C)                (A,10,C)
7.    (D,0,-), (C,2,C), (B,5,C), (A,10,C)

[Figure: network with nodes A, B, C, D and link costs A-B 5, B-C 3, C-D 2, B-D 11, A-C 10]
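The route calculation at node D can be replayed with a compact shortest-path sketch. It uses a priority queue in place of the confirmed/tentative lists (a swapped-in but equivalent bookkeeping), and the link costs are the ones implied by the table entries: C-D 2, B-D 11, B-C 3, A-C 10, A-B 5.

```python
import heapq

# Undirected link costs implied by the route calculation example (an inference).
GRAPH = {
    "A": {"B": 5, "C": 10},
    "B": {"A": 5, "C": 3, "D": 11},
    "C": {"A": 10, "B": 3, "D": 2},
    "D": {"B": 11, "C": 2},
}


def shortest_paths(graph, source):
    """Return {node: (cost, next_hop)} as seen from source, like the confirmed list."""
    confirmed = {}
    tentative = [(0, source, "-")]  # (cost, node, next hop from source)
    while tentative:
        cost, node, hop = heapq.heappop(tentative)
        if node in confirmed:
            continue  # a cheaper entry for this node was already confirmed
        confirmed[node] = (cost, hop)
        for neighbor, link_cost in graph[node].items():
            if neighbor not in confirmed:
                # The first hop is the neighbor itself when leaving the source.
                next_hop = neighbor if node == source else hop
                heapq.heappush(tentative, (cost + link_cost, neighbor, next_hop))
    return confirmed


for node, (cost, hop) in sorted(shortest_paths(GRAPH, "D").items()):
    print(f"({node},{cost},{hop})")
```

Stale heap entries such as (B, 11, B) are simply skipped when popped, which plays the role of replacing a tentative entry with a cheaper one.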

OSPF Routing Protocol

• Authentication of routing messages
– Encrypted communication between routers

• Additional hierarchy
– Domains are split into areas
– Routers only need to know how to reach every node in a domain
– Routers need to know how to get to the right area

• Load balancing
– Allows traffic to be distributed over multiple routes

Original ARPANET Metric
• Uniform 56 kbps lines
– Bandwidth equal on every line (hence irrelevant)
– Latency relatively unimportant

• Use queue length as distance (number of packets waiting to use a link)

• Problems
– Uniform bandwidth assumption became invalid
– Latency comparable to 1kB transmission delay on 1.544 Mbps link

How to Make Routing Scale

• Flat versus Hierarchical Addresses

• Inefficient use of hierarchical address space
– Class C with 2 hosts (2/255 = 0.78% efficient)
– Class B with 256 hosts (256/65535 = 0.39% efficient)

• Still too many networks
– Routing tables do not scale
– Route propagation protocols do not scale

Mobile IP

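[Figure: a sending host reaches the home network (Network 10) across the internetwork; the home agent (10.0.0.3) forwards packets through an IP tunnel to the foreign agent (12.0.0.6) on Network 12, where the mobile host (10.0.0.9) is attached]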

Mobile IP - Challenges

• How does the MH get a local IP address ?

• How does the HA intercept a packet that is destined for the MH ?

• How does the HA then deliver the packet to the FA ?

• How does the FA deliver the packet to the MH ?


Lecture No. 23

Where we are - What we understand
• Concepts of networking & network programming
– Elements of networks: nodes and links
– Building a packet abstraction on a link

• Issues related to bits/frames sent on a single link
– How to detect transmission errors in a frame after encoding and framing it
– How to simulate a reliable channel (sliding window)
– How to arbitrate access to shared media in any network

• Design issues of direct link networks
– Functionality of network adaptors

Where we are - What we understand
• How switches may provide indirect connectivity
– Different ways to move through a network (forwarding)
– Bridge approach to extending LAN concept
– Example of a real virtual circuit network (ATM)
– How switches are built and contention within switches

• How different networks work together
– Common connecting protocol (IP) and its service model
– Routing protocols in simple networks

• Next: large networks working together

Internetworking

• Reading: Peterson and Davie, Chapter 4

• Global internets (scale)

• Dealing with global scale– Virtual geography and addresses

– Hierarchical routing

– Multicast traffic

– MPLS

• Future internetworking: IPv6

How Should Hierarchy be Defined ?

• Hierarchy reduces information requirements for forwarding

Internetworking

• Main problems addressed by internetworking– Heterogeneity– Scale

• Bridges scale to level of medium organization

• IP addresses most end-to-end heterogeneity issues

• Today: issues of global scale

Global Internets

• Evolution of Internet structure

• Virtual geographies: networks, domains, and translations

• Problems of scale

• IP address hierarchy evolution

• Notion of autonomous systems (AS’s)

• Common intradomain routing protocols

• Standard interdomain routing protocol

Diameter of the Internet
• Connect all nodes of a network in a big ring
– Network diameter will be half the ring circumference

• Adding more links brings down network diameter
– Diameter can be quarter of the ring with one link …

How many links should be added to bring the diameter below some threshold ?

Internet Structure in Recent Past 1990

• Tree structure, centered around one backbone
• Interconnection of thousands of “organizations”
– End users and service providers

• Scalability of routing protocols in global Internet
– Network entities are administratively independent

[Figure: NSFNET backbone connecting regional networks: BARRNET regional (Stanford, Berkeley, PARC), Westnet regional (NCAR, UA, UNM), MidNet regional (UNL, KU, ISU)]

Internet Structure Today
• Multiple backbone service providers
• Arbitrary graph structure
• Different QoS in different backbones (commercial)

[Figure: backbone service providers interconnected at peering points, serving large corporations, small corporations, and “consumer” ISPs]

Virtual Geographies

• Internet domain names (understood by people)
• IP network numbers (understood by routers)
• Internet domain name hierarchy
– ASCII strings separated by periods; typical formats:
• host.domain.domain_type
– xyz.carepvtltd.com
• host.subdomain.domain.domain_type
– abc.cs.berkeley.edu
• host.domain.domain_type.country
– www.case.edu.pk

Virtual Geographies and IP Address

• Routers use IP network numbers

• Original intention:
– One network number per physical network

• Closer to real geography (than domain names)
• Some jumps, e.g.,
– Divisions of one company
– Organizations with network research testbeds

[Figure: IP address layout: Class | Network | Host]

Virtual Geographies and DNS

• Translation from domain name to IP address
– Domain to IP: Domain Name Service (DNS)
– IP to domain: relies on host
• Not secure
• Use reverse lookup (domain to IP) for security

• DNS is a hierarchical, (mostly) distributed application (more detail later)

Scaling Problems for the Internet
• Inefficient address allocation
• Too many networks for routing

• Can trade off between these two

• Questions
– What network(s) should you allocate to a company with 1000 machines ?
– What about a company with 200 machines ?
– What about a company with 2 machines that plans to grow rapidly ?

Scaling Problems for the Internet
• Pressure primarily on class B networks
– Most companies plan to grow beyond 255 machines
– Renumbering is a hassle and can interrupt service
– Only around 16,000 class B networks available (14 bits)

• Class B networks aren’t very efficient
– Few organizations have O(10,000) machines
– More likely a network uses O(1,000) of 65,000 addresses

• Scaling problems with alternatives
– Multiple table entries if class C networks used instead
– Protocols do not scale beyond O(10,000) networks

IP Address Hierarchy Evolution

• Begin with class-based system
• Subnetting within an organization
  – Network can be broken into smaller networks
  – Recognized only within the organization
  – Implemented by packet-switching
  – Smaller networks called subnets

Class A: 0     | network (7 bits)  | host (24 bits)
Class B: 1 0   | network (14 bits) | host (16 bits)
Class C: 1 1 0 | network (21 bits) | host (8 bits)
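The leading bits in the class layout can be tested directly on the first octet of an address. The following is a minimal sketch, not taken from the slides; `address_class` and `NETWORK_BITS` are illustrative names.

```python
# Width of the network field for each class, as given in the layout.
NETWORK_BITS = {"A": 7, "B": 14, "C": 21}


def address_class(dotted):
    """Classify a dotted-quad IPv4 address by the leading bits of its first octet."""
    first = int(dotted.split(".")[0])
    if first >> 7 == 0b0:
        return "A"
    if first >> 6 == 0b10:
        return "B"
    if first >> 5 == 0b110:
        return "C"
    return "other"  # leading bits 111: outside classes A to C


for addr in ("10.0.0.1", "128.96.34.15", "192.168.1.1"):
    cls = address_class(addr)
    print(addr, "is class", cls, "with", 2 ** NETWORK_BITS[cls], "possible network numbers")
```

The 14-bit network field gives 2^14 = 16,384 class B network numbers, which is where the figure of around 16,000 class B networks comes from.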

Subnetting

• Another level to address/routing hierarchy: subnet
• Subnet masks define variable partition of host part
• Subnets visible only within site (close to each other)

[Figure: an IP address (Class | Network | Host); the subnet mask 11111111 11111111 11111111 00000000; the resulting subnetted IP address (Network | Subnet | Host); and an example of a non-contiguous mask.]
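To make the difference between a contiguous and a non-contiguous mask concrete, the sketch below prints a mask as bits and checks whether its 1s form a single leading run. It is an illustration only; the helper names `mask_bits` and `is_contiguous` are assumptions, and the non-contiguous mask 255.255.0.255 is a made-up example, not the one from the figure.

```python
import ipaddress


def mask_bits(mask):
    """Render a dotted-quad mask as four space-separated groups of 8 bits."""
    bits = format(int(ipaddress.IPv4Address(mask)), "032b")
    return " ".join(bits[i:i + 8] for i in range(0, 32, 8))


def is_contiguous(mask):
    """True if all 1 bits of the mask form one run starting at the top bit."""
    m = int(ipaddress.IPv4Address(mask))
    inverted = ~m & 0xFFFFFFFF               # host bits become 1s
    # A run of trailing 1s plus one is a power of two, so the AND is zero.
    return (inverted & (inverted + 1)) == 0


for mask in ("255.255.255.0", "255.255.255.128", "255.255.0.255"):
    print(mask, mask_bits(mask), "contiguous" if is_contiguous(mask) else "non-contiguous")
```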

Forwarding table at router R1

Subnet Number    Subnet Mask        Next Hop
128.96.34.0      255.255.255.128    interface 0
128.96.34.128    255.255.255.128    interface 1
128.96.33.0      255.255.255.0      R2

Subnetting Example

All hosts on a subnet satisfy: address & mask = subnet number

[Figure: three subnets joined by routers R1 and R2]

Subnet 128.96.34.0 (mask 255.255.255.128): H1 at 128.96.34.15, R1 at 128.96.34.1
Subnet 128.96.34.128 (mask 255.255.255.128): R1 at 128.96.34.130, H2 at 128.96.34.139, R2 at 128.96.34.129
Subnet 128.96.33.0 (mask 255.255.255.0): R2 at 128.96.33.1, H3 at 128.96.33.14
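The rule that a host's address ANDed with its subnet mask gives its subnet number can be checked against the example. This is a minimal sketch, taking the host addresses as read from the figure (H1 at 128.96.34.15, H2 at 128.96.34.139, H3 at 128.96.33.14); the function name `subnet_number` is illustrative.

```python
import ipaddress


def subnet_number(address, mask):
    """Bitwise AND of an address and a mask, returned as a dotted quad."""
    a = int(ipaddress.IPv4Address(address))
    m = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(a & m))


# (address, subnet mask) for the three hosts in the example.
HOSTS = {
    "H1": ("128.96.34.15", "255.255.255.128"),
    "H2": ("128.96.34.139", "255.255.255.128"),
    "H3": ("128.96.33.14", "255.255.255.0"),
}

for name, (address, mask) in HOSTS.items():
    print(name, address, "&", mask, "=", subnet_number(address, mask))
```

H1 and H2 share the first three octets but fall in different subnets because the mask 255.255.255.128 also covers the top bit of the last octet.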

Forwarding Algorithm

D = destination IP address
for each entry (SubnetNum, SubnetMask, NextHop)
    D1 = SubnetMask & D
    if D1 = SubnetNum
        if NextHop is an interface
            deliver datagram directly to D
        else
            deliver datagram to NextHop

• Comparisons in parallel, taking the longest match
• Use a default router if nothing matches
• Not necessary for 1s in subnet mask to be contiguous
• Can put multiple subnets on one physical network
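The forwarding algorithm can be sketched in a few lines using the forwarding table given for router R1. This is a sequential illustration under stated assumptions, not a router implementation: "longest match" is approximated by the number of 1 bits in the mask, the string "default router" is a placeholder, and `forward` and `R1_TABLE` are illustrative names.

```python
import ipaddress


def ip(dotted):
    """Convert a dotted-quad string to a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


# (SubnetNum, SubnetMask, NextHop) entries from the table for R1.
R1_TABLE = [
    ("128.96.34.0", "255.255.255.128", "interface 0"),
    ("128.96.34.128", "255.255.255.128", "interface 1"),
    ("128.96.33.0", "255.255.255.0", "R2"),
]


def forward(dest, table, default="default router"):
    """Return the next hop for dest, preferring the entry with the most mask bits."""
    d = ip(dest)
    best_hop, best_len = None, -1
    for subnet_num, subnet_mask, next_hop in table:
        mask = ip(subnet_mask)
        if (d & mask) == ip(subnet_num):
            length = bin(mask).count("1")  # assumed measure of match length
            if length > best_len:
                best_hop, best_len = next_hop, length
    return best_hop if best_hop is not None else default


for dest in ("128.96.34.15", "128.96.34.139", "128.96.33.14", "10.0.0.1"):
    print(dest, "->", forward(dest, R1_TABLE))
```

A next hop that names an interface means the datagram is delivered directly on that network; otherwise it is handed to the named router.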

Subnetting

• Subnets allow hierarchy within organizations

• But do not reduce class B address pressure
