1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 1
3
Instructor’s Introduction
• Ph.D. Mobile Wireless Net., Univ. of Paris-Sud, France
– Elect. Engg., U.E.T. Lahore
– M.S. Comp. Engg., E.S.I.M., France
– D.E.A. Parallel Comp. Arch., Univ. of Paris-Sud, France
• Active participant of MANET wg of IETF since 1997
– Co-author of an RFC on routing protocol for MANETs
• Associated with INRIA, France as a research fellow
– Worked on Praxitele, PRIMA and IPANEMA projects
• Protocol implementation experience at Enabling Technologies
– Implementation of RTP-RTCP/UDP-TCP/IP stack for a RISC-based packet processor, including CRTP, PPP/MP, IPsec, QoS
• Currently working at CARE Pvt. Ltd. and CASE
– Design/development of long-range, secure mobile ad hoc networks
4
Significance and Rationale of course
• Networks and telecommunication are gaining more and more importance
• Future telecom networks will be oriented more toward “networks” than “communication”
• The widespread Internet, diffused in our daily life, is a ground reality; it is beneficial to understand it
• It is fun to play with protocols (software) and to be able to design exciting new types of networks
5
Foundation Course in Network Stream
• This Computer Networks course is a foundation course for “Networks” as the Area of Specialization
• It is a required pre-requisite course for the following advanced level courses in networking
– Performance Analysis of Communication Networks
– Integrated Services over Packet Networks
– Computer Network Security
– Mobile Networking
• It is a recommended pre-requisite for the course:
– Network and System Programming
6
Course Material
• Reference books
– Many textbooks on Networking may be consulted
• Lot of research papers!
– Many will be made available on course web site
• RFCs and Internet drafts
– Related to TCP/IP suite and other protocols
• Web resources
– Tutorials, white papers, reports, etc.
7
Text Book
• Larry L. Peterson and Bruce S. Davie, Computer Networks: A Systems Approach
– Third Edition [2003], Morgan Kaufmann Publishers, San Mateo, California, USA
• W. Richard Stevens, UNIX Network Programming, Volume 1 (Networking APIs: Sockets and XTI)
– Second Edition, Prentice Hall
– Recommended ONLY for programming part
8
Prerequisites
• Required:
• An undergraduate level course on computer networks
• Good knowledge of C language, preferably in Unix/Linux
– e.g. C and Data Structures course
9
Course Evaluation
• Assignments
• Quizzes
• Research Papers
• Term Projects / Reports
• Midterm exam
• Final exam
10
Expectations
What do you want (or expect) to learn from this course ?
11
Expectations
• This course IS about …
– Network principles and concepts
– General purpose computer networks
– Internet perspective
• Major components of the Internet protocol suite
– Network software
– Designing and building a system
12
Expectations
• This course IS NOT about …
– Survey of existing protocol standards
– Specialized networks (e.g. CATV, telephone)
– OSI perspective
– Network hardware
• Data transmission on physical layer
– Queuing theory (we will survey it, if time permits)
13
Expectations
We will learn why networks are the way they are
14
Objectives: Principles and Concepts
• At the end of this course, you should be able to:
– Identify the problems that arise in networked communication
– Explain advantages/disadvantages of existing solutions to these problems in different networking scenarios
– Evaluate novel approaches to these problems
– Understand the components of Internet protocol suite
– Understand the implications of a given solution for performance in various networking environments
15
Objectives: Programming
• At the end of this course, you should be able to:
– Identify and describe the purpose of each component of the TCP/IP protocol suite
– Develop client-server applications using TCP/IP
– Understand the impact of trends in network hardware on network software issues
– Understand over 1000 useful (or useless) VUAs
16
Course Contents
• Overview
• Introduction to network programming
• Direct link networks
• Packet switching
• Internetworking
• End-to-end protocols
• Congestion control and resource allocation
• End-to-end data
• Applications
• Performance analysis and queuing theory (?)
17
Introduction
• Outline
– Statistical Multiplexing
– Inter-Process Communication
– Performance Metrics
– Network Architecture
18
What is Your Over-ambitious Goal ?
• Build a computer network which
– Can grow to global proportions
– Supports diverse applications
• Then … think about
– Underlying building blocks
• Which available technologies to use
– Integrating the blocks to communicate
• Which software architecture to use
Network Design
Before looking inside a computer network, first agree on what a computer network is.
20
Computer Network ?
• Set of serial lines to attach terminals to mainframe ?
• Telephone network carrying voice traffic ?
• Cable network to disseminate video signals ?
Specialized to handle:
Keystrokes
Voice
Video
21
What Distinguishes a Computer Network ?
• Generality
• Built from general purpose programmable hardware
• Supports wide range of applications
22
Applications – Users’ Contact with the Network
• Most people know the Internet through its applications
– Web, email, streaming audio and video, chat, …
• Applications present an intuitively simple interface
– Textual and graphical objects
– Simple “clicks” to maneuver the application
• However, users are not aware of what happens in the network with their simple “clicks” !!!
23
Applications – Consumers of Networks
• On a simple click, several messages may be exchanged over the Internet
• In a web browser, 17 messages may be exchanged
– up to six messages to translate the server name
– three messages to set up a TCP connection
– four messages to send HTTP “get” request + response
– four messages to tear down the TCP connection
• Moreover, millions of messages are exchanged each day by Internet nodes to make their presence and services known
24
Applications – the Driving Force
• Streaming audio and video is an emerging application
– Source generates and sends the video stream in messages across the Internet
• Video-on-demand: reads a preexisting movie
– One-way data transfer
• Videoconferencing: interactive session
– Very tight timing constraints
• Diversity of applications that can be built on top of the Internet hints at the complexity of the Internet design
25
Our Road Map …
• Fortunately, we are not the first to build a computer network
• Let's start exploring the path that others have already dug deep
• By asking (and answering) why networks are designed the way they are
26
Network Overview
• What must a network provide ?
– Connectivity
– Cost-effective sharing
– Functionality
– Performance
• How are networks designed and built ?
– Layering
– Protocols
– Standards
27
Perspective
• For network user
– Connectivity: for services required; error free delivery within acceptable time limits
• For network designer
– Efficiency: cost-effective design, fair allocation and efficient use of resources
• For network operator
– Maintenance: easy to administer, fault localization & isolation, usage accounting
28
Building Blocks
• Nodes: PC, special-purpose hardware…
– Hosts
– Switches
• Links: coax cable, optical fiber…
– Point-to-point
– Multiple access …
29
Why not connect each node with every other node ?
• Number of computers that can be connected becomes very limited
• Number of wires coming out of each node becomes unmanageable
• Amount of physical hardware/devices required becomes very expensive
• Solution: indirect connectivity using intermediate data forwarding nodes
30
Switched Networks
• A network can be defined recursively as...
– Two or more nodes connected by a link
– Circular nodes (switches) implement the network
– Square nodes (hosts) use the network
31
Switched Networks
• A network can be defined recursively as...
– Two or more networks connected by one or more nodes: internetworks
– Circular nodes (routers or gateways) interconnect the networks
– A cloud denotes “any type of independent network”
32
A Network
A network can be defined recursively as
Two or more nodes connected by a physical link
OR
Two or more networks connected by one or more nodes
33
Components of a Network
34
Switching Strategies
• Circuit switching: carry bit streams
a. establishes a dedicated circuit
b. links reserved for use by communication channel
c. send/receive bit stream at constant rate
d. example: original telephone network
• Packet switching: store-and-forward messages
a. operates on discrete blocks of data
b. utilizes resources dynamically according to traffic demand
c. send/receive messages at variable rate
d. example: Internet
35
What next ?
• Hosts are directly or indirectly connected to each other
• Can we now provide host-host connectivity ?
• A node must be able to say which host it wants to communicate with
36
Addressing and Routing
• Address: byte-string that identifies a node
– Usually unique
• Routing: forwarding decisions
– Process of determining how to forward messages to the destination node based on its address
• Types of addresses
– unicast: node-specific
– broadcast: all nodes on a network
– multicast: some subset of nodes on a network
37
Wrap-up
• A network can be constructed from nesting of networks
• An address is required for each node that is reachable on the network
• Address is used to route messages toward appropriate destination
Lecture No. 2
3
Multiplexing
• Physical links/switches must be shared among users
– (synchronous) Time-Division Multiplexing (TDM)
– Frequency-Division Multiplexing (FDM)
[Figure: multiple flows on a single link; hosts L1, L2, L3 attach to Switch 1 and hosts R1, R2, R3 attach to Switch 2]
Do you see any problem with TDM / FDM ?
4
Statistical Multiplexing
• On-demand time-division, possibly synchronous (ATM)
• Schedule link on a per-packet basis
• Buffer packets in switches that are contending for the link
• Packets from different sources interleaved on link
…
Do you see any problem ?
5
Statistical Multiplexing
• An application needs to break up its message into packets, and re-assemble it at the receiver
• Fair allocation of link capacity: FIFO, round-robin or QoS
• If congestion occurs at a switch, its buffer may overflow and packets may be lost
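The buffering and overflow behaviour described above can be sketched as a fixed-size FIFO queue at a switch's output port. This is only an illustration, not code from the lecture: the names pkt_queue, pkt_enqueue and pkt_dequeue and the capacity of 4 packets are assumptions made for the example, and a real switch could equally schedule round-robin or by QoS class.

```c
#include <stddef.h>

#define QUEUE_CAPACITY 4   /* assumed buffer size, in packets */

/* Hypothetical output-port queue of a switch: a fixed-size FIFO ring buffer. */
struct pkt_queue {
    int    ids[QUEUE_CAPACITY]; /* identifiers of packets waiting for the link */
    size_t head;                /* index of the oldest buffered packet */
    size_t count;               /* number of packets currently buffered */
    size_t dropped;             /* packets lost because the buffer was full */
};

/* Buffer a packet that is contending for the link.
 * Returns 0 on success, -1 if the buffer overflowed and the packet was dropped. */
int pkt_enqueue(struct pkt_queue *q, int id)
{
    if (q->count == QUEUE_CAPACITY) {
        q->dropped++;
        return -1;
    }
    q->ids[(q->head + q->count) % QUEUE_CAPACITY] = id;
    q->count++;
    return 0;
}

/* Put the oldest packet on the link (FIFO order).
 * Returns its identifier, or -1 if nothing is buffered. */
int pkt_dequeue(struct pkt_queue *q)
{
    int id;

    if (q->count == 0)
        return -1;
    id = q->ids[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return id;
}
```

A queue must start zeroed (for example `struct pkt_queue q = {0};`). Once four packets are waiting, a fifth enqueue fails and is counted in `dropped`, which is the packet loss under congestion mentioned above.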
6
Enough … ?
• A network is delivering packets among a collection of computers
• How do application processes communicate in a meaningful way ?
• Hide network complexity by implementing the common services once
7
Inter-Process Communication
• Turn host-to-host connectivity into process-to-process communication, making the communication meaningful.
• Fill gap between what applications expect and what the underlying technology provides.
[Figure: channel abstraction for application-level communication; applications on two hosts communicate over a channel that crosses a network of hosts]
8
IPC Abstractions
• Semantics and interface depend on applications
• Request/Reply
– distributed file systems
• file servers (FTP)
– digital libraries / HTTP
• information retrieval
• Message stream
– video on-demand
– video conferencing
• delay sensitive
• two-way frame flow
• 1/4 NTSC = 352x240 pixels
• (352 x 240 x 24)/8=247.5KB
• 30 fps = 7500KBps = 60Mbps
• 10fps + compression < 10Mbps
• reliable ?
• prioritized ?
• delay/bandwidth guarantees ?
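The video arithmetic above can be checked with a few lines of C. This is a minimal sketch with illustrative function names (they are not part of any library). It works with exact values, so one frame comes out as 253,440 bytes (247.5 KB with KB = 2^10 bytes) and the 30 fps stream as 60,825,600 bits per second, which the slide rounds to 60 Mbps.

```c
/* Bytes in one uncompressed frame of the given size and colour depth. */
unsigned long frame_bytes(unsigned long width, unsigned long height,
                          unsigned long bits_per_pixel)
{
    return width * height * bits_per_pixel / 8;
}

/* Raw (uncompressed) stream rate in bits per second at a given frame rate. */
unsigned long stream_bps(unsigned long bytes_per_frame, unsigned long fps)
{
    return bytes_per_frame * 8 * fps;
}
```

For a quarter-NTSC frame, `frame_bytes(352, 240, 24)` gives 253440 and `stream_bps(253440, 30)` gives 60825600. At 10 fps the raw rate is still about 20 Mbps, which is why compression is needed to get below 10 Mbps.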
9
Abstract Channel Functionality
• What functionality does a channel provide ?
– Smallest set of abstract channel types adequate for largest number of applications
• Where is the functionality implemented ?
– Network as a simple bit-pipe with all high-level communication semantics at the hosts
– More intelligent switches allowing hosts to be “dumb” devices (telephone network)
10
What Goes Wrong in the Network?
Reliability at stake
• Bit-level errors (electrical interference)
• Packet-level errors (congestion)
– distinction between lost and late packet
• Link and node failures
– distinction between broken and flaky link
– distinction between failed and slow node
11
What Goes Undesirable in the Network?
Required performance at stake
• Messages are delayed
• Messages are delivered out-of-order
• Third parties eavesdrop
• The challenge is to fill the gap between application expectations and hardware capabilities
12
Performance Metrics
• … and to do so while delivering “good” performance
• Bandwidth (throughput)
– data transmitted per unit time, e.g. 10 Mbps
– link bandwidth versus end-to-end bandwidth
– notation
• KB = 2^10 bytes
• Kbps = 10^3 bits per second
13
Performance Metrics
[Figure: bits sent in 1 second on (a) a 1 Mbps link, where each bit is 1 µs wide, and (b) a 2 Mbps link, where each bit is 0.5 µs wide]
Bandwidth related to “bit width”
14
Performance Metrics
• Latency / delay
– time to send message from point A to point B
– one-way versus round-trip time (RTT)
– components
Latency = Propagation + Transmit + Queue
Propagation = Distance / c
Transmit = Size / Bandwidth
• Note:
– No queuing delay in direct (point-to-point) link
– Bandwidth irrelevant if size = 1 bit
– Process-to-process latency includes software processing overhead (dominates over shorter distances)
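The three latency components above translate directly into code. The following is a sketch under stated assumptions: the function names are illustrative, and c is taken as 3.0 x 10^8 m/s (light in a vacuum), whereas signals in copper or fibre propagate more slowly.

```c
#define SPEED_OF_LIGHT_MPS 3.0e8  /* assumed value of c, in metres per second */

/* Propagation = Distance / c */
double propagation_s(double distance_m)
{
    return distance_m / SPEED_OF_LIGHT_MPS;
}

/* Transmit = Size / Bandwidth */
double transmit_s(double size_bits, double bandwidth_bps)
{
    return size_bits / bandwidth_bps;
}

/* Latency = Propagation + Transmit + Queue (one-way, in seconds).
 * Pass queue_s = 0 for a direct point-to-point link. */
double latency_s(double distance_m, double size_bits,
                 double bandwidth_bps, double queue_s)
{
    return propagation_s(distance_m)
         + transmit_s(size_bits, bandwidth_bps)
         + queue_s;
}
```

As an example, an 8000-bit message over a direct 3000 km, 1 Mbps link has 10 ms of propagation delay and 8 ms of transmit time, so `latency_s(3.0e6, 8000.0, 1.0e6, 0.0)` is about 0.018 s. For a single bit the transmit term nearly vanishes, which is the sense in which bandwidth becomes irrelevant.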
15
Bandwidth versus Latency
• Relative importance depends on application
• 1-byte character:
– Choice of 1ms vs 100ms dominates 1Mbps vs 100Mbps
• 25MB file:
– Choice of 1Mbps vs 100Mbps dominates 1ms vs 100ms
• Large data (file transfer) is bandwidth critical
• Small data (HTTP) is latency critical
16
Bandwidth versus Latency
[Figure: perceived latency (ms, log scale from 1 to 10,000) versus RTT (ms, 10 to 100) for a 1-MB, a 2-KB and a 1-byte object, each over a 1.5-Mbps and a 10-Mbps link]
17
Delay x Bandwidth Product
• Amount of data “in flight” or “in the pipe”
• Example: 100ms RTT x 45Mbps BW = 560KB
• This much data is already in flight, and must be buffered, before the sender can respond to a request to slow down
[Figure: the link drawn as a pipe whose length is the delay and whose width is the bandwidth]
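The delay x bandwidth example can be reproduced with integer arithmetic. This is a small sketch with an illustrative function name; it takes the delay in milliseconds and the bandwidth in bits per second (with Mbps = 10^6 bits per second).

```c
/* Bytes "in the pipe" for a given delay (ms) and bandwidth (bits per second). */
unsigned long long in_flight_bytes(unsigned long long delay_ms,
                                   unsigned long long bandwidth_bps)
{
    /* bits in flight = delay x bandwidth; divide by 8 to get bytes */
    return delay_ms * bandwidth_bps / 1000ULL / 8ULL;
}
```

With a 100 ms RTT and 45 Mbps, `in_flight_bytes(100, 45000000ULL)` is 562500 bytes, that is 4.5 Mbit, which is roughly the 560KB quoted above.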
18
Infinite Bandwidth
• Latency (RTT) dominates instead of throughput
– Throughput = TransferSize / TransferTime
– TransferTime = RTT + 1/Bandwidth x TransferSize
• 1 MB file over a 1 Mbps network takes around 8 sec
– With RTT of 100ms, it corresponds to 80 RTTs
– Effective throughput is 1MB/8.1s = 0.987Mbps
• 1 MB file over a 1 Gbps network takes 100ms + 8ms
– Effective throughput is 1MB/108ms = 74.1 Mbps
• 1-MB file to 1-Gbps link appears like a 1-KB packet to 1-Mbps link
19
Bandwidth Requirements
• Request/reply type applications can make use of as much bandwidth as is available
• Message stream type applications often require a certain bandwidth: not more not less
• Average required bandwidth does not always help to design a network
– Upper bound on a burst (peak-rate transfer)
• Queuing delay introduces variation in latency (jitter)
20
Wrap-up
• Established a comprehensive set of requirements for network design
• Networks evolve to accommodate changes in underlying technologies and user demands
• However, hardware and user expectations are moving targets …
21
Network Architecture
• The challenge is to fill the gap between hardware capabilities and application expectations, and to do so while delivering “good” performance
• Designers cope with this complex task by developing a network architecture as a guideline
– Layering, protocols, standards
22
Layering
• Use abstractions to hide complexity
• Abstractions naturally lead to layering
• Each layer provides some functionality
[Figure: layers from top to bottom: Application programs; Process-to-process channels; Host-to-host connectivity; Hardware]
23
Layering
• Alternative abstractions at each layer
• Manageable network components
• Modify layers independently
[Figure: layers from top to bottom: Application programs; Request/reply channel and Message stream channel side by side; Host-to-host connectivity; Hardware]
24
Protocols
• Building blocks of a network architecture
• Each protocol object has two different interfaces
– service interface: operations on this protocol
– peer-to-peer interface: messages exchanged with peer
• Term “protocol” is overloaded
– specification of peer-to-peer interface
– module that implements this interface
– peer modules are interoperable if both accurately follow the specifications
25
Protocol Interfaces
[Figure: on Host 1 and Host 2, a high-level object uses a protocol through the service interface; the two protocol peers talk to each other through the peer-to-peer interface]
26
Protocol Graph – Network Architecture
• Collection of protocols and their dependencies
– most peer-to-peer communication is indirect
– peer-to-peer is direct only at hardware level
[Figure: protocol graph on Host 1 and Host 2; each host runs a file application, a digital library application and a video application on top of RRP and MSP, which both run over HHP]
RRP: Request Reply Protocol
MSP: Message Stream Protocol
HHP: Host-to-Host Protocol
Lecture No. 3
Protocol Machinery
• Multiplexing and Demultiplexing (demux key)
• Encapsulation (header/body) in peer-to-peer interfaces
– indirect communication (except at hardware level)
– each protocol adds a header
– part of header includes demultiplexing field (e.g., pass up to request/reply or to message stream?)
4
Encapsulation
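[Figure: on Host 1 the application program hands Data to RRP, RRP prepends its header (RRP | Data) and hands the result to HHP, and HHP prepends its own header (HHP | RRP | Data); on Host 2 the headers are stripped in the reverse order before Data reaches the application program]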
5
Message Transmission Using Layers
6
Standard Architectures
• Open Systems Interconnection (OSI) Architecture
– International Organization for Standardization (ISO)
– International Telecommunication Union (ITU), formerly CCITT
– “X dot” series: X.25, X.400, X.500
– Primarily a reference model
7
OSI Architecture
[Figure: each end host implements all seven layers; one or more nodes within the network implement only the Network, Data link and Physical layers. The figure also marks the boundary between user level and OS kernel.]
• Application: application
• Presentation: data formatting
• Session: connection management
• Transport: process-to-process communication channel
• Network: host-to-host packet delivery
• Data link: framing of data bits
• Physical: transmission of raw bits
8
Internet Architecture
• TCP/IP Architecture
– Developed with ARPANET and NSFNET
– Internet Engineering Task Force (IETF)
• Culture: implement, then standardize
• OSI culture: standardize, then implement
– Became popular with release of Berkeley Software Distribution (BSD) Unix; i.e. free software
– Standard suggestions traditionally debated publicly through “Request For Comments” (RFCs)
9
Internet Architecture
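• Implementation and design done together
• Hourglass Design (bottleneck is IP)
• Application vs Application Protocol (FTP, HTTP)
[Figure: Internet protocol graph; FTP, HTTP, NV and TFTP run over TCP and UDP, which run over IP, which runs over networks NET1, NET2, …, NETn]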
10
Internet Architecture
• Layering is not very strict
Application
TCP UDP
IP
Network
11
Network Models
12
How Layers Fit Together in Practice
13
Networking in the Internet Age
14
Protocol Acronyms
• (T)FTP – (Trivial) File Transfer Protocol
• HTTP – Hypertext Transfer Protocol
• NV – Network Video
• SMTP – Simple Mail Transfer Protocol
• NTP – Network Time Protocol
• TCP – Transmission Control Protocol
• UDP – User Datagram Protocol
• IP – Internet Protocol
• FDDI – Fiber Distributed Data Interface
• ATM – Asynchronous Transfer Mode
15
Elements of a Protocol Implementation
• Outline
– Service Interface
– Process Model
– Common Subroutines
– Example Protocol
16
Network Software
• Major factors for runaway success of the Internet:
– most functionalities provided by software running on general-purpose computers
• new services can be added readily with just a small matter of programming
• Understanding how to implement network software is essential to understand computer networks
17
Network Application Programming Interface (API)
• Interface that the OS provides to its networking subsystem
– most network protocols are implemented in software
– all systems implement network protocols as part of the OS
– each OS is free to define its own network API
– applications can be ported from one OS to another if APIs are similar
• *IF* application program does not interact with other parts of the OS other than the network (file system, fork processes, display …)
18
Protocols and API
• Protocols provide a certain set of services
• API provides a syntax by which those services can be invoked
• Implementation is responsible for mapping API syntax onto protocol services
19
Socket API
• Use sockets as “abstract endpoints” of communication
• Issues
– Creating & identifying sockets
– Sending & receiving data
• Mechanisms
– UNIX system calls and library routines
[Figure: a process attached to a socket]
20
Socket API
• Creating a socket
int socket(int domain, int type, int protocol)
• domain (family) = AF_INET, PF_UNIX, AF_OSI
• type = SOCK_STREAM, SOCK_DGRAM
• protocol = TCP, UDP, UNSPEC
• return value is a handle for the newly created socket
21
Sockets (cont)
• Passive Open (on server)
int bind(int socket, struct sockaddr *addr, int addr_len)
int listen(int socket, int backlog)
int accept(int socket, struct sockaddr *addr, int *addr_len)
• Active Open (on client)
int connect(int socket, struct sockaddr *addr, int addr_len)
22
Sockets (cont)
• Sending Messages
int send(int socket, char *msg, int mlen, int flags)
• Receiving Messages
int recv(int socket, char *buf, int blen, int flags)
23
Protocol-to-Protocol Interface
• A protocol interacts with a lower level protocol like an application interacts with underlying network
• Why not use available network APIs for PPI ?
– Inefficiencies built into the socket interface
• application programmers tolerate them to simplify their task
– inefficiency at one level
• protocol implementers do not tolerate them
– inefficiencies at several layers of protocols
24
Protocol-to-Protocol Interface Issues
• Configure multiple layers
– static versus extensible
• Process Model
– avoid context switches
• Buffer Model
– avoid data copies
25
Process Model
[Figure: (a) process-per-protocol, where layers communicate through inter-process communication; (b) process-per-message, where layers communicate through procedure calls]
26
Process Model (cont.)
IP
TCP
send(IP, message) deliver(TCP, message)
int send(Protocol llp, Msg *message)
int deliver(Protocol hlp, Msg *message)
27
Buffer Model
[Figure: the application process and the topmost protocol exchange messages through send() and deliver(), with a buffer copy in each direction]
28
Message Library
• Add header
m = “abcdefg”
bcopy(“xyz”, hdr, 3);
msgAddHdr(m, hdr, 3);
m = “xyzabcdefg”
• Strip header
m = “abcdefg”
hdr = msgStripHdr(m, 3);
m = “defg”, hdr = “abc”
Message Library (cont)
• Fragment message
m = “abcdefg”
msgFragment(m, new, 3);
m = “defg”, new = “abc”
• Reassemble messages
m1 = “abcd”, m2 = “efg”
msgReassemble(new, m1, m2)
new = “abcdefg”
30
Network Programming
• Things to learn
– Internet protocols (IP, TCP, UDP, …)
– Sockets API (Application Programming Interface)
• Why IP and sockets
– IP (Internet Protocol) is standard
• allows a common name space across most of Internet
• reduces number of translations, which incur overhead
– Sockets: reasonably simple and elegant Unix interface (most servers run Unix)
31
OSI Model / Internet Protocols
[Figure: OSI layers mapped to Internet protocols; application layers: FTP, HTTP, NV, TFTP; transport: TCP, UDP; network: IP; data link and physical: FDDI, Ethernet, ATM, …]
32
• learn to use Internet for communication (with focus on implementation of networking concepts)
• learn to build network from ground up
physical
data link
network
transport
Programming
Principles and concepts
33
Socket Programming
• Reading: Stevens 2nd edition, Chapters 1-6
• Sockets API: a transport layer service interface
– introduced with 4.2BSD in 1983
– implemented as library and/or system calls
– similar interfaces to TCP and UDP
– can also serve as interface to IP (for super-user); known as “raw sockets”
– Linux also provides interface to MAC layer (for super-user); known as “data-link sockets”
34
Outline
• Client-server model
• TCP connections
• UDP services
• Addresses and data
• Sockets API
• Example of usage
35
Client-Server Model
• Asymmetric relationship
• Server/daemon
– well-known name
– waits for contact
– processes requests, sends replies
• Client
– initiates contact
– waits for response
[Figure: one server contacted by several clients]
36
Client-Server Model
• Bidirectional communication channel
• Service models
– sequential: server processes only one client’s requests at a time
– concurrent: server processes multiple clients’ requests simultaneously
– hybrid: server maintains multiple connections, but processes requests sequentially
• Server and client categories not disjoint
– server can be client of another server
– server as client of its own client (peer-to-peer architecture)
37
Review Lecture 3
• Protocol Machinery - Encapsulation
• Standard architectures: OSI, Internet
• Network models and layering
• Elements of protocol implementation
• Network software
• Network API – Socket API – PPI
• Process and buffer model
• Programming – Client server model
Lecture No. 4
3
TCP Connections
• Transmission Control Protocol, at OSI transport layer
• Recall: each protocol provides service interface
4
Aspects of TCP Service
• Transfers a stream of bytes (interpreted by application)
• Connection-oriented
– set up connection before communicating
– tear down connection when done
• In-order delivery of data: if A sends M1 followed by M2 to B, B never receives M2 before M1
5
Aspects of TCP Service
• Reliable
– data delivered at most once
– exactly once if no catastrophic failures
• Flow control
– prevents senders from wasting bandwidth
– reduces global congestion problems
6
Aspects of TCP Service
• Full-duplex: send or receive data at any time
• 16-bit port space allows multiple connections on a single host
7
TCP Connections
• TCP connection setup via 3-way handshake
– J and K are sequence numbers for messages
[Figure: client sends SYN J; server replies with SYN K, ACK J+1; client sends ACK K+1]
Hmmm … RTT is important!
8
TCP Connections
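• TCP connection teardown (4 steps) (either client or server can initiate connection teardown)
[Figure: the side doing the active close (here the client) sends FIN J; the other side (server, passive close) replies with ACK J+1, then sends FIN K when it closes the connection; the client replies with ACK K+1]
Hmmm … Latency matters!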
9
UDP Services
• User Datagram Protocol, at OSI transport layer
• Thin layer over IP
10
UDP - Aspects of Services
• Unit of transfer is a datagram (variable length packet)
• Unreliable, drops packets silently
• No ordering guarantees
• No flow control
• 16-bit port space (distinct from TCP ports) allows multiple recipients on a single host
11
Addresses and Data
• Internet domain names: human readable
– mnemonic
– variable length
• e.g., mail.yahoo.com, www.vu.edu.pk (FQDN)
• IP addresses: easily handled by routers/computers
– fixed length
– tied (loosely) to geography
• e.g., 128.93.0.4 or 212.0.0.1
12
Endianness
• Machines on Internet have different endianness
• Little-endian (Intel, DEC): least significant byte of word stored in lowest memory address
• Big-endian (Sun, SGI, HP): most significant byte...
13
Endianness
• Network byte order is big-endian
• Use of network byte order– imperative for some data (e.g., IP addresses)
– good form for all binary data (e.g., application-specific)
– ASCII/Unicode are acceptable alternatives
14
Endianness
• 16/32 bit conversion (for platform independence)
int m, n; // int32
short int s, t; // int16
m = ntohl(n); // net-to-host long (32-bit) translation
s = ntohs(t); // net-to-host short (16-bit) translation
n = htonl(m); // host-to-net long (32-bit) translation
t = htons(s); // host-to-net short (16-bit) translation
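The conversion calls above are typically used when binary fields are written into, or read out of, a message buffer. Below is a minimal sketch; the helper names pack_fields and unpack_fields and the 6-byte layout (a 16-bit field followed by a 32-bit field) are assumptions made for the example, while htons, htonl, ntohs and ntohl are the standard calls from <arpa/inet.h>.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Write a 16-bit and a 32-bit value into buf in network (big-endian) order. */
void pack_fields(unsigned char buf[6], uint16_t port, uint32_t value)
{
    uint16_t net_port = htons(port);    /* host-to-net short */
    uint32_t net_value = htonl(value);  /* host-to-net long */

    memcpy(buf, &net_port, sizeof net_port);
    memcpy(buf + 2, &net_value, sizeof net_value);
}

/* Read the two fields back, converting them to host byte order. */
void unpack_fields(const unsigned char buf[6], uint16_t *port, uint32_t *value)
{
    uint16_t net_port;
    uint32_t net_value;

    memcpy(&net_port, buf, sizeof net_port);
    memcpy(&net_value, buf + 2, sizeof net_value);
    *port = ntohs(net_port);            /* net-to-host short */
    *value = ntohl(net_value);          /* net-to-host long */
}
```

Whatever the host's endianness, packing port 80 and the value 0x01020304 produces the bytes 00 50 01 02 03 04, so a little-endian sender and a big-endian receiver agree on what was sent.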
15
Socket Address Structures
• Socket address structures (all fields in network byte order except sin_family)
IP address
struct in_addr {
    in_addr_t s_addr; /* 32-bit IP address */
};
TCP or UDP address
struct sockaddr_in {
    short sin_family; /* e.g., AF_INET */
    ushort sin_port; /* TCP / UDP port */
    struct in_addr sin_addr; /* IP address */
};
16
Address Conversion
• All binary values used and returned by these functions are network byte ordered
struct hostent* gethostbyname (const char* hostname);
translates English host name to IP address (uses DNS)
struct hostent* gethostbyaddr (const char* addr, size_t len, int family);
translates IP address to English host name (not secure)
int gethostname (char* name, size_t namelen);
reads host’s name (use with gethostbyname to find local IP)
17
Address Conversion
in_addr_t inet_addr (const char* strptr);
translate dotted-decimal notation to IP address; returns –1 on failure, thus cannot handle broadcast value “255.255.255.255”
int inet_aton (const char* strptr, struct in_addr* inaddr);
translate dotted-decimal notation to IP address; returns 1 on success, 0 on failure
char* inet_ntoa (struct in_addr inaddr);
translate IP address to ASCII dotted-decimal notation (e.g., “128.32.36.37”); not thread-safe
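The conversion routines above are usually combined to fill in a struct sockaddr_in. The following sketch uses a hypothetical helper name, make_ipv4_addr; inet_aton, inet_ntoa and htons are the real library calls, and on glibc inet_aton is only declared when the default (non-strict) feature macros are in effect.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill an IPv4 socket address from dotted-decimal text and a host-order port.
 * Returns 0 on success, -1 if the text is not a valid address. */
int make_ipv4_addr(struct sockaddr_in *sa, const char *dotted,
                   unsigned short port)
{
    memset(sa, 0, sizeof *sa);
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);            /* port goes in network byte order */
    if (inet_aton(dotted, &sa->sin_addr) == 0)
        return -1;                         /* inet_aton returns 0 on failure */
    return 0;
}
```

Because inet_aton reports failure through its return value rather than through the address itself, it accepts “255.255.255.255”, which inet_addr cannot distinguish from an error. Calling inet_ntoa on the resulting sin_addr gives back the dotted-decimal text.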
18
Sockets API
• Basic Unix concepts
• Creation and setup
• Establishing a connection (TCP only)
• Sending and receiving data
• Tearing down a connection (TCP only)
• Advanced sockets
19
Basic UNIX Concepts – I/O
• Per-process table of I/O channels
• Table entries can describe files, sockets, devices, pipes, etc.
• Unifies I/O interface
• Table entry/index into table called “file descriptor”
20
Basic UNIX Concepts
• Error model
• “standardization” of return value
– 0 on success, -1 on failure
– NULL on failure for routines returning pointers
• errno variable
21
Client-Server Connection
[Figure: client (my-machine) and server (mail.yahoo.com)]
• Server: “I am mail.yahoo.com, port b”; “I accept connections”
• Client: “Talk to mail.yahoo.com, port b”
• Server: “I will talk to my-machine, port a”
Resulting TCP connection identified by (my-machine:a, mail.yahoo.com:b)
22
Client-Server Connection
• Server (mail.yahoo.com): 1. socket(), 2. bind() (“I am mail.yahoo.com, port b”), 3. listen() (“I accept connections”)
• Client (my-machine): 4. socket(), 5. connect() (“Talk to mail.yahoo.com, port b”)
• Server: 6. accept() (“I will talk to my-machine, port a”)
• Both: 7. send() / sendto(), 8. recv() / recvfrom(), 9. close() / shutdown()
23
Socket Creation and Setup
• int socket (int family, int type, int protocol);
Create a socket. Returns file descriptor or -1.
• int bind (int sockfd, struct sockaddr* myaddr, int addrlen);
Bind a socket to a local IP address and port number.
• int listen (int sockfd, int backlog);
Put socket into passive state (wait for connections rather than initiate a connection).
24
Creating Sockets - socket()
int socket (int family, int type, int protocol);
Create a socket. Returns file descriptor or -1. Also sets errno on failure.
family: address family (namespace) or protocol family
– AF_INET for IPv4
– other possibilities: AF_INET6 (IPv6), AF_UNIX, AF_OSI or AF_LOCAL (Unix socket), AF_ROUTE (routing)
type: style of communication
– SOCK_STREAM for TCP (with AF_INET)
– SOCK_DGRAM for UDP (with AF_INET)
protocol: protocol within family
– Usually already defined by domain & type, typically 0 (default)
25
Naming and Identifying Sockets - bind()
int bind (int sockfd, struct sockaddr* myaddr, int addrlen);
Bind a socket to a local IP address and port number. Returns 0 on success, -1 and sets errno on failure.
sockfd: socket file descriptor (returned from socket)
myaddr: includes IP address and port number
– IP address: set by kernel if value passed is INADDR_ANY, else set by caller
– port number: set by kernel if value passed is 0, else set by caller
addrlen: length of address structure = sizeof (struct sockaddr_in)
26
TCP and UDP Port Namespaces
• Allocated and assigned by the Internet Assigned Numbers Authority (IANA)
– see RFC 1700
– ftp://ftp.isi.edu/in-notes/iana/assignments/port-numbers
• 1-512 standard services (see /etc/services); super-user only
• 513-1023 registered and controlled, also used for identity verification; super-user only
• 1024-49151 registered services/ephemeral ports
• 49152-65535 private/ephemeral ports
27
Waiting for Connections - listen()
int listen (int sockfd, int backlog);
Put socket into passive state (wait for connections rather than initiate a connection). Returns 0 on success, -1 and sets errno on failure.
sockfd : socket file descriptor (returned from socket )
backlog : bound on length of un-accept()ed connection queue (connection backlog); kernel will cap, thus better to set high
28
Contact the Peer - connect()
int connect (int sockfd, struct sockaddr* servaddr, int addrlen);
Connect to another socket. Returns 0 on success, -1 and sets errno on failure.
sockfd : socket file descriptor (returned from socket )
servaddr : IP address and port number of server
addrlen : length of address structure = sizeof (struct sockaddr_in)
Can use with UDP to restrict incoming datagrams and to obtain asynchronous errors
29
Welcome a Connection - accept()
int accept (int sockfd, struct sockaddr* cliaddr, int* addrlen);
Accept a new connection (first one of the queue of pending connections). Returns file descriptor or -1. Also sets errno.
sockfd : socket file descriptor (returned from socket)
cliaddr : IP address and port number of client (returned from call)
addrlen : length of address structure = pointer to int set to sizeof (struct sockaddr_in)
  – addrlen is a value-result argument: the caller passes the size of the address structure, the kernel returns the size of the client’s address (the number of bytes written)
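The socket(), bind(), listen(), connect() and accept() calls described so far can be tied together in a minimal sketch over the loopback interface. The helper names make_listener and connect_local are hypothetical, the backlog of 8 is an arbitrary choice, and the code uses socklen_t where the slides show int, as current POSIX headers declare it.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to 127.0.0.1 and put it in
 * the listening state. Port 0 lets the kernel pick a free port; the chosen
 * port is returned through *port_out (host byte order). Returns fd or -1. */
int make_listener(unsigned short *port_out)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);              /* 0: kernel chooses the port */

    socklen_t len = sizeof addr;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 8) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}

/* Hypothetical helper: actively open a TCP connection to 127.0.0.1:port.
 * Returns the connected fd or -1. */
int connect_local(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in srv;
    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = htons(port);

    if (connect(fd, (struct sockaddr *)&srv, sizeof srv) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server would then call accept() on the listening descriptor; because the kernel queues completed connections up to the backlog, connect() can succeed before accept() is called.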
30
Sending and Receiving data
31
Send the Data - write()
int write (int sockfd, char* buf, size_t nbytes);
Write data to a stream (TCP) or “connected” datagram (UDP) socket. Returns number of bytes written or -1. Also sets errno on failure.
sockfd : socket file descriptor (returned from socket)
buf : data buffer
nbytes : number of bytes to try to write
• some reasons for failure or partial writes:
  – process received interrupt or signal
  – kernel resources unavailable (e.g., buffers)
int send (int sockfd, char* buf, size_t nbytes, int flags);
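Because write() may be interrupted or may accept only part of the buffer, callers usually wrap it in a loop. The following is a minimal sketch of that idea; the name write_all is hypothetical and not part of the sockets API.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all nbytes are written or
 * a real error occurs. Returns 0 on success, -1 on failure (errno is left as
 * set by write()). */
int write_all(int fd, const char *buf, size_t nbytes)
{
    size_t done = 0;
    while (done < nbytes) {
        ssize_t n = write(fd, buf + done, nbytes - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)n;      /* partial write: advance and loop */
    }
    return 0;
}
```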
32
Receive the Data - read()
int read (int sockfd, char* buf, size_t nbytes);
Read data from a stream (TCP) or “connected” datagram (UDP) socket. Returns number of bytes read or -1. Also sets errno on failure. Returns 0 if socket closed.
sockfd : socket file descriptor (returned from socket)
buf : data buffer
nbytes : number of bytes to try to read
int recv (int sockfd, char* buf, size_t nbytes, int flags);
33
Send Data to Someone - sendto()
int sendto (int sockfd, char* buf, size_t nbytes, int flags, struct sockaddr* destaddr, int addrlen);
Send a datagram to another UDP socket. Returns number of bytes written or -1. Also sets errno on failure.
sockfd : socket file descriptor (returned from socket)
buf : data buffer
nbytes : number of bytes to try to write
flags : see man page for details; typically use 0
destaddr : IP address and port number of destination socket
addrlen : length of address structure = sizeof (struct sockaddr_in)
34
Receive Data from Someone - recvfrom()
int recvfrom (int sockfd, char* buf, size_t nbytes, int flags, struct sockaddr* srcaddr, int* addrlen);
Read a datagram from a UDP socket. Returns number of bytes read (0 is valid) or -1. Also sets errno on failure.
sockfd : socket file descriptor (returned from socket)
buf : data buffer
nbytes : number of bytes to try to read
flags : see man page for details; typically use 0
srcaddr : IP address and port number of sending socket (returned from call)
addrlen : length of address structure = pointer to int set to sizeof (struct sockaddr_in)
35
Tearing Down a Connection
36
Good Bye - close()
int close (int sockfd);
Closes a socket and deletes descriptor from system tables. Returns 0 on success, -1 and sets errno on failure.
sockfd : socket file descriptor (returned from socket )
• Closes communication on socket in both directions. All data sent before close are delivered to other side (although this aspect can be overridden).
• After close() , sockfd is not valid for reading or writing.
37
Close in My Way - shutdown()
int shutdown (int sockfd, int howto);
Force termination of communication across a socket in one or both directions. Returns 0 on success, -1 and sets errno on failure.
sockfd : socket file descriptor (returned from socket)
howto :
  – SHUT_RD to stop reading
  – SHUT_WR to stop writing
  – SHUT_RDWR to stop both
• shutdown() overrides the usual rules regarding duplicated sockets, in which TCP teardown does not occur until all copies have closed the socket.
38
Advanced Sockets
• Managing multiple connections
  – fork()/exec(): multiple server processes
  – pthread_create(): multi-threaded server process
  – (no calls): event-based server process
• Detecting data arrival
  – select() and poll() functions
• Synchronous vs. asynchronous connections
• Other socket options
39
Example of Use
• Taken from Beej’s Guide to Network Programming (see the course web page)
• Client-server example using TCP
• For each client
  – Server forks new process to handle connection
  – Sends “Hello, world”
40
TCP Connection
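[Figure: TCP connection establishment between client and server. Server calls socket, bind, listen, accept; client calls socket, connect. The client sends SYN J and the connection is added to the server's incomplete queue; the server answers with SYN K, ACK J+1 and connect completes at the client; the client sends ACK K+1 and the connection is moved to the complete queue.]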
41
TCP Connection
[Figure: call sequence over a TCP connection. Server: socket, bind, listen, accept, read, write, close. Client: socket, connect, write, read, close.]
42
UDP Connection
[Figure: call sequence for a UDP exchange. Server: socket, bind, recvfrom, sendto. Client: socket, sendto, recvfrom, close.]
43
Food for Thought
Framing messages on a byte stream … ?
• Problem
  – pass logical messages using a TCP connection
  – read() may return partial or multiple messages
  – how can receiver identify the end of a message?
• Try to come up with two or three methods
• Hints
  – string storage in C and Pascal
  – format strings with printf()
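One possible answer, following the Pascal-string hint, is to prefix every message with its length. The sketch below is only one of several workable methods; the 2-byte big-endian header and the names frame_encode and frame_decode are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical encoder: write a 2-byte big-endian length followed by the
 * payload. Returns the number of bytes written to out, or 0 if the message
 * is too long for the header or out is too small. */
size_t frame_encode(const unsigned char *msg, size_t len,
                    unsigned char *out, size_t out_cap)
{
    if (len > 0xFFFF || out_cap < len + 2)
        return 0;
    out[0] = (unsigned char)(len >> 8);
    out[1] = (unsigned char)(len & 0xFF);
    memcpy(out + 2, msg, len);
    return len + 2;
}

/* Hypothetical decoder: try to extract one complete message from the bytes
 * received so far. Returns the number of stream bytes consumed (header plus
 * payload), or 0 if the buffer does not yet hold a complete message. */
size_t frame_decode(const unsigned char *stream, size_t avail,
                    const unsigned char **msg, size_t *len)
{
    if (avail < 2)
        return 0;
    size_t n = ((size_t)stream[0] << 8) | stream[1];
    if (avail < 2 + n)
        return 0;
    *msg = stream + 2;
    *len = n;
    return 2 + n;
}
```

A receiver would append whatever read() returns to a buffer and call frame_decode repeatedly, which handles both partial messages (return value 0) and several messages arriving in one read().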
1
Lecture No. 5
3
The Big Picture
[Figure: course roadmap with a “You are here” marker and the estimated position of the midterm exam]
4
What We Know
• Networks are
  – Experiencing explosive growth
  – Providing wide range of services
• It is attributed to:
  – General purpose nature of computer networks
  – Ability to add new functionality with software
  – High performance computers are now affordable
5
and We Know …
• Connecting mainframes over long-distance telephone lines has turned into a big business!
• Lots of competing players
  – Computing industry
  – Telephone carriers
  – Service providers, operators, …
• Global, ubiquitous, heterogeneous networking ?
  – Issues of connectivity, service levels, performance, …
6
What We Have Learned
• Carefully identify what we expect from a network
• Cost-effective connectivity
  – Accomplished through nested interconnection of nodes and links
  – Provides process-to-process communication services
  – Should offer high performance, measured with metrics like latency and throughput
• This results in a packet-switched network
7
What is Our Approach
• A layered architecture as a guideline for design
• Protocols are central objects
  – Provide services to higher-level protocols
  – Make a message exchange meaningful with peers
• Implement protocols in software
  – Define interfaces to invoke services
  – Socket interface between applications and protocols
  – “Similar” interface within the network subsystem
8
What Next ?
Start with the simplest possible network
Two nodes connected directly through some suitable medium
9
Point-to-Point Links
Reading: Peterson and Davie, Ch. 2
Outline
• Hardware building blocks
• Encoding
• Framing
• Error Detection
• Reliable transmission
  – Sliding Window Algorithm
10
Direct Link Issues in the OSI and Hardware/Software Contexts
[Figure: the OSI layers (application, presentation, session, transport, network, data link, physical) mapped onto user-level software, kernel software (device drivers) and hardware (network adapter). Direct link issues shown beside the layers: reliability; framing, error detection, MAC; encoding.]
11
Hardware Building Blocks
• Nodes
  – Hosts: general-purpose computers
  – Switches: typically special-purpose hardware
  – Routers (connecting networks): varies
• Links
  – Copper wire with electronic signaling
  – Glass fiber with optical signaling
  – Wireless with electromagnetic (radio, infrared, microwave) signaling
12
Nodes – A Workstation Architecture
[Figure: workstation architecture. CPU (processor) with cache, memory on the memory bus, and a network adaptor on the I/O bus leading to the network. Annotations: finite memory (implies limited buffer space); device driver managing the network adaptor, which uses the system’s I/O bus; memory access much slower than CPU speed.]
13
Links
• Physical media
  – twisted pair cable
  – coaxial cable
  – optical fiber
  – space
• Media are used to propagate signals
• Signals are electromagnetic waves of a certain frequency, traveling at the speed of light
14
Electromagnetic Spectrum
[Figure: electromagnetic spectrum along a frequency axis f (Hz) from 10^0 to 10^24, showing the radio, microwave, infrared, UV, X ray and gamma ray bands and the ranges used by coax, AM, FM, TV, terrestrial microwave, satellite and fiber optics.]
Wavelength = speed / frequency = (2 x 10^8 m/s) / (300 Hz) ≈ 667 x 10^3 meters (about 667 km)
15
Signals Over a Link
• Signal is modulated for transmission
  – varying frequency/amplitude/phase to receive distinguishable signals
• Binary data (0s and 1s) is encoded in a signal
  – make it understandable by the receiving host
16
Bits Over a Link
• Bit streams may be transmitted both ways at a time on a point-to-point link
  – full-duplex
• Sometimes two nodes must alternate link usage
  – half-duplex
17
Which Link to Use ?
• Cables
  – same room / building / site

Cable               Typical Bandwidths   Distances
Cat-5 twisted pair  10-100 Mbps          100 m
Thin-net coax       10-100 Mbps          200 m
Thick-net coax      10-100 Mbps          500 m
Multimode fiber     100 Mbps             2 km
Single-mode fiber   100-2400 Mbps        40 km

[Figure: cable cross-sections. Coax: copper core, insulation, braided conductor. Twisted pair. Fiber: glass core, glass cladding, plastic jacket.]
18
Leased Lines
• Across city / country
• Dedicated link from the telephone company
• Appears to be, but may not be, a single link

Service:          DS1/T1  DS3    STS-1  STS-3  STS-12  ...  STS-48
Bandwidth (bps):  1.5M    44.7M  51.8M  155M   622M    ...  2.5G
19
Last-mile Links
• Most economical
• Home to network service provider
• To take benefit of an existing network
Service:          POTS         ISDN        xDSL           CATV
Bandwidth (bps):  28.8 - 56 K  64 - 128 K  16 K - 55.2 M  20 - 40 M
20
ADSL (Asymmetric Digital Subscriber Line)
• Connects the subscriber to the central office via the local loop
• Bandwidth depends on length of local loop
[Figure: central office and subscriber premises joined by a local loop of 2.74 – 5.48 km, carrying 1.554 – 8.448 Mbps downstream and 16 – 640 Kbps upstream]
21
VDSL (Very high data rate DSL)
• Connects the subscriber to the optical network that reaches the neighborhood
• Runs over short distances
• Symmetric
[Figure: central office connected by STS-N over fiber to a neighborhood optical network unit, which reaches the subscriber premises with VDSL at 12.96 – 55.2 Mbps over 1000 – 4500 feet of copper]
22
CATV
• Uses existing cable TV (CATV) infrastructure
  – reaches 95% of households in U.S.
• Single CATV channel has bandwidth of 6 MHz
• Can be used in asymmetric way
• Currently achieves on a single channel:
  – 40 Mbps downstream (100 Mbps theoretical capacity)
  – 20 Mbps upstream
• Multiple access on shared channel (IEEE 802.14)
23
Optical Communication
• Higher bandwidths
• Superior attenuation properties
• Immune from electromagnetic interference
• No cross-talk between fibers
• Thin, lightweight and cheap (the fiber, not the optical-electrical interfaces)
24
Wireless Links
• Satellite links
• Provide a grid of medium and low orbit satellites
  – Geosynchronous satellite 600-1000 Mbps
  – Low Earth Orbit (LEO) array ~400 Mbps
• Targeted at voice communication modems
• Teledesic supports 1440 16 kbps satellite-to-earth channels (~2 Mbps); 155.5 Mbps intersatellite channels
25
Wireless Links
• Radio and infra-red frequency links
• 11 Mbps rates, 2.4 GHz band, distances of 50-150 meters
  – 5.2 GHz band, > 55 Mbps: HIPERLAN-1, IEEE 802.11a
• Bluetooth piconets (radio) and infrared links: 1 Mbps, 10 meters
26
Encoding
27
Point-to-Point Links
• Reading: Peterson and Davie, Ch. 2
• Hardware building blocks
• Encoding
• Framing
• Error Detection
• Reliable transmission
  – Sliding Window Algorithm
28
Encoding
• Signals propagate over a physical medium
  – modulate electromagnetic waves
  – e.g., vary voltage
• Encode binary data onto signals that propagate
[Figure: two nodes, each with an adaptor containing a signalling component; bits pass between node and adaptor, the signal travels between the adaptors]
29
Encoding
• Problems with signal transmission
  – Attenuation: signal power absorbed by medium
  – Dispersion: a discrete signal spreads in space
  – Noise: random background “signals”
[Figure: digital data (a string of symbols) enters a modulator, travels as a string of signals, and a demodulator recovers digital data (a string of symbols)]
30
Advantages of Digital Transmission over Analog
• Reasonably low error rates over arbitrary distances
  – Calculate/measure effects of transmission problems
  – Periodically interpret and regenerate signal
• Simpler for multiplexing distinct data types (audio, video, e-mail, etc.)
31
Advantages of Digital Transmission over Analog
• Examples of modulators-demodulators (modems)
• Electronic Industries Association (EIA) standard RS-232(-C)
• International Telecommunications Union (ITU) standard V.32 9.6 kbps modem
32
RS-232(-C)
• Communication between computer and modem
• Uses two voltage levels (+15V, -15V), a binary voltage encoding
• Data rate limited to 19.2 kbps (RS-232-C); raised in later standards
33
RS-232(-C)
• Characteristics
• Serial: one signaling wire, one bit at a time
• Asynchronous: line can be idle, clock generated from data
• Character-based: send data in 7- or 8-bit characters
34
RS-232 Timing Diagram
[Figure: RS-232 timing diagram, voltage (+15 / -15) versus time: idle, start, 1 0 0 1 1 0 0, stop, idle]
35
RS-232
• One bit per clock
• Voltage never returns to 0V (0V is a dead / disconnected line)
• -15V is both idle and “1”; the sender initiates a character by pushing to +15V for one clock (start bit)
36
RS-232
• Minimum delay between character transmissions: idle for one clock at –15V (stop bit)
• One character leads to 2+ voltage transitions
• Total of 9 bits for 7 bits of data (78% efficient)
• Start and stop bits also provide framing
37
Binary Voltage Encoding
• NRZ (non-return to zero)
• NRZI (NRZ inverted)
• Manchester (used by IEEE 802.3, 10 Mbps Ethernet)
• 4B/5B (8B/10B) in Fast Ethernet
38
Non-Return to Zero (NRZ)
• Encode binary data onto signals
  – e.g., 0 as low signal and 1 as high signal
  – voltage does not return to zero between bits
• known as Non-Return to Zero (NRZ)
[Figure: NRZ waveform for the bits 0 0 1 0 1 1 1 1 0 1 0 0 0 0 1 0]
39
Problem: Consecutive 1s or 0s
• Low signal (0) may be interpreted as no signal
• High signal (1) leads to baseline wander
• Unable to recover clock
  – sender’s and receiver’s clocks have to be precisely synchronized
  – receiver resynchronizes on each signal transition
  – clock drift in long periods without transition
[Figure: sender’s clock compared with receiver’s clock]
40
Alternative Encodings
• Non-Return to Zero Inverted (NRZI)
• Make a transition from current signal (switch voltage level) to encode/transmit a “one”
• Stay at current signal (maintain voltage level) to encode/transmit a “zero”
• Solves the problem of consecutive ones (shifts to 0s)
41
Alternative Encodings
• Manchester (in IEEE 802.3 – 10 Mbps Ethernet)
• Split cycle into two parts
  – Send high--low for “1”, low--high for “0”
  – Transmit XOR of NRZ encoded data and the clock
• Only 50% efficient (1/2 bit per transition)
42
Different Encoding Schemes
[Figure: waveforms for the bits 0 0 1 0 1 1 1 1 0 1 0 0 0 0 1 0 under NRZ, the clock, Manchester and NRZI]
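The three encodings compared on this slide can be sketched as small functions that map a bit array to signal levels (0 = low, 1 = high). The function names and the array representation are assumptions for illustration, and the NRZI sketch additionally assumes that the line starts at the low level.

```c
#include <stddef.h>

/* NRZ: the signal level is simply the bit value. */
void nrz_encode(const int *bits, size_t n, int *levels)
{
    for (size_t i = 0; i < n; i++)
        levels[i] = bits[i];
}

/* NRZI: toggle the level to send a 1, hold it to send a 0.
 * Assumption of this sketch: the line starts low. */
void nrzi_encode(const int *bits, size_t n, int *levels)
{
    int level = 0;
    for (size_t i = 0; i < n; i++) {
        if (bits[i])
            level = !level;
        levels[i] = level;
    }
}

/* Manchester: each bit becomes two half-bit levels, the XOR of the NRZ level
 * with a clock that is low in the first half and high in the second half.
 * Result: 0 -> low,high and 1 -> high,low. halves must hold 2 * n entries. */
void manchester_encode(const int *bits, size_t n, int *halves)
{
    for (size_t i = 0; i < n; i++) {
        halves[2 * i]     = bits[i] ^ 0;  /* first half: clock low   */
        halves[2 * i + 1] = bits[i] ^ 1;  /* second half: clock high */
    }
}
```

Manchester needs two signal elements per bit, which is where the 50% efficiency figure comes from.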
43
4B/5B Encoding
• Every 4 consecutive bits of data encoded in a 5-bit code (symbol)
  – 4-bit pattern is “translated” to a 5-bit pattern (not addition)
• 5-bit codes selected to have no more than one leading 0 and no more than two trailing 0s
  – 00xxx (8 symbols) and xx000 (4 symbols) are illegal
  – 5 free symbols (non-data)
• Thus, never gets more than three consecutive 0s
• Resulting 5-bit codes are transmitted using NRZI
• Achieves 80% efficiency
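The symbol count can be checked by enumeration. The patterns 00xxx and xx000 share the code 00000, so 11 of the 32 codes are illegal and 21 remain: 16 for data plus the 5 free symbols. A minimal sketch, with hypothetical function names:

```c
/* A 5-bit code is usable if it has at most one leading 0 and at most two
 * trailing 0s, i.e. it is neither 00xxx nor xx000. */
int is_legal_5b(unsigned code)
{
    int two_leading_zeros    = (code & 0x18u) == 0;  /* 00xxx */
    int three_trailing_zeros = (code & 0x07u) == 0;  /* xx000 */
    return !two_leading_zeros && !three_trailing_zeros;
}

/* Count the usable codes among all 32 five-bit patterns. */
int count_legal_5b(void)
{
    int count = 0;
    for (unsigned code = 0; code < 32; code++)
        count += is_legal_5b(code);
    return count;
}
```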
44
Binary Voltage Encoding
• Problem: wide frequency range required, implying
  – Significant dispersion
  – Uneven attenuation
• Prefer to use narrow frequency band (carrier frequency)
• Types of modulation
  – Amplitude (AM)
  – Frequency (FM)
  – Phase / phase shift
  – Combination of these (e.g. QAM)
45
Amplitude Modulation
[Figure: amplitude-modulated carrier versus time: idle, idle, 1, idle, idle, 0, idle, idle]
46
Frequency Modulation
[Figure: frequency-modulated carrier versus time: idle, idle, 1, idle, idle, 0, idle]
47
Phase Modulation
[Figure: phase-modulated carrier versus time: idle, idle, 1, idle, idle, 0, idle, idle]
48
Phase Shift in Carrier Frequency
[Figure: carrier waves with a 108 degree difference in phase; collapse for a 108 degree shift]
49
Review Lecture 5
• Simplest possible network – 2 nodes connected directly
• Building blocks – nodes and links
• Nodes – workstation architecture
• Links – several types, optical, wireless
• Encoding – binary data into signals, RS-232
• Binary voltage encoding – NRZ, NRZI, Manchester, 4B/5B
• Modulation schemes
1
Lecture No. 6
3
ITU’s V.32 - 9.6 kbps
• Communication between modems
• Analog phone line
• Uses a combination of amplitude and phase modulation
  – known as Quadrature Amplitude Modulation (QAM)
• Sends one of 16 signals each clock cycle
  – transmits at 2400 baud, i.e., 2,400 symbols per second
4
Constellation Pattern for V.32 QAM
For a given symbol:
1. perform phase shift
2. change to new amplitude
[Figure: constellation diagram with phase angles of 45° and 15° marked]
• Points in constellation diagram
  – chosen to maximize error detection
  – process called trellis coding
Quadrature Amplitude Modulation
• Same algorithm as phase modulation
• Can also change signal amplitude
• 2-dimensional representation
  – angle is phase shift
  – radial distance is new amplitude
• Each symbol contains log2(16) = 4 bits
  – data rate is thus 4 x 2400 = 9600 bps
[Figure: 16-symbol example (V.32) with phase angles of 45° and 15° marked]
6
Generalizing the Examples
• What limits baud rate?
• What data rate can a channel sustain?
• How is data rate related to bandwidth?
• How does noise affect these bounds?
• What else can limit maximum data rate?
7
Bit Rate and Baud Rate
• Bit rate is bits per second
• Baud rate is “symbols” per second
• If each symbol contains 4 bits then data rate is 4 times the baud rate
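The relation between the two rates can be written as a short calculation. This sketch assumes the number of distinct symbols is a power of two; the function names are hypothetical.

```c
/* Bits carried by one symbol when there are `symbols` distinct symbols.
 * Assumes `symbols` is a power of two (integer log2). */
unsigned bits_per_symbol(unsigned symbols)
{
    unsigned bits = 0;
    while (symbols > 1) {
        symbols >>= 1;
        bits++;
    }
    return bits;
}

/* Bit rate = baud rate (symbols per second) x bits per symbol. */
unsigned long bit_rate(unsigned long baud, unsigned symbols)
{
    return baud * bits_per_symbol(symbols);
}
```

For the V.32 figures quoted earlier (2400 baud, 16 symbols), bit_rate(2400, 16) gives 9600 bps.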
8
What Limits Baud Rate ?
• Baud rates are typically limited by electrical signaling properties
• No matter how small the voltage or how short the wire, changing voltages takes time
• Electronics are slow as compared to optics
9
What data rate can a channel sustain ?
How is data rate related to bandwidth ?
• Transmitting N distinct signals over a noiseless channel with bandwidth B, max. data rate can be 2B log2(N)
• This observation is a form of Nyquist’s Sampling Theorem
  – We can reconstruct any waveform with no frequency component above some frequency “F” using only samples taken at frequency 2F
10
What else (besides noise) can limit maximum data rate ?
• Transitions between symbols introduce high frequency components into the transmitted signal
• Such components cannot be recovered (by Nyquist’s Theorem), and some information is lost
• Examples:
  – Phase modulation uses only a single frequency (with different phases) for each symbol, but the transitions can require very high frequencies
  – Binary voltage encodings (0 Hz within symbols)
  – Eye diagrams show voltage traces for all transitions
11
How does Noise Affect these Bounds ?
• In-band (not high-frequency) noise blurs the symbols, reducing the number of symbols that can be reliably distinguished
• Shannon extended Nyquist’s work to channels with additive white Gaussian noise (a good model for thermal noise)
• From Shannon’s Theorem:
Max. channel capacity C = B log2(1 + S/N)
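As a worked illustration of Shannon's formula, take a voice-grade line with assumed values B = 3000 Hz and a signal-to-noise ratio of 30 dB. These numbers are illustrative and not taken from the slides. A ratio given in dB converts as 10 log10(S/N) = 30, so S/N = 1000.

```latex
\begin{align*}
C &= B \log_2\!\left(1 + \frac{S}{N}\right) \\
  &= 3000 \times \log_2(1 + 1000) \\
  &\approx 3000 \times 9.97 \\
  &\approx 29.9\ \text{kbps}
\end{align*}
```

Under these assumptions, no encoding can push more than roughly 30 kbps through the channel, however many signal levels it uses.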
12
Summary of Encoding
• Problems: attenuation, dispersion, noise
• Digital transmission allows periodic regeneration
• Variety of binary voltage encodings
  – High frequency components limit to short range
  – More voltage levels provide higher data rate
• Carrier frequency and modulation
  – Amplitude, frequency, phase, and combination (QAM)
• Nyquist (noiseless) and Shannon (noisy) limits on data rates
13
Framing
14
Point-to-Point Links
• Reading: Peterson and Davie, Ch. 2
• Hardware building blocks
• Encoding
• Framing
• Error Detection
• Reliable transmission
  – Sliding Window Algorithm
15
Framing
• Breaks continuous stream/sequence of bits into frames and demarcates units of transfer
• Typically implemented by network adaptor
  – Adaptor fetches/deposits frames out of/into host memory
[Figure: Node A and Node B, each with an adaptor; frames pass between node and adaptor, bits travel between the adaptors]
16
Advantages of Framing
• Synchronization recovery
  – consider continuous stream of unframed bytes
  – recall RS-232 start and stop bits
• Multiplexing of link
  – multiple hosts on shared medium
  – simplifies multiplexing of logical channels
• Efficient error detection
  – frame serves as unit of detection (valid or invalid)
  – error detection overhead scales as log N
17
Problem … ?
Recognizing exactly the boundaries of a frame
Must determine the first and last bit of a frame
18
Approaches
• Organized by end of frame detection method
• Approaches to framing
  – sentinel (marker, like C strings)
  – length-based (like Pascal strings)
  – clock-based
19
Approaches
• Other aspects of a particular approach
  – bit- or byte-oriented
  – fixed- or variable-length
  – data-dependent or data-independent length
20
Framing with Sentinels
• End of frame: special byte or bit pattern
• Choice of end of frame marker
  – valid data byte or bit sequence e.g. 01111110
  – physical signal not used by valid data symbol

Field:  Beginning sequence  Header  Body      CRC  Ending sequence
Bits:   8                   16      variable  16   8
21
Sentinel Based Approach
• Problem: special pattern appears in the payload
• Solution: bit stuffing
  – sender: insert 0 after five consecutive 1s
  – receiver: delete 0 that follows five consecutive 1s
[Figure: Node A sends a bit stream containing five consecutive 1s to Node B; a 0 is stuffed after the run on the link and deleted again at the receiver]
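The sender and receiver rules for bit stuffing can be sketched as follows. Bits are held one per int for readability, which is an assumption of this sketch rather than how an adaptor would store them, and the function names are hypothetical.

```c
#include <stddef.h>

/* Sender: copy bits to out, inserting a 0 after every run of five 1s.
 * out must have room for n + n/5 entries. Returns the stuffed length. */
size_t bit_stuff(const int *bits, size_t n, int *out)
{
    size_t m = 0;
    int ones = 0;
    for (size_t i = 0; i < n; i++) {
        out[m++] = bits[i];
        ones = bits[i] ? ones + 1 : 0;
        if (ones == 5) {
            out[m++] = 0;   /* stuffed bit */
            ones = 0;
        }
    }
    return m;
}

/* Receiver: delete the 0 that follows five consecutive 1s.
 * Assumes the input is a correctly stuffed frame body (no flag, no error). */
size_t bit_unstuff(const int *bits, size_t n, int *out)
{
    size_t m = 0;
    int ones = 0;
    for (size_t i = 0; i < n; i++) {
        if (ones == 5) {    /* this bit is the stuffed 0: skip it */
            ones = 0;
            continue;
        }
        out[m++] = bits[i];
        ones = bits[i] ? ones + 1 : 0;
    }
    return m;
}
```

With stuffing in place, the pattern 01111110 can never appear inside the payload, so it is safe to use as the frame marker.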
22
Sentinel Based Approach
• Problem: equal size frames are not possible
  – frame length is data-dependent
• Sentinel based framing examples
  – High-Level Data Link Control (HDLC) protocol
  – Point-to-Point Protocol (PPP)
  – ARPANET IMP-IMP protocol
  – IEEE 802.4 (token bus)
23
Sentinels: HDLC
• Developed by IBM, standardized by ISO
• Bit-oriented, variable-length, data-dependent
• Special bit pattern 01111110 marks end of frame
• Insert 0 after pattern 011111 in data (bit stuffing)
24
Sentinels: HDLC
• At receiver end, if the bit sequence received is:
  – 0111110
    • bit stuffed, therefore receive only 011111
    • error in end of frame marker, lose two frames
  – 01111110: end of frame
  – 01111111: error, lose one or two frames
25
Sentinels: PPP
• Byte-oriented, variable-length, data-dependent
• Special flag 01111110 for start-of-text
  – address and control fields use default values (FF / 03)
  – protocol field used for demultiplexing (IP, LCP, …)
  – LCP (Link Control Protocol) sends control messages
    • establishes link between two peers
    • negotiates payload and checksum size
• Insert 0 after pattern 011111 in data (bit stuffing)

Frame format: flag | address | control | protocol | payload | checksum | flag
26
Sentinels: ARPANET IMP-IMP
• Interface Message Processors (IMPs): packet-switching nodes in the original ARPANET
• Byte-oriented, variable-length, data-dependent
• Special bytes (aid to understand frame format below)
  – DLE: data link escape
  – STX/ETX: start and end of transmission
• DLE byte in data sent as two DLEs (byte stuffing)

Frame format: SYN | SYN | DLE | STX | header | body | DLE | ETX
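Byte stuffing with DLE can be sketched in the same way as bit stuffing. The value 0x10 is the ASCII code for DLE; the function names are hypothetical, and the sketch covers only the body, not the SYN/STX/ETX framing bytes.

```c
#include <stddef.h>

#define DLE 0x10  /* ASCII data link escape */

/* Sender: copy data to out, doubling every DLE byte so that a DLE in the
 * body cannot be mistaken for the start of the DLE ETX trailer.
 * out must have room for 2 * n bytes. Returns the stuffed length. */
size_t byte_stuff(const unsigned char *data, size_t n, unsigned char *out)
{
    size_t m = 0;
    for (size_t i = 0; i < n; i++) {
        out[m++] = data[i];
        if (data[i] == DLE)
            out[m++] = DLE;
    }
    return m;
}

/* Receiver: collapse each DLE DLE pair back into a single DLE.
 * Assumes the input is a stuffed body with the framing bytes removed. */
size_t byte_unstuff(const unsigned char *data, size_t n, unsigned char *out)
{
    size_t m = 0;
    for (size_t i = 0; i < n; i++) {
        out[m++] = data[i];
        if (data[i] == DLE && i + 1 < n && data[i + 1] == DLE)
            i++;  /* skip the duplicate */
    }
    return m;
}
```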
27
Sentinels: IEEE 802.4 (token bus)
• Alternative to Ethernet (802.3) with fairer arbitration
• End of frame marked by encoding violation, i.e., physical signal not used by valid data symbol
• Recall Manchester encoding
  – low-high means “0”, high-low means “1”, low-low and high-high are invalid
• Byte-oriented, variable-length, data-independent
• Technique also applicable to bit-oriented framing
• Another example: Fiber Distributed Data Interface (FDDI) uses 4B/5B
28
Length-based Framing
• Include payload length in header
• e.g., DDCMP (byte-oriented, variable-length)
• e.g., RS-232 (bit-oriented, implicit fixed length)
• Problem: count field corrupted
• Solution: catch when CRC fails

Field:  SYN  SYN  Class  Length  Header  Body      CRC
Bits:   8    8    8      14      42      variable  16
29
Clock-based Framing
• Continuous stream of fixed-length frames
  – each frame is 125µs long (all STS formats) (why?)
• Clocks must remain synchronized
• e.g., SONET: Synchronous Optical Network
  – dominant standard for long distance transmission
  – multiplexing of low-speed links onto one high-speed link
  – byte-interleaved multiplexing
  – payload bytes are scrambled (data XOR 127-bit pattern)
  – STS-n (STS-1 = 51.84 Mbps)
30
SONET Frame Format (STS-1)
[Figure: STS-1 frame laid out as 9 rows by 90 columns of bytes, divided into overhead and payload]
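The STS-1 rate follows directly from the frame geometry: 9 rows x 90 columns of bytes sent every 125 µs, which is 8000 frames per second (the same rate at which digital telephony samples voice). A minimal sketch of the arithmetic, with a hypothetical function name:

```c
/* Bit rate of an STS-n signal from the frame geometry: 9 rows x 90 columns
 * of bytes per STS-1 frame, one frame every 125 us (8000 frames per second),
 * with n STS-1 frames byte-interleaved. */
unsigned long sts_rate_bps(unsigned n)
{
    const unsigned long bytes_per_frame = 9UL * 90UL;
    const unsigned long frames_per_sec  = 8000UL;   /* 1 / 125 us */
    return n * bytes_per_frame * 8UL * frames_per_sec;
}
```

sts_rate_bps(1) gives 51,840,000 bps, the 51.84 Mbps quoted for STS-1, and sts_rate_bps(3) gives 155,520,000 bps, matching the 155M entry in the leased-line table.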
31
Clock-based Framing
• Problem: how to recover frame synchronization
  – 2-byte synchronization pattern starts each frame (unlikely to occur in data)
  – wait until pattern appears in same place repeatedly
32
Clock-based Framing
• Problem: how to maintain clock synchronization
  – NRZ encoding, data scrambled (XOR’d) with 127-bit pattern
  – creates transitions
  – also reduces chance of finding false sync. pattern
33
SONET Frame Merging
• STS-1 merged bytewise round-robin into STS-3
  – unmerged (single-source) format called STS-3c
[Figure: three STS-1 frames, each with its own header (Hdr), merged into one frame with an STS-3c header]
34
SONET Frame Merging
• Problem: simultaneous synchronization of many distributed clocks
  – not too difficult to synchronize clocks such that first byte of all incoming flows arrives just before sending first 3 bytes of outgoing flow (buffering ? delays ?)
35
Clock-based Framing
• Problem: simultaneous synchronization of many distributed clocks
• Solution: payload frame floats within clock frame, part of overhead specifies first byte of payload
[Figure: payload frames (87 columns by 9 rows) floating across the boundary between clock Frame 0 and Frame 1]
36
Error Detection
37
Point-to-Point Links
• Reading: Peterson and Davie, Ch. 2
• Hardware building blocks
• Encoding
• Framing
• Error Detection
• Reliable transmission
  – Sliding Window Algorithm
38
Error Detection
• Why do we need it ?
  – To avoid retransmission of whole packet or message
• What to do if error detected ?
  – Discard, and request a new copy of the frame:
    • explicitly or implicitly
  – Try to correct error, if possible
39
Error Detection
• Validates correctness of each frame
• Errors checked at many levels
• Demodulation of signals into symbols (analog)
• Bit error detection/correction (digital): our main focus
  – Within network adapter (CRC check)
  – Within IP layer (IP checksum)
  – Possibly within application as well
40
Error Detection
• Analog errors
  – Example of signal distortion
  – Discuss to illustrate input to digital level
• Hamming distance
  – Parity and voting
  – Concept and usefulness
  – Hamming codes
• Error bits or error bursts
• Digital error detection techniques: two-dimensional parity, checksum, CRC
41
Analog Errors – Signal Distortion
• Consider RS-232 encoding of character ‘Q’
• Assume idle wire (-15V) before and after signal
• Calculate frequency distribution of signal A(f) using a Fourier transform
• Apply low-pass filter (drop high frequency components)
• Calculate signal using inverse Fourier transform
42
RS-232 Encoding of “Q”
[Figure: voltage (+15 / -15) versus time: idle, start, 1 1 0 0 0 0 1, stop, idle]
43
Frequency Distribution of ‘Q’ Encoding
[Figure: spectrum of the ‘Q’ signal; vertical axis voltage-time (0 to 5), horizontal axis frequency in multiples of baud rate (0 to 5)]
44
Limited-Frequency Signal Response (bandwidth = baud rate)
[Figure: filtered signal, voltage (+15 / -15) versus time: idle, start, 1 1 0 0 0 0 1, stop, idle]
45
Limited-Frequency Signal Response (bandwidth = baud rate/2)
[Figure: filtered signal, voltage (+15 / -15) versus time: idle, start, 1 1 0 0 0 0 1, stop, idle]
46
Review Lecture 6
• Bit rate and baud rate
• Nyquist and Shannon theorem
• Framing: demarcates units of transfer
• Advantages, problem: boundary
• End of frame detection approaches
• Sentinel, length-based, clock-based, bit stuffing
• Error detection: avoid retransmission, discard
1
Lecture No. 7
3
Error Detection and Correction
• Possible binary voltage encoding symbol
• Neighborhoods and erasure region
[Figure: voltage axis from -15 to +15 with neighborhoods decoded as 0 and 1 and an erasure region (?) between them]
• Possible QAM symbol
• Neighborhoods in green
• All other space results in erasure
Input to digital level: valid symbols or erasures
4
Error Detection: How ?
• How to detect error ?
  – Add redundant information to a frame to determine errors
• Transmit two complete copies of data
  – n redundant bits for n-bit message
  – Errors at the same position in both copies go undetected
5
Error Detection: How ?
• We want only k redundant bits for an n-bit message, where k << n
  – In Ethernet, 32-bit CRC for 12,000 bits (1500 bytes)
• k bits are derived from the original message
• Both the sender and receiver know the algorithm
6
1-bit Error Detection with Parity
• Every code has even number of 1’s
[Figure: cube whose corners are the 3-bit codes 000 to 111 and square whose corners are the 2-bit codes 00, 01, 10, 11, with the valid codes marked]
Parity encoding: gray dots are invalid and indicate errors
If only 1 bit flips, it can be detected
7
1-bit Error Correction with Voting
• Every code is copied three times
[Figure: cube whose corners are the 3-bit codes 000 to 111, with the valid codes marked]
Gray dots correct to “1”
Blue dots correct to “0”
- If only 1 bit flips, it can be corrected
- Even with 2 erasures, bit can be recovered
8
2-bit Erasure Correction With Voting
• Every code is copied three times
Remaining bit in a 2-erasure plane is not ambiguous
[Figure: cube whose corners are the 3-bit codes 000 to 111, with the 2-erasure patterns 0??, ?0? and ??0 marked]
Cannot correct 1-error and 1-erasure
9
Hamming Distance (1950 Paper)
• Minimum number of bit flips between code words
  – 2 flips for parity
  – 3 flips for voting
• n-bit error detection
  – No code word changed into another code word
  – Requires Hamming distance of n+1
10
Hamming Distance (1950 Paper)
• n-bit error correction
  – n-bit neighborhood: all code words within n bit flips
  – No overlap between n-bit neighborhoods
  – Requires Hamming distance of 2n+1
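The distances quoted for parity and voting can be checked by computing the minimum pairwise distance over each code's valid words. The function names below are hypothetical.

```c
#include <stddef.h>

/* Number of bit positions in which two code words differ. */
unsigned hamming(unsigned a, unsigned b)
{
    unsigned x = a ^ b, d = 0;
    while (x) {
        d += x & 1u;
        x >>= 1;
    }
    return d;
}

/* Minimum pairwise distance over a set of code words (needs n >= 2). */
unsigned code_distance(const unsigned *words, size_t n)
{
    unsigned best = ~0u;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++) {
            unsigned d = hamming(words[i], words[j]);
            if (d < best)
                best = d;
        }
    return best;
}
```

For the 3-bit even-parity code {000, 011, 101, 110} the distance is 2, enough to detect 1-bit errors (n+1 = 2). For the voting code {000, 111} it is 3, enough to correct 1-bit errors (2n+1 = 3).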
11
Hamming Codes (1950 Paper)
• Construction for 1-bit error-correcting codes
• Minimal number of check bits required
• Construction
  – Number bits from 1 upward
  – Powers of 2 are check bits
  – All others are data bits
  – Check bit j is XOR of all bits k such that (j AND k) = j
  – Example: 4 bits of data, 3 check bits
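For the 4-data-bit, 3-check-bit example, the construction rule can be sketched as below. Bit positions run from 1 to 7 and are stored one per int, with index 0 unused; that layout and the function names are assumptions of this sketch.

```c
/* Encode 4 data bits d[0..3] into code[1..7]; code[0] is unused.
 * Positions 1, 2, 4 hold check bits; positions 3, 5, 6, 7 hold data. */
void hamming74_encode(const int d[4], int code[8])
{
    static const int data_pos[4] = {3, 5, 6, 7};
    for (int k = 0; k <= 7; k++)
        code[k] = 0;
    for (int i = 0; i < 4; i++)
        code[data_pos[i]] = d[i];
    for (int j = 1; j <= 4; j <<= 1)          /* j = 1, 2, 4 */
        for (int k = 1; k <= 7; k++)
            if (k != j && (j & k) == j)       /* bits covered by check j */
                code[j] ^= code[k];
}

/* Recompute every check over the received word. The positions of the failing
 * checks add up to the position of a single flipped bit (0 means no error
 * seen). Corrects that bit in place and returns its position. */
int hamming74_correct(int code[8])
{
    int syndrome = 0;
    for (int j = 1; j <= 4; j <<= 1) {
        int parity = 0;
        for (int k = 1; k <= 7; k++)
            if ((j & k) == j)
                parity ^= code[k];
        if (parity)
            syndrome += j;
    }
    if (syndrome)
        code[syndrome] ^= 1;
    return syndrome;
}
```

This corrects any single-bit error; two flipped bits would be miscorrected, consistent with the 2n+1 distance requirement.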
12
Error Bits or Error Burst
• Common model of errors
  – Probability of error per bit
  – Error in each bit independent of others
  – Value of incorrect bit independent of others
• Burst model
  – Probability of back-to-back bit errors
  – Error probability dependent on adjacent bits
  – Value of errors may have structure
• Why assume bursts?
  – Appropriate for some media (e.g., Radio)
  – Faster signaling rate enhances such phenomena
13
Digital Error Detection Techniques
• Two-dimensional parity
  – Detects up to 3-bit errors
  – Good for burst errors
• Internet checksum (used as backup to CRC)
  – Simple addition
  – Simple in software
• Cyclic redundancy check (CRC)
  – Powerful mathematics
  – Tricky in software, simple in hardware
  – Used in network adapter
14
Two-Dimensional Parity
• Adding one extra bit to a 7-bit code to balance 1s
• Extra parity byte for the entire frame
• Catches all 1, 2 and 3 bit errors and most 4-bit errors
• 14 redundant bits for a 42-bit message, in the example
             Data     Parity bits
             1011110  1
             1101001  0
             0101001  1
             1011111  0
             0110100  1
             0001110  1
Parity byte  1111011  0
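The example can be reproduced with a short routine: each 7-bit row gets an even-parity bit, and the parity byte is the column-wise XOR of all extended rows. Storing the parity bit as the low-order bit of an 8-bit value is a layout chosen for this sketch, and the function names are hypothetical.

```c
#include <stddef.h>

/* Even parity over the low `width` bits of x: 1 if the count of 1s is odd. */
static int parity_bit(unsigned x, int width)
{
    int p = 0;
    for (int i = 0; i < width; i++)
        p ^= (x >> i) & 1u;
    return p;
}

/* Extend each 7-bit row with its parity bit (as the low-order bit of an
 * 8-bit value), then append a parity byte that is the column-wise XOR of all
 * extended rows. out must hold n + 1 entries. */
void parity2d_encode(const unsigned char *rows, size_t n, unsigned char *out)
{
    unsigned char column = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = (unsigned char)((rows[i] << 1) | parity_bit(rows[i], 7));
        column ^= out[i];
    }
    out[n] = column;
}
```

For the six data rows of the example this produces the parity byte 1111011 with a final parity bit of 0, i.e. 6 + 8 = 14 redundant bits for 42 data bits.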
15
Two-Dimensional Parity
0 1 0 0 0 1 1 1
0 1 1 0 0 0 1 1
0 1 1 0 1 1 1 1
0 1 1 0 0 0 0 0
1 0 0 1 0 0 1 1
0 1 1 0 1 1 0 0
1 1 0 1 0 1 0 0
0 1
1
0 1
1
16
Internet Checksum Algorithm
• Not used at the link level but provides same sort of functionality as CRC and parity
• Idea:
  – Add up all words (16-bit integers) that are transmitted
  – Transmit the result (checksum) of that sum
  – Receiver performs the same calculation on received data and compares the result with the received checksum
  – If the results do not match, an error is detected
• 16 redundant bits for a message of any length
• Weak protection, accepted as a last line of defense
17
Internet Checksum Algorithm
View message as a sequence of 16-bit integers; sum using 16-bit ones-complement arithmetic; take ones-complement of the result.

#include <sys/types.h>   /* u_short, u_long */

/* buf: message as 16-bit words; count: number of 16-bit words */
u_short cksum(u_short *buf, int count)
{
    register u_long sum = 0;

    while (count--) {
        sum += *buf++;
        if (sum & 0xFFFF0000) {
            /* carry occurred, so wrap around */
            sum &= 0xFFFF;
            sum++;
        }
    }
    return ~(sum & 0xFFFF);
}
18
Cyclic Redundancy Check
Theory• Based on finite-field (binary-valued)
arithmetic
• Bit string represented as polynomial
• Coefficients are binary-valued
• Divide bit string polynomial by generator polynomial to generate CRC
Practice• Bitwise XOR’s
19
Cyclic Redundancy Check
• Add k bits of redundant data to an n-bit message
  – Want k << n
  – e.g., k = 32 and n = 12,000 (1500 bytes)
• Represent n-bit message as n-1 degree polynomial
  – e.g., MSG=10011010 as M(x) = x^7 + x^4 + x^3 + x^1
  – Sender and receiver exchange polynomials
• Let k be the degree of some agreed-upon divisor/generator polynomial
  – e.g., C(x) = x^3 + x^2 + 1
20
Cyclic Redundancy Check
• Transmit polynomial P(x) that is evenly divisible by C(x)
  – Shift left k bits, i.e., M(x) x^k
  – Add remainder of M(x) x^k / C(x) into M(x) x^k
• Receiver receives polynomial P(x) + E(x)
  – E(x) = 0 implies no errors
• Receiver divides (P(x) + E(x)) by C(x); remainder will be zero ONLY if:
  – E(x) was zero (no error), or
  – E(x) is exactly divisible by C(x)
21
CRC Example - Sender
• C(x) = x^3 + x^2 + 1 = 1101 generator
• M(x) = x^8 + x^6 + x^5 + x^4 + 1 = 101110001 message

1101 ) 101110001000
       1101
        1101
        1101
           00001000
               1101
                101   Remainder
22
CRC Example - Receiver
• C(x) = x^3 + x^2 + 1 = 1101 generator
• P(x) = x^11 + x^9 + x^8 + x^7 + x^3 + x^2 + 1 = 101110001101 transmitted message

1101 ) 101110001101
       1101
        1101
        1101
           00001101
               1101
                  0   Correct !
23
CRC Example - Receiver
• C(x) = x^3 + x^2 + 1 = 1101 generator
• P(x) = x^11 + x^9 + x^8 + x^7 + x^3 + x^2 + 1 = 101110001101 transmitted message

1101 ) 101101001101
       1101
        1100
        1101
           1100
           1101
              1110
              1101
                111   Incorrect !
Two bits are flipped
24
CRC Example - Receiver
• C(x) = x^3 + x^2 + 1 = 1101 generator
• M(x) = x^11 + x^9 + x^8 + x^7 + x^3 + x^2 + 1 = 101110001101 message
1101 ) 101101011101
       1101
        1100
        1101
           1101
           1101
               1101
               1101
                  0  Incorrectly correct!
Three bits are flipped
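The long divisions in the sender and receiver examples can be reproduced with a few lines of bitwise XOR. The following is a minimal sketch, not a production CRC routine; the function names crc_remainder and crc_encode are hypothetical, and bit strings are assumed to fit in 32 bits.

```c
#include <stdint.h>

/* Remainder of a modulo-2 (XOR) long division.
 * value: dividend bits, most significant bit first
 * nbits: number of bits in value (at most 32)
 * gen:   generator C(x) as bits, e.g. 0xD for 1101
 * k:     degree of the generator */
uint32_t crc_remainder(uint32_t value, int nbits, uint32_t gen, int k)
{
    for (int i = nbits - 1; i >= k; i--) {
        if (value & (1u << i)) {
            /* leading bit is 1: subtract (XOR) the aligned generator */
            value ^= gen << (i - k);
        }
    }
    return value & ((1u << k) - 1);
}

/* Sender side: shift the message left by k bits and add the remainder. */
uint32_t crc_encode(uint32_t msg, int msg_bits, uint32_t gen, int k)
{
    uint32_t shifted = msg << k;
    return shifted | crc_remainder(shifted, msg_bits + k, gen, k);
}
```

With the values from the slides, the message 101110001 encodes to 101110001101, and the three-bit error pattern goes undetected because it is itself a multiple of 1101.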
25
Selecting C(x) So That E(x) Is Not Divisible by C(x)
Errors detected:
• All single-bit errors, as long as the x^k and x^0 terms have non-zero coefficients
• All double-bit errors, as long as C(x) contains a factor with at least three terms
• Any odd number of errors, as long as C(x) contains the factor (x + 1)
26
Selecting C(x) So That E(x) Is Not Divisible by C(x)
Errors detected:
• Any “burst” error (i.e., sequence of consecutive error bits) for which the length of the burst is less than k bits
• Most burst errors of larger than k bits can also be detected
• See Table 2.5 on page 96 for common C(x)
27
Error Detection or Correction ?
• Detection implies discarding the message and waiting for retransmission
– Uses bandwidth
– Introduces latency
28
Error Detection or Correction ?
• Error correction requires sending more redundant bits all the time: forward error-correcting code (FEC)
• Error correction is useful when:
– Errors are quite probable (wireless links)
– Retransmission cost is too high (latency in satellite link, multicast)
29
Topics Already Covered
• Elements of networks: nodes and links
• Building a packet abstraction on a point-to-point link– Transmission methods and challenges
– Limiting factors on data rates
– Defining units of communication data
– Detecting transmission errors
• Next: simulating an error-free channel
1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 8
3
Reliable Transmission
• Higher level of abstraction (transport layer vs. data link layer)
[Figure: client my-machine and server mail.yahoo.com; message text “my computer’s name is my-machine”]
4
Reliable Transmission
• Higher level of abstraction (transport layer vs. data link layer)
[Figure: client my-machine and server mail.yahoo.com; message texts “my computer’s name is my-machine” and “my-machine is my computer’s name”]
5
Reliable Transmission
• Error-correcting codes are not advanced enough to handle the range of bit and burst errors
– Corrupt frames generally must be discarded
– A reliable link-level protocol must recover from discarded frames
• Goals for reliable transmission
– Make channel appear reliable
– Maintain packet order (usually)
– Impose low overhead / allow full use of link
6
Reliable Transmission
• Reliability accomplished using acknowledgments and timeouts
– ACK is a small control frame confirming reception of an earlier frame
– Having received no ACK, sender retransmits after a timeout
7
Reliable Transmission
• Automatic Repeat reQuest (ARQ) algorithms
– Stop-and-wait
– Concurrent logical channels
– Sliding window
• Go-back-n, or selective repeat
• Alternative: forward error correction (FEC)
8
Automatic Repeat reQuest
• Acknowledgement (ACK)
– Receiver tells sender when frame received
– Cumulative ACK (used by TCP): have received specified frame and all previous
– Selective ACK (SACK): specifies set of frames received
– Negative ACK (NACK or NAK): receiver refuses to accept frame now, e.g., when out of buffer space
9
Automatic Repeat reQuest
• Timeout: sender decides that frame was lost and tries again
• ARQ also called Positive Acknowledgement with Retransmission (PAR)
10
Stop-and-Wait
• Send a single frame
• Wait for ACK or timeout
– If ACK received, continue with next frame
– If timeout occurred, send again (and wait)
• Frame lost in transit; or corrupted and discarded
[Figure: sender/receiver timeline showing Frame 0, ACK 0, Frame 1, ACK 1]
11
Acknowledgments and Timeouts
[Figure: four sender/receiver timelines, panels (a) to (d), each showing frames, ACKs, and the sender's timeout interval]
12
Stop-and-Wait
• If receiver receives a frame correctly, but sender receives the ACK after timeout …
– Sender resends the frame; how does the receiver know whether it is the same frame or the next frame?
13
Stop-and-Wait
• Requires frame identification
– Duplicate frame?
– Duplicate ACK?
– 1 bit is enough (if physical network maintains order)
• sender tracks frame ID to send
• receiver tracks next frame ID expected
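The receiver half of the 1-bit scheme can be sketched as follows. This is an illustrative sketch only; the struct sw_receiver and the function sw_receive are hypothetical names, and the "delivered" counter stands in for passing a frame up to the higher layer.

```c
/* Receiver state for stop-and-wait with a 1-bit frame ID. */
struct sw_receiver {
    int expected;   /* next frame ID expected: 0 or 1 */
    int delivered;  /* count of frames passed up to the higher layer */
};

/* Handle an arriving frame and return the ACK number to send back.
 * A frame carrying the expected ID is new: deliver it and flip the ID.
 * Any other ID is a duplicate (its ACK was lost or late): re-ACK only. */
int sw_receive(struct sw_receiver *r, int frame_id)
{
    if (frame_id == r->expected) {
        r->delivered++;
        r->expected = 1 - r->expected;
    }
    return frame_id;
}
```

A retransmitted frame 0 is acknowledged again but not delivered twice, which is exactly the duplicate-frame case the slide asks about.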
14
Stop-and-Wait State Diagram
[State diagram: four states labeled “send: 0, expect: 0”, “send: 0, expect: 1”, “send: 1, expect: 1”, “send: 1, expect: 0”; transitions labeled receive frame 0, receive ACK 0, receive frame 1, receive ACK 1, and receive frame 0 / receive ACK 1]
15
Stop-and-Wait
• Frames delivered reliably and in order
• Is that enough?
– No, we need performance, too.
• Problem: keeping the pipe full
• Example
– 1.5 Mbps link x 45 ms RTT = 67.5 Kb (~8 KB)
– 1 KB frames imply 182 Kbps (1/8th link utilization)
– Want the sender to transmit 8 frames before waiting for ACK
– Throughput remains 182 Kbps regardless of the link bandwidth!
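The arithmetic in this example can be checked with a small sketch. The function names are hypothetical, and, as in the slide, the frame transmission time is ignored, so throughput is simply one frame per round trip.

```c
/* Stop-and-wait sends one frame per round trip, so throughput is
 * frame size / RTT, independent of the link bandwidth. */
double stop_and_wait_bps(double frame_bits, double rtt_seconds)
{
    return frame_bits / rtt_seconds;
}

/* Delay x bandwidth product: bits that must be in flight to keep the pipe full. */
double pipe_bits(double bandwidth_bps, double rtt_seconds)
{
    return bandwidth_bps * rtt_seconds;
}
```

For a 1 KB (8192-bit) frame and a 45 ms RTT this gives roughly 182 Kbps, while the pipe holds 67,500 bits, about eight such frames.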
16
Concurrent Logical Channels
• Multiplex several logical channels over a single point-to-point physical link (include channel ID in header)
• Use stop-and-wait for each logical channel
• Maintain three bits of state for each logical channel:
– Boolean saying whether channel is currently busy
– Sequence number for frames sent on this channel
– Next sequence number to expect on this channel
• ARPANET IMP-IMP supported 8 logical channels over each ground link (16 over each
17
Concurrent Logical Channels
• Header for each frame includes a 3-bit channel number and a 1-bit sequence number
– Same number of bits (4) as the sliding window requires to support up to 8 outstanding frames on the link
18
Concurrent Logical Channels
• Characteristics
– Separates reliability from flow control and frame order
– Each channel limited by stop-and-wait bandwidth
– Aggregate bandwidth uses full physical link
– Supports multiple communicating processes
– Can use more than one channel per process
• But no frame ordering between channels
19
Approaches for Reliable Transmission …
• Stop-and-wait
– Provides reliable, in-order delivery
– Sacrifices performance
• Multiple logical channels
– Provides reliable delivery at full link bandwidth
– Sacrifices packet ordering
• Sliding window: meets all three goals
20
Sliding Window
• Allow sender to transmit multiple frames before receiving an ACK, thereby keeping the pipe full
• Upper bound on outstanding un-ACKed frames
• Also used at the transport layer (by TCP)
[Figure: sender/receiver timeline with several frames in flight]
21
Sliding Window Concepts
• consider ordered stream of data
– broken into frames
– stop-and-wait
• window of one frame
• slides along stream over time
• sliding window algorithms generalize this notion
– multiple-frame send window
– multiple-frame receive window
[Figure: window sliding along the frame stream over time]
22
Sliding Window Concepts
• send window
– fixed length, containing numbered frames
– starts at earliest unacknowledged frame
– only frames in window sent over network
[Figure: frame stream over time, color-coded]
– Green: sent and acknowledged
– Red: sent (or can be sent) but not acknowledged
– Blue: available, but not within send window
23
Sliding Window Concepts
• receive window
– fixed length (unrelated to send window)
– starts at earliest unreceived frame
– only frames in window are buffered
[Figure: frame stream over time, color-coded]
– Green: received and delivered
– Red: received and buffered
– Blue: received and discarded
24
Sliding Window - Sender
• Assign sequence number to each frame (SeqNum)
• Maintain three state variables:
– send window size (SWS)
– last acknowledgment received (LAR)
– last frame sent (LFS)
• Maintain invariant: LFS - LAR ≤ SWS
• Advance LAR when ACK arrives
• Buffer up to SWS frames and associate timeouts
[Figure: frames 11 to 20 on a time axis, with LAR=13 and LFS=18; LFS - LAR ≤ SWS]
25
Sliding Window - Receiver
• Maintain three state variables
– receive window size (RWS)
– largest frame acceptable (LFA)
– next frame expected (NFE)
• Maintain invariant: LFA - NFE + 1 ≤ RWS
• Frame SeqNum arrives:
– if NFE ≤ SeqNum ≤ LFA, accept
– if SeqNum < NFE or SeqNum > LFA, discard
• Send cumulative ACKs
[Figure: frames 11 to 20 on a time axis, with NFE=13 and LFA=17; LFA - NFE + 1 ≤ RWS]
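The sender and receiver invariants can be written as two small predicates. This is a sketch under simplifying assumptions: sequence numbers are treated as plain integers that do not wrap around, and the function names are hypothetical.

```c
#include <stdbool.h>

/* Sender side: may one more frame be sent without breaking LFS - LAR <= SWS? */
bool sender_can_send(int lar, int lfs, int sws)
{
    return (lfs + 1) - lar <= sws;
}

/* Receiver side: accept only frames inside the window [NFE, LFA],
 * where LFA is the largest value allowed by LFA - NFE + 1 <= RWS. */
bool receiver_accepts(int seq, int nfe, int rws)
{
    int lfa = nfe + rws - 1;
    return seq >= nfe && seq <= lfa;
}
```

With the figure values (LAR=13, LFS=18, SWS=5) the sender's window is full; with NFE=13 and RWS=5 the receiver accepts frames 13 through 17 only.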
26
Sliding Window Issues
• When a timeout occurs, data in transit decreases
– Pipe is no longer full when packet losses occur
– Problem worsens as packet loss detection is delayed
• Early detection of packet losses improves performance:
– Negative Acknowledgements (NACKs)
– Duplicate Acknowledgements
– Selective Acknowledgements (SACKs)
• Adds complexity but helps keep the pipe full
27
Sliding Window Classification
• Stop-and-wait: SWS=1, RWS=1
• Go-back-N: SWS=N, RWS=1
• Selective repeat: SWS=N, RWS=M (usually M = N)
[Figure: timelines for selective repeat, go-back-N, and stop-and-wait]
28
Sliding Window: Go-back-N
• Go-back-N (SWS=N, RWS=1)
• Receiver only buffers 1 frame
• If frame lost, sender may need to resend N frames
– i.e., sender goes back N frames
• Variations– How long is the frame timeout?
– Does receiver send NACK for out-of-sequence frame?
29
Sliding Window: Selective Repeat
• Selective repeat (SWS=N, RWS=M)
• Receiver buffers M frames
• If frame lost, sender must resend only
– frames lost within receive window
– frames past end of receive window
• Variations
– How long is the frame timeout?
– Use cumulative or per-frame ACK?
– Does protocol adapt timeouts?
– Does protocol adapt SWS and/or RWS?
30
Sequence Number Space
• SeqNum field is finite; sequence numbers wrap around
• Sequence number space must be larger than number of outstanding frames (SWS)
• SWS <= MaxSeqNum-1 is not sufficient
– suppose 3-bit SeqNum field (0..7); SWS=RWS=7
– sender transmits frames 0..6, which arrive successfully (receiver window advances)
– ACKs are lost; sender retransmits 0..6
– receiver is expecting 7, 0..5, but receives the second incarnation of 0..5 and assumes they are the 8th to 13th frames
31
Required Sequence Number Space ?
• Assume SWS=RWS (simplest, and typical)
– Sender transmits full SWS
– Two extreme cases at receiver
• None received (waiting for 0…SWS-1)
• All received (waiting for SWS…2*SWS-1)
• All possible packets must have unique SeqNum
• SWS < (MaxSeqNum+1)/2 or SWS+RWS < MaxSeqNum+1 is the correct rule
• Intuitively, SeqNum “slides” between two halves of sequence number space
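The rule can be tested directly against the 3-bit example. The sketch below assumes that MaxSeqNum means the number of distinct sequence numbers (8 for a 3-bit field), which is how the SWS=RWS=7 counterexample reads; the function name seq_space_ok is hypothetical.

```c
#include <stdbool.h>

/* num_seq: count of distinct sequence numbers, e.g. 8 for a 3-bit field.
 * Returns true when SWS + RWS < MaxSeqNum + 1 holds. */
bool seq_space_ok(int sws, int rws, int num_seq)
{
    return sws + rws < num_seq + 1;
}
```

For 8 sequence numbers, SWS=RWS=7 fails the rule, SWS=RWS=4 passes, and go-back-N with RWS=1 may use SWS up to 7.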
32
Sliding Window Assumptions
• Sliding window protocol leads to in-order delivery of all frames, with the following assumptions
– Frames can be delayed an arbitrary but finite amount of time
– Frames can be lost
– Frames can arrive with detectable errors
– Frames cannot arrive with undetectable errors
– Frames arrive in the order sent (possibly with errors)
• Are these assumptions adequate?
33
Sliding Window Correctness
• Need one more assumption:
– Any given frame is received without errors after a finite number of retransmissions
• proof in two steps
– establish correctness assuming infinite sequence number space
– show that finite sequence number space does not affect result as long as it has >= 2 max (SWS, RWS) possible numbers
34
Separation of Concerns
• Carefully distinguish different functions rolled together in one mechanism
– Is each function necessary here?
– Is each function supported in the most effective way?
• Sliding window combines
– Reliable delivery
– Ordered delivery
– Flow control
• Is it the right thing to do at link level ?
1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 9
7
What Next ?
• Arbitrating access to a shared medium
• After that: network adapters and example protocols
8
Shared Access Networks
Outline
• Bus (Ethernet)
• Token ring (FDDI)
• Wireless (802.11)
• Network Adapter
9
Shared Access Media
• Arbitrating access to a shared medium
– Multiple hosts on a single link
10
Multiple Access Media
• Multiple senders on some media
– Buses (Ethernet, including links in switched form)
– Radio, satellite
– Token rings…
11
Multiple Access Media
• Need method to moderate access
– Fair arbitration
– Good performance
12
Shared Media
• Communication needs vary
– Over time
– Between hosts
• Network is not fully utilized
13
Shared Media
• Recall methods for multiplexing
– Frequency-division multiplexing (FDM, separate bands)
– Time-division multiplexing (TDM, synchronous time slots)
– Statistical TDM (STDM, time slots on demand)
• STDM most appropriate with stated assumptions
14
Shared Media: Problems
• Problem: demands can conflict, e.g., two hosts send simultaneously
– STDM does not address this problem, since it relies on centralized control
– Solution is a medium access control (MAC) algorithm
15
Shared Media: Solutions
• Three solutions (out of many)
– Carrier sense multiple access with collision detection (CSMA / CD)
• Send only if medium is idle
• Stop sending immediately if collision detected
– Token ring/FDDI pass a token around a ring; only token holder sends
– Radio / wireless (IEEE 802.11)
16
Ethernet
17
History of Ethernet
• Developed by Xerox PARC in mid-1970s
• Roots in Aloha packet-radio network
• Standardized by Xerox / DEC / Intel in 1978
• Similar to IEEE 802.3 standard
• IEEE 802.3u standard defines Fast Ethernet (100 Mbps)
• New switched Ethernet now popular
18
Ethernet Topologies
• Bus— all nodes connected to a wire
• Star— all nodes connected to a central repeater
• Combinations thereof
. . .
19
Ethernet Adaptor
• Segment of up to 500 m
• Nodes tap into segments
• Taps must be at least 2.5 m apart
• Transceiver performs carrier sensing
• Transceiver transmits and receives signals
• Protocol is implemented in the adaptor
[Figure: host with adaptor, connected through a transceiver to the Ethernet cable]
20
Ethernet – Alternative Technologies
• Can be constructed from a thinner coax cable (10Base2) rather than the thick 50-ohm coax cable (10Base5)
• Newer technology uses 10BaseT (twisted pair)
– Several point-to-point segments coming out of a multiway repeater called “hub”
[Figure: hosts attached to two interconnected hubs]
21
Ethernet Components
10Base5 (ThickNet)
Controller (Ethernet Card)
Vampire Tap
Transceiver
Bus Topology
22
Ethernet Components
10Base2 (ThinNet)
Controller (Ethernet Card)
BNC T-junctionTransceiver
Bus Topology
23
Ethernet Components
10BaseT (Twisted Pair)
Controller (Ethernet Card)
HubStar Topology
24
Ethernet – Multiple Segments
• Repeaters forward the broadcast signal on all outgoing segments (10Base5)
• Maximum of 4 repeaters (2500m), 1024 hosts
Repeater
Host
…
…
…
25
Ethernet Packet Frame
• Preamble allows the receiver to synchronize with signal
• Frame must contain at least 46 bytes of data to detect collision
• 802.3 standard substitutes a length field for the type field
– Type field (demux key) is then the first thing in the data portion
– A device can accept both frame formats: a value > 1500 is a type
Frame fields (sizes in bits):
Preamble (64) | Dest addr (48) | Src addr (48) | Type (16) | Body | CRC (32)
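How an adaptor can tell the two frame formats apart is sketched below. This follows the threshold stated on the slide (a value above 1500 is a type); the function name field_is_type is hypothetical, and the field is assumed to be already converted to host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interpret the 16-bit field that follows the source address.
 * Values above 1500 cannot be a valid body length, so they are
 * treated as a type (demux key); smaller values are a length. */
bool field_is_type(uint16_t field)
{
    return field > 1500;
}
```

For example, the value 0x0800 (2048) is read as a type, while 46 is read as a length.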
26
Ethernet Address
• Addresses
– Unique, 48-bit unicast address assigned to each adapter
– Example: 8:0:2b:e4:b1:2
– Broadcast: all 1s
– Multicast: first bit is 1
– Promiscuous mode
• Problem remains: a distributed algorithm that provides fair access
27
Ethernet MAC – CSMA/CD
• Multiple access
– Nodes send and receive frames over a shared link
• Carrier sense
– Nodes can distinguish between an idle and busy link
• Collision detection
– A node listens as it transmits to detect collision
28
CSMA/CD MAC Algorithm
• If line is idle (no carrier sensed)
– Send immediately
– Upper bound message size of ~1500 bytes
– Must wait 9.6µs between back-to-back frames
29
CSMA/CD MAC Algorithm
• If line is busy (carrier sensed) …
– Wait until the line becomes idle and then transmit immediately
– Called 1-persistent (special case of p-persistent)
• If collision detected
– Stop sending data and send a jam signal
– Try again later
30
Collision Detection
How to ensure that my-machine knows about the collision?
[Figure: my-machine starts transmission at time 0; the signal is almost at your-machine at time T, when your-machine starts transmission; collision]
31
Constraints on Collision Detection
• In our example, consider
– my-machine’s message reaches your-machine at T
– your-machine’s message reaches my-machine at 2T
• Thus, my-machine must still be transmitting at 2T
32
Constraints on Collision Detection
• Specifics of IEEE 802.3
– Bounds 2T to 51.2 microseconds
– Packet must be at least 64B long
• Jam after the collision, for 32 bits, then stop transmitting frame (runt frame of 96 bits)
– Ensures that all hosts notice collision
33
Review Lecture 9
• Shared access networks
• Shared media: issues
• Ethernet Topologies, technologies
• Segments
• Frame format, Addresses
• MAC protocol: CSMA/CD
1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 10
9
Ethernet Min. Frame Size
• RTT on a maximally configured Ethernet of 2500 m, with 4 repeaters, is about 51.2 μs
– 2500 m / (2 x 10^8 m/s) = 12.5 μs
– 2 x 12.5 = 25 μs + repeater delays
• 51.2 μs on 10 Mbps corresponds to 512 bits (64 bytes)
• Therefore, the minimum frame length for Ethernet is 64 bytes (header + 46 bytes data)
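The 512-bit figure is just the number of bits a sender emits during one worst-case round trip. A minimal sketch of that calculation follows; the function name bits_in_rtt is hypothetical, and integer nanoseconds are used to avoid floating-point rounding.

```c
#include <stdint.h>

/* Bits a sender puts on the wire during one round-trip time.
 * rate_bps: link rate in bits per second
 * rtt_ns:   round-trip time in nanoseconds */
uint64_t bits_in_rtt(uint64_t rate_bps, uint64_t rtt_ns)
{
    return rate_bps * rtt_ns / 1000000000ULL;
}
```

At 10 Mbps and 51.2 μs (51,200 ns) this yields 512 bits, i.e. 64 bytes, so a sender is still transmitting when news of a collision arrives.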
10
Collision Detection
my-machine notices collision before transmission ends at 2T
[Figure: my-machine starts transmission at time 0; the signal is almost at your-machine at time T, when your-machine starts transmission; collision]
11
Retry After the Collision
• How long should a host wait to retry after a collision?
– Binary exponential backoff
• maximum backoff doubles with each failure (exponential)
• after N failures, pick an N-bit number
• 2^N discrete possibilities from 0 to maximum
12
Retry After the Collision
• Delay and try again algorithm
– 1st time: 0 or 51.2 μs
– 2nd time: 0, 51.2, 102.4, or 153.6 μs
– nth time: k x 51.2 μs, for randomly selected k = 0..2^n - 1
– Give up after several tries (usually 16)
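The delay rule can be sketched as below. This is an illustration under stated assumptions: the random value is supplied by the caller so the function stays deterministic, time is kept in tenths of a microsecond to avoid fractions, and the names are hypothetical. Real adapters also cap the exponent, which this sketch does not model.

```c
#include <stdint.h>

#define SLOT_TENTHS_US 512   /* one slot = 51.2 us, in tenths of a microsecond */
#define MAX_ATTEMPTS   16    /* give up after this many tries */

/* Largest slot count k that may be chosen after the n-th collision: 2^n - 1. */
uint32_t max_backoff_slots(int n)
{
    return (1u << n) - 1;
}

/* Delay before retry number n, in tenths of a microsecond.
 * r is a caller-supplied random value; returns -1 when the sender gives up. */
long backoff_delay(int n, uint32_t r)
{
    if (n >= MAX_ATTEMPTS)
        return -1;
    uint32_t k = r % (max_backoff_slots(n) + 1);
    return (long)k * SLOT_TENTHS_US;
}
```

After the second collision, for instance, k ranges over 0..3, so the longest possible wait is 3 x 51.2 = 153.6 μs.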
13
Binary Exponential Backoff
• Why use fixed time slots?
• How long should the slots be?
[Figure: time axis starting at the time of the last collision, marked 0, Ts, 2Ts, 3Ts; choices after 1 collision and choices after 2 collisions]
14
Binary Exponential Backoff
• Ts is 51.2 microseconds for IEEE 802.3
• Consider that k hosts collide
– each picks a random number from 0 to 2^N - 1
– if minimum value is unique
• all other hosts see busy line
• remember that Ethernet RTT < 51.2 microseconds
15
Binary Exponential Backoff
• Consider that k hosts collide– if minimum value is not unique
• hosts in minimum value slot collide again
• following slot idle
• consider next smallest backoff value
16
Ethernet Frame Reception
• Sender handles all access control
• Receiver simply pulls frames from network
• Ethernet controller/card– sees all frames
– selectively passes frames to host processor
17
Ethernet Frame Reception
• Acceptable frames
– addressed to host, or
– to multicast address to which host belongs, or
– to a broadcast address
– anything if in promiscuous mode (packet sniffing and tcpdump requirement)
18
Ethernet Collision Detection
• Bus topology Ethernets
– Transceiver handles
• Carrier detection
• Collision detection
• Jamming after a collision
– Transceiver sees voltage sum
• Outgoing signal + Incoming signal
– Looks for voltages impossible for local alone
• Attenuation can prevent detection
• Limits segment length
19
Ethernet Collision Detection
• Hub topology Ethernets
– Controller/card handles carrier detection
– Hub handles
• Collision detection
• Jamming after a collision
– Detect transmission activity on each line
– If more than 1 line (host) active
• Assert collision to all lines
• Continue until no lines active
– Attenuation is less critical
20
Experience With Ethernet
• Number of hosts limited to 200 in practice, standard allows 1024
• Range much shorter than 2.5 km limit in standard
• Round-trip time is typically 5 or 10 μs, not 50 μs
21
Experience With Ethernet
• Higher-level flow control (i.e., TCP) limits load– 30% maximum load recommended
– Large packets recommended
• Star topologies easier to administer than Bus– Failure isolation, node addition w/o complex
topology
• (Backwards) Compatibility more important than elegance or raw performance (1 > 0)
22
Token Ring
23
Token Ring Overview
• Token Ring network “was” a candidate to replace Ethernet; used in some MAN backbones
– 16 Mbps IEEE 802.5 (based on earlier 4 Mbps IBM ring)
– 100 Mbps Fiber Distributed Data Interface (FDDI)
24
Token Ring Rationale
• Why emulate a shared medium with point-to-point links?
• Why a shared medium?– convenient broadcast capabilities
– switches are costly
• Why emulation?– simpler MAC algorithm: only have 2 wires
– fairer access arbitration
– fully digital (802.3 collision detection requires analog)
25
IBM Token Ring – IEEE 802.5
• Data flows in a particular direction– a node receives frames from its
upstream neighbor
– a node forwards frames to its downstream neighbor
26
IBM Token Ring – IEEE 802.5
• Ring is viewed as a single shared medium– each node is allowed to transmit according
to some distributed algorithm for medium access
– all nodes see all frames; destination saves a copy of frame as it flows past
• The term “token” indicates the way the access to shared channel is managed
27
Token Ring State Diagram
No token(pass frames along)
Token(insert new frames)
get token release token
28
Token in a Token Ring
• Token is a special bit pattern that rotates around the ring– A node must capture token before
transmitting
– A node releases token after done transmitting
• Immediate release- token follows last frame (FDDI)
• Delayed release – after last frame returns to sender
29
Token Release
(a) immediate release (b) delayed release
30
Token in a Token Ring
• Remove your frame when it comes back around– Transmit another frame or re-insert
the token
• Stations get round-robin service as the token circulates around the ring
31
Review Lecture 10
• CSMA/CD MAC algorithm
• Constraints on collision detection
• Min frame size – RTT 51.2us
• Retries after collision
• Binary exp backoff: Min value unique
• Frame reception: acceptable frames
• Bus and Hub topology: CD handle
• Exp: nodes, dist, RTT, 30% load, large frame
32
Review Lecture 10
• Token Ring
• MAN backbone
• IBM, 802.5 token ring, FDDI
• Rationale: emulate shared med p2p links
• Directional data flow
• Shared med: see all frames, distr MAC algo
• Token – special bit pattern
• Capture and release – round robin service
11
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
2
Lecture No. 11
5
Physical Properties
• A node is connected into the ring using a relay
– Relay is open as long as the station is alive (a)
– Relay closes and bypasses the station if it is dead (b)
[Figure: host attached through a relay between the previous host and the next host; (a) relay open, host in the ring; (b) relay closed, host bypassed]
6
Physical Properties
• Several relays are packed into a single box: Multistation Access Unit (MSAU)
– Stations can be added/removed by plugging/unplugging
– IBM token ring requires MSAU but 802.5 does not
7
MSAU
[Figure: MSAU connecting four hosts, with links from the previous MSAU and to the next MSAU]
8
Physical Properties
• Data rate can be 4 Mbps or 16 Mbps
• Encoding of bits uses differential Manchester
• Ring may have up to 250 (802.5) or 260 (IBM) nodes
• Physical medium is twisted pair (IBM Token Ring)
9
Token Ring MAC
• Network adaptor contains receiver, transmitter and some storage of bits between them
• Token circulates if no station has anything to send– Ring must have enough capacity to store entire
token
– At least 24 stations with 1-bit storage for 24-bit long token (if propagation delay is negligible)
– This situation is avoided by designating a monitor
10
Token Ring MAC
• Any station that has data to send can seize the token
• In 802.5, simply 1 bit in the second byte of the token is modified
• First two bytes of modified token become preamble for the next frame
11
Token Ring MAC
• The station is then allowed to send one or more frames
• Destination address may be a multicast or a broadcast address
• Intended recipient copies the frame into a buffer
12
Token Ring MAC
• Sender is responsible for removing these frames
• Frame longer than the capacity of ring is drained by the sender while still transmitting its later part
13
Frame Format
• “Illegal” Manchester codes in the start and end delimiters
• Frame priority and reservation bits in access control byte
• Demux key in frame control byte
• A and C bits for reliable delivery, in status byte
Frame fields (sizes in bits):
Start delimiter (8) | Access control (8) | Frame control (8) | Dest addr (48) | Src addr (48) | Body (variable) | CRC (32) | End delimiter (8) | Frame status (8)
14
Token Ring MAC Issue
• A node captures the token and sends as much data as it has …
– A node could keep the token for arbitrarily long time, and other nodes are not given the turn
– Favors nodes with large data over nodes having a small message to send
• How much data is a node allowed to transmit each time it captures the token?
15
Timed Token Algorithm
• Token Holding Time (THT)
– Upper limit on how long a station can hold the token
– Before putting each frame on the ring, a node checks that transmitting it would not exceed the THT
– Long THT achieves better utilization with few senders
– Short THT helps when multiple nodes have data to send
16
Timed Token Algorithm
• Token Rotation Time (TRT)
– How long it takes the token to traverse the ring
– TRT <= ActiveNodes x THT + RingLatency
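The bound on TRT is a one-line computation. The sketch below uses a hypothetical function name and keeps all times in microseconds; the numbers in the usage note are illustrative, not taken from the slides.

```c
/* Upper bound on token rotation time: ActiveNodes x THT + RingLatency.
 * All times are in microseconds. */
long trt_bound_us(int active_nodes, long tht_us, long ring_latency_us)
{
    return (long)active_nodes * tht_us + ring_latency_us;
}
```

For example, 10 active nodes with a 10,000 μs THT on a ring with 500 μs latency give a bound of 100,500 μs; with no active senders the bound is just the ring latency.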
17
Reliable Delivery
• The A and C bits in the packet trailer provide reliability
• Both bits are initially set to 0
• Destination sets A bit if it sees the frame and sets C bit if it copies the frame into its adaptor
18
Reliable Delivery
• If sender receives bit A=0, it assumes destination is absent (not functioning)
• If A bit is set but not bit C, it means destination is unable to accept the frame– Sender retransmits this frame later
19
Token Ring Packet Priorities
• Token contains 3-bit priority field
• Token priority changes over time by using reservation bits in the frame header
20
Token Ring Packet Priorities
• A station wanting to send a priority n packet can set the reservation bits to n, if they are not already set to a higher value
– it captures the token when the current sender releases it with priority set to n
• Strict priority scheme: no lower-priority packets get sent when higher priority packets are waiting
21
Token Maintenance
• Token rings have a designated monitor node
• Any station can become the monitor according to a well defined procedure
• Monitor is elected when the ring is first connected, or when the current monitor fails
22
Token Maintenance
• Monitor periodically announces its presence
• Claim token sent by a station seeing no monitor– if the sender receives back the claim
token, it becomes monitor
– if another station is also contending for monitor, some rule defines the monitor
23
Monitor’s Role
• May add additional delay into the ring
• Assures that a valid token is present in the ring because token may be lost:
– No token when initializing the ring
– Bit error corrupts token pattern
– Node holding the token crashes
24
Monitor’s Role
• Maintains a timer to detect the missing token
– Timer = NumStations x THT + RingLatency
– When the timer expires, it creates a new token
• Checks for corrupted frames
– checksum error or invalid format; such frames may circulate forever
– Monitor drains them off the ring
25
Monitor’s Role
• Checks for orphaned frames– sending station died before removing the
frame
– “Monitor” bit is used to detect orphaned frames
– Initially 0, and set to 1 when a frame passes the monitor
– If the monitor sees a frame with this bit set, it drains it off
26
Review Lecture 11
• Token Ring phy properties: MSAU, speed,#
• MAC, storage, circulation, seize+ modify
• Send 1+ frame, recipient cpy, remove frame
• Frame format, resv, priority, A, C bits
• MAC issues, THT, TRT
• Reliable delivery, A, C
• Frame priorities, resv
• Maintenance, monitor, claim, role
1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 12
3
Fiber Distributed Data Interface
• Similar to 802.5/IBM token rings but runs on fiber
• Consists of a dual ring: two independent rings that transmit data in opposite directions at 100 Mbps
• Tolerates a single link break or node failure (self-healing ring)
[Figure: dual ring, panels (a) and (b)]
4
FDDI - Concentrator
• Allows nodes to attach using a single cable: single attachment station (SAS)
• Dual connected nodes still exist: dual attachment station (DAS)
• Concentrator attaches several SASs to dual ring
– Uses optical bypass to isolate failed SAS
[Figure: concentrator (DAS) between an upstream neighbor (DAS) and a downstream neighbor (DAS), with four SASs attached]
5
FDDI – Physical Properties
• Variable size buffer (9 - 80 bits) between input and output interfaces (10 ns bit time)– Not required to fill buffer before
starting transmission
• Maximum 500 stations, maximum 2 km distance between any pair of stations
6
FDDI – Physical Properties
• Total 200 km fiber: dual nature implies 100 km cable connecting all stations
• Physical media can be coax or twisted pair cable
• Uses 4B/5B encoding
7
Timed Token Algorithm
• Token Holding Time (THT)– Upper limit on how long a station can
hold the token
– configured to some suitable value
• Token Rotation Time (TRT)– How long it takes the token to traverse
the ring (time since a host released the token)
–TRT <= ActiveNodes x THT + RingLatency
8
Timed Token Algorithm
• Target Token Rotation Time (TTRT)
– “agreed-upon” or negotiated upper bound on TRT
9
MAC Algorithm
• Each node measures TRT between successive token arrivals
• If measured-TRT > TTRT
– Token is late
– Cannot send data
10
MAC Algorithm
• If measured-TRT < TTRT
– Token is early so OK to send
– Send data for remaining time until either
• No more data to send
• THT >= (TTRT – measured TRT)
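The early/late decision amounts to computing how much sending time remains on this token visit. A minimal sketch follows; the function name async_send_budget is hypothetical, and the case measured TRT = TTRT, which the slides do not spell out, is treated here as leaving no budget.

```c
/* Time (same unit as the inputs) a node may spend sending data on this
 * token visit: TTRT - measured TRT if the token is early, otherwise 0. */
long async_send_budget(long ttrt, long measured_trt)
{
    if (measured_trt >= ttrt)
        return 0;                 /* token is late: cannot send */
    return ttrt - measured_trt;   /* token is early: send until THT reaches this */
}
```

With TTRT = 10 and a measured TRT of 4, the node may hold the token for up to 6 time units; with a measured TRT of 12 it must pass the token on.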
11
FDDI MAC Issue
• If a node has lots of data, it holds the token for the maximum allowed time
• When a downstream neighbor gets the token, its measured TRT >= TTRT
– It cannot transmit its frame!
12
FDDI MAC Issue
• What if the downstream neighbor has some urgent data to send ?
13
FDDI Traffic Classes
• Synchronous traffic
– Latency sensitive
– Gets higher priority
– Can always send data
14
FDDI Traffic Classes
• Asynchronous traffic
– Sensitive to throughput rather than delay
– Lower priority
– Can send only if token is early
• May cause the time to be exceeded by one FDDI frame
15
Bounded Priority Traffic
• If a node has a large amount of synchronous data
– It will send regardless of measured TRT
– TTRT will become meaningless!
• Therefore, total synchronous data during one token rotation is bounded by TTRT
16
Bounded Priority Traffic
• Worst case: 2xTTRT between seeing token
– One TTRT is consumed first by asynchronous data
– Another TTRT is then consumed by synchronous data
• Back-to-back 2xTTRT rotations not possible
17
Token Maintenance
• Monitoring for the lost token
– No token when initializing ring
– Bit errors corrupt token pattern
–Node holding the token crashes
18
Token Maintenance
• Monitoring for a valid token
– Should periodically see valid transmission (frame or token)
– Max. gap = ring latency + max frame <= 2.5 ms
• Set 2.5ms timer; start negotiations if it fires
19
Token Maintenance
• The procedure runs when a node
– Joins the ring (startup)
– Suspects a failure
• Claim frame is used in order to
– Generate a new token
– Agree on TTRT (so that an application can meet its timing constraints)
• A node can send a claim frame without holding the token
20
Token Maintenance Procedure
• A node sends a claim frame including its TTRT bid
• When a node receives a claim frame, it compares the bid in the frame with its own bid
– If its own bid is higher, it updates TTRT to the received bid and forwards the frame
– If its own bid is lower, it replaces the frame with its own claim frame
– If bids are equal, the node with the higher address wins
21
Token Maintenance Procedure
• If a node’s claim frame returns back to it, it knows:
– Its bid was the lowest
– Everyone knows TTRT
– It can now insert new token
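The outcome of the claim process can be sketched without simulating the ring. The function below is a hypothetical helper: it assumes integer node addresses and simply applies the two rules from the slides (lowest bid wins, higher address breaks ties).

```python
def elect_ttrt(bids):
    """bids maps node address -> TTRT bid; returns (winning address, agreed TTRT)."""
    # Lowest bid wins; among equal bids, the higher address wins,
    # which is why the address is negated in the sort key.
    winner = min(bids, key=lambda addr: (bids[addr], -addr))
    return winner, bids[winner]
```

With bids `{1: 8, 2: 4, 3: 4}`, nodes 2 and 3 tie on the lowest bid, so node 3 would insert the new token and the ring would run with a TTRT of 4.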
22
Frame Format
• 4B/5B control symbols for start and end of frame
• Control Field
– 1st bit: asynchronous (0) versus synchronous (1) data
– 2nd bit: 16-bit (0) versus 48-bit (1) addresses
– Last 6 bits: demux key (includes reserved patterns for token and claim frame)
• Status Field
– From receiver back to sender; error in frame
– Recognized address; accepted frame (flow control)
Fields in order (width in bits): Start of frame (8) | Control (8) | Dest addr (48) | Src addr (48) | Body (variable) | CRC (32) | End of frame (8) | Status (24)
23
Feedback
• Error detection
– Host attaches “error” marker to frame
– Sender detects error marker, resends later
• Flow control
– Host attaches “my address but did not copy”
– Sender detects problem, resends later (backs off)
24
Wireless LANs
25
Wireless LANs
• IEEE 802.11 standard
– Designed for use in a small area (offices, campuses)
• Bandwidth: 1, 2 or 11 Mbps
– Up to 54 Mbps in newer 802.11a standard
• Targets three physical media
– Two spread spectrum radio (2.4 GHz freq)
– One diffused infrared (10m range, 850 nm band)
26
Spread Spectrum
• Spread signal over wider freq band
– Uses more frequency spectrum than strictly necessary
• Originally designed to thwart jamming/interference
– Pseudo-random sequence, signal looks like noise
• Introduce pseudo-random component into signal
27
Spread Spectrum
• Sender and receiver share
– Pseudorandom number generator and the seed
• Frequency Hopping
– Transmit over pseudo-random sequence of frequencies
– 802.11 uses 79 x 1MHz-wide frequency bands
28
Direct Sequence Spread Spectrum
• For each bit, send XOR of the bit and n random bits
• Random sequence is known to sender and receiver
• n random bits are called n-bit chipping code
29
Direct Sequence Spread Spectrum
• 802.11 defines an 11-bit chipping code (83 MHz band)
(Figure: example with a 4-bit chipping code, showing the waveforms of the three bit streams below.)
Random sequence: 0100101101011001
Data stream: 1010
XOR of the two: 1011101110101001
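The 4-bit chipping example from the slide can be reproduced with a short sketch. The function names are hypothetical, and bit strings are used only for readability; a real radio works on signals, not text.

```python
def dsss_spread(data_bits: str, chip_sequence: str, n: int) -> str:
    """XOR each data bit with its own group of n chips."""
    if len(chip_sequence) != n * len(data_bits):
        raise ValueError("need exactly n chips per data bit")
    out = []
    for i, bit in enumerate(data_bits):
        chips = chip_sequence[i * n:(i + 1) * n]
        out.append("".join(str(int(bit) ^ int(c)) for c in chips))
    return "".join(out)


def dsss_despread(received: str, chip_sequence: str, n: int) -> str:
    """Recover the data bits by XORing with the same chips (error-free channel assumed)."""
    bits = []
    for i in range(0, len(received), n):
        # With no errors, every chip in the group decodes to the same bit,
        # so looking at the first chip is enough for this sketch.
        bits.append(str(int(received[i]) ^ int(chip_sequence[i])))
    return "".join(bits)
```

Calling `dsss_spread("1010", "0100101101011001", 4)` gives `1011101110101001`, the "XOR of the two" line on the slide, and `dsss_despread` with the same sequence returns `1010`.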
2
Lecture No. 13
3
802.11 MAC: CSMA/CA
• Similar to Ethernet …
– Defer the transmission until the link becomes idle
– Back off if a collision occurs
• Is it sufficient ?
• All nodes are not always within reach of (to hear) each other
4
Hidden and Exposed Nodes
• Hidden nodes
– Sender thinks it is OK to send when it is not (false +ve)
– A-C and B-D are hidden nodes in the figure below
• Exposed nodes
– Sender does not send when it is OK to send (false –ve)
– B and C are exposed nodes in the figure below
A B C D
5
Multiple Access with Collision Avoidance (MACA)
• Sender transmits RequestToSend (RTS) frame
– Contains intended time to hold the medium
• Receiver replies with ClearToSend (CTS) frame
6
Multiple Access with Collision Avoidance (MACA)
• Neighbors of the receiver …
– Hear CTS: keep quiet for the intended duration (or till the ACK is heard)
• Neighbors of only the sender
– Hear RTS but not CTS: OK to transmit
7
MACA for Wireless (MACAW)
• Receiver sends an ACK when it receives a frame
– Signal for neighbors to start their transmission
– Its neighbors are silent until they hear ACK
8
MACA for Wireless (MACAW)
• Collision detection
– No active collision detection
–Known only if CTS or ACK is not received
–Binary exponential back off (BEB) is used in case of collision, like in Ethernet
9
802.11 - Supporting Mobility
• Suitable for an ad hoc configuration of nodes
– May or may not be able to communicate with all other nodes
– Depends upon how far apart the nodes are
10
802.11 - Supporting Mobility
• Nodes are free to move around
– Set of directly reachable nodes may change over time (partial connectivity)
–Topology changes over time
11
802.11 - Distribution System
• Nodes roam freely but operate within a structure
– Tethered by wired network infrastructure (Ethernet ?)
– Each Access Point (AP) services nodes in some region
– Each mobile node associates itself with an AP
12
802.11 - Distribution System
(Figure: access points AP-1, AP-2 and AP-3 connected by the distribution system; mobile nodes A through H are spread among the access points.)
13
Managing Connectivity/Roaming
• How do wireless nodes select an Access Point?
• Scanning (active search for an AP)
– node sends Probe frame
– all AP’s within reach reply with Probe Response frame
– node selects one AP; sends it Associate Request frame
– AP replies with Association Response
– new AP informs old AP via wired backbone
14
Managing Mobility
• Scanning is used only when node has no AP
• Consider
– Node moves out of range of its AP and initiates scanning for new AP
– Scanning takes 2+ round-trip times
– No service until new AP found
15
Managing Mobility
• However
– Node was probably in range of new AP before leaving range of old AP
– Scanning too expensive to use continuously
• How is handoff between APs managed as users move?
16
Managing Connectivity
• Active scanning: when a node joins or moves
• Passive scanning: AP periodically sends Beacon frame, advertising its capabilities
(Figure: the same distribution system with AP-1, AP-2, AP-3 and nodes A through H; node C is shown in two positions.)
17
Managing Mobility
• Solution to handoff: APs send periodic beacon frames
• Node may switch AP in response to beacon
– if signal strength better from new AP
– using geographical information
– any other reason …
18
Managing Mobility
• 802.11 frames contain four addresses
– Actual sender and receiver (source and dest.)
– AP for sender and receiver (routing in backbone)
19
Frame Format
• Control field contains three subfields:– 6-bit Type field (data, RTS, CTS, scanning);
– 1-bit ToDS; and
– 1-bit FromDS
• A single frame contains up to 2312 bytes of data
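Fields in order (width in bits): Control (16) | Duration (16) | Addr1 (48) | Addr2 (48) | Addr3 (48) | SeqCtrl (16) | Addr4 (48) | Payload (0–18,496) | CRC (32)
Address field examples (Addr1, Addr2, Addr3, Addr4):
ToDS=0, FromDS=0: C, A
ToDS=1, FromDS=1: E, AP-3, AP-1, A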
20
Network Adaptors
21
Now you know …
• How direct link networks function
• You know what the design issues are
• Most of the issues of direct link networks are handled by the network adaptors
• Let's take a quick look at the design issues of these pieces of hardware
22
Overview
• Also called network interface card (NIC)
• Components (high-level overview)
• Options for use
– Data motion
– Event notification
• Potential performance bottlenecks
• Programming device drivers
23
Typical Workstation Architecture
(Figure: CPU with cache on the memory bus to memory; the network adaptor sits on the I/O bus and connects to the network. The adaptor is typically where data link functionality is implemented.)
24
Components of a Network Adaptor
• Bus interface communicates with a specific host
– Bus defines protocol for CPU-adaptor communication
• Link interface speaks correct protocol on network
– Implemented by a chip set, in software or on FPGA
• Buffering between different speed bus and link
(Figure: network adaptor with a bus interface facing the host I/O bus and a link interface facing the network.)
25
Host Perspective
• Adaptor is ultimately programmed by CPU
• Adaptor exports a Control Status Register (CSR)
• CSR is readable and writable from CPU at some memory address
26
Host Perspective
• CSR write: CPU instructs Adaptor (e.g., transmit)
• CSR read: Adaptor informs CPU (e.g., receive error)
• Host CPU either polls the CSR or is interrupted by the adaptor
– Interrupt handler routine disables interrupts and is short
27
Data Motion Options for Network Adaptor Use
• Transfer frames between adaptor and host memory
• Programmed input/output (PIO)
– Processor itself manages each access (loads/stores)
– Faster than DMA for small amounts of data
28
Data Motion Options for Network Adaptor Use
• Direct memory access (DMA)
– Adaptor is given buffer descriptor lists by host for read/write
– Processor is not involved: free to do other things
– Can be faster than memory copy through CPU
– Start-up cost
29
Data Motion - DMA
(Figure: buffer descriptor list with entries of 100, 1400, 1500, 1500 and 1500 bytes, each pointing to a memory buffer.)
30
DMA: Scatter Read, Gather Write
(Figure: scatter read and gather write, each shown as a transfer between adaptor memory and host memory.)
31
Data Motion - PIO
• Frames are buffered at adaptor until CPU transfers them
• Adaptors have 128-512 KB of expensive dual-port memory
(Figure: host with CPU and memory; adaptor with its own memory.)
32
Data Motion
(Figure: workstation architecture showing the data movement path using PIO and the data movement path using DMA.)
33
Network Adaptor: Event Notification
• Hardware interrupts
– Processor free to do other things
–Events delivered “immediately”
–State (register) save/restore expensive
–Context switches more expensive
34
Network Adaptor: Event Notification
• Event polling
– Processor must periodically check
–Events wait until next check
–No extra state changes
35
Device Drivers
• Operating system routines anchoring protocol stack to network hardware
• Initialize device, transmit frames, field interrupts
• Code contains device-specific details
– Difficult to read but simple in logic
36
Device Drivers
• Translates host’s message
into format expected by the device
• Sets CSR (e.g., to instruct device to transmit)
37
Programming Device Drivers
• Sample device driver in P&D sec 2.9.3
• Better device driver(s) in Linux source
• Key features
– Memory-mapped control registers
– Interrupt-driven
– Handler code must execute quickly
– Logically concurrent with other processes
38
Performance Bottlenecks
• Link capacity
• Processor computing power
• I/O bus bandwidth
– Overhead involved in each bus transfer
39
Performance Bottlenecks
• Memory bus bandwidth
– Memory hierarchy with cache levels
– Memory accesses result in multiple memory copies in different buffers
40
Memory Bandwidth on an Alpha
(Figure: CPU, L1 cache, L2 cache, crossbar, main memory and I/O bus, with path bandwidths labeled 16 Gbps, 4.48 Gbps, 956 Mbps and 800 Mbps.)
41
Review Lecture 13
• 802.11 MAC CSMA/CA
• Hidden and exposed nodes
• MACA, RTS/CTS
• MACAW, ACK, BEB
• Mobility support, partial connectivity
• DS, AP, scanning, active passive
• Frame format
42
Review Lecture 13
• Network adaptors, workstation architecture
• Components of NIC: bus + link interface, buffering, CSR
• Data movement: PIO, DMA, scatter read, gather write
• Event notification: interrupt, polling
• Device drivers
• Performance bottlenecks
11
2
Lecture No. 14
3
What we know …
• Elements of networks: nodes and links
• Building a packet abstraction on a link
• Transmission, and units of communication data
• Detecting transmission errors
• Simulating an error-free, reliable channel
– Sliding window mechanism
• Arbitrating access to a shared medium• Design issues of direct link networks
– Functionality of network adaptors
4
What Next …?
• Moving on from direct to indirect networks
• Introducing switches which provide indirect connectivity
5
Switching and Forwarding
Outline
Store-and-Forward Switches
Bridges and Extended LANs
Cell Switching
Segmentation and Reassembly
6
Why Switching ?
• Motivation:
– Why not just one direct link network ?
• Basic approach:
– How can we extend the direct link abstraction (provide illusion of one physical network) ?
• Challenges:
– What problems must we address ?
7
Why Switching ?
• Examples:
– Where are these issues addressed in real networks ?
• Details of the switch:
– What are the goals in design / how are they addressed ?
• Heterogeneity
– Switching allows multiple physical networks
– But assume one switching strategy
8
After Switching – Are We Done ?
• Scale– direct link networks: O(100) hosts
– packet-switched networks: O(100,000) hosts
– Internet: O(2^(year-1974)) hosts
• Beyond the basics– quality of service
– congestion and performance analysis
– network trends and their importance
9
Connecting Large Networks
• Assert: want to use one direct link network
• Limitations of directly connected networks:
– Limited scale - number of hosts that can be attached
• 1024 in Ethernet; only 2 in point-to-point link
– Limited geographical area that can be covered
• 2500 m in Ethernet; Point-to-point links also limited
10
Connecting Large Networks
• Alternative: provide illusion of one physical network
• Solution: indirect connectivity by using switches
– Packet switches in computer networks control frame flow
– Multiple direct link networks, transparent to application
11
Packet Switches
• A multi-input multi-output device
• Local star topology
• Performance independent of connectivity
– (e.g. adding new host) if switch is designed with enough aggregate capacity
• Maximum degree < physical network limit
12
Build Network from Stars
• Switches (or stars) are interconnected to build networks that need not behave like a star topology
13
Forwarding
• Packets arrive at one of the several inputs and have to be forwarded / switched to one of the available outputs
– Connectionless and connection-oriented approach to determine the correct output
Which way should it go ?
First challenge: forwarding
14
Routing
• Forwarding requires information
Second challenge:
routing
How to maintain forwarding information ?
15
Contention and Congestion
• If arrival rate for a certain output is greater than the output capacity, then contention occurs
• If arrival rate of packets is so high that buffers overflow, then congestion occurs
Who goes first ?
Any one is dropped ?
16
Challenges for Packet Switching
• Efficient forwarding
– Switch with several output ports
– Decide which output port to use
• Routing in dynamic network
– Need information for forwarding
– Construct and maintain the information
17
Challenges for Packet Switching
• Handling contention
– Multiple packets destined for one output port
– Decide which packet goes first
– Decide what to do with others
18
Outline
• Switches and layered perspective
• Efficient forwarding
• Asynchronous transfer mode (ATM) example
• Switch fabrics and contention
19
Network Layers and Switches
(Figure: host protocol stack with application, presentation, session, transport, network, data link and physical layers, split between user level and OS kernel. One or more switches within the network implement the network, data link and physical layers and may sit between different physical layers.)
20
Scalable Networks
• Switch
– forwards packets from input port to output port
– port selected based on address in packet header
• Advantages
– cover large geographic area (tolerate latency)
– support large numbers of hosts (scalable bandwidth)
(Figure: switch with input ports and output ports, each side attached to T3 and STS-1 links.)
21
Packet Forwarding Analogy
• Process of going from one place to another
• Focus on decision process at intersections
• Path splits, how do you decide which way to go?
• How do you navigate at intersections? 3 scenarios:– from your office to home
– from home to a friend’s house (with directions)
– from Airport to the Hotel (without directions)
22
Packet Switching / Forwarding
• Forwarding: the task of selecting an appropriate output port for a packet
• Goals– Require limited information (both
packet and switch)
– Admit efficient implementation
23
Packet Switching / Forwarding
• Three approaches– Datagram or connectionless
approach
– Virtual circuit or connection-oriented approach
– Source routing
• Important notion: unique global address per host
24
Datagram Switching / Forwarding
• Every packet contains enough information
– Enables switch to decide how to forward it
• Switch translates global address to output port
– Maintains forwarding table for translations
• Each packet forwarded and travels independently
25
Datagram Switching / Forwarding
• No connection setup phase (connectionless model)
• Analogy:
– Postal system: each packet contains complete address for its destination
– Following signs (provided by switches) to reach destination
26
Datagram Switching
• Managing tables in large, complex networks with dynamically changing topologies is a real challenge for the routing protocol
(Figure: Hosts A through H attached to Switch 1, Switch 2 and Switch 3; each switch has ports 0 to 3.)
At switch 1:
Dest | Port#/Interface
A | 2
B | 1
C | 3
D | 0
E | 1
… | …
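A forwarding table like the one for switch 1 is, at its simplest, a lookup from destination address to output port. The sketch below uses the five entries shown on the slide; the names `FORWARDING_TABLE_SWITCH_1` and `forward` are hypothetical, and returning `None` stands in for discarding a packet whose destination is unknown.

```python
# Entries copied from the "At switch 1" table; the elided rows are left out.
FORWARDING_TABLE_SWITCH_1 = {"A": 2, "B": 1, "C": 3, "D": 0, "E": 1}


def forward(table, dest):
    """Return the output port for dest, or None if the switch has no entry."""
    return table.get(dest)
```

A packet for host C would leave on port 3, while a packet for an unlisted host such as K has no entry and would be dropped.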
27
Datagram Switching
• What happens if the destination is unknown ?
• Network discards packet
– Possibly notifying the sender (“no route to host”)
(Figure: the same topology with packets in flight for the source/destination pairs E to C, B to A, C to F, D to B and A to K; the destination K is unknown.)
28
Datagram Model
• No round trip time delay waiting for connection setup
– Host can send data anywhere, anytime as soon as it is ready
– Source has no way of knowing if the network is capable of delivering a packet or if the destination host is even up
• Packets are treated independently
– Possible to route around link and node failures dynamically
29
Datagram Model
• Every packet carries the full address of the destination
– Overhead per packet is higher than for the connection-oriented model
– Global address to path translation requires storage
– Might not be possible to deliver packet (dest unknown)
30
Virtual Circuit Switching
• Explicit connection setup (& tear-down) phase from source to destination: connection-oriented model
– Subsequent packets follow established circuit
• Supporting “connections” in network layer may be useful for service notions
31
Virtual Circuit Switching
• Each switch maintains a VC table (connection state) per-link or per-switch
• Analogy
– Phone call: each packet follows an established path
– Following a known route to reach the destination
32
VC Tables in VC Switching
• VC table contains information for each connection
– incoming / outgoing interface (port)
– incoming / outgoing VCI (virtual circuit identifier)
• Permanent (PVC) or switched (signaled) virtual circuit (SVC)
33
VC Tables in VC Switching
• Setup message in signaling process (to create VC table) is forwarded like a datagram
• Acknowledgment of connection setup is sent back to upstream neighbors to complete signaling
– Data transfer phase can start after ACK is received
34
Signaling in VC Switching
• Setup message is forwarded from Host A to Host B
• On connection request, each switch creates an entry in VC table with a VCI for the connection
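(Figure: Host A, Switch 1, Switch 2, Switch 3 and Host B in a chain, each switch with ports 0 to 3; a "setup B" message travels hop by hop from Host A to Host B.)
VC table entries created during setup, one per switch in path order (outgoing VCI not yet known):
I/F in | VCI in | I/F out | VCI out
2 | 5 | 1 |
3 | 9 | 0 |
2 | 7 | 3 |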
35
Signaling in VC Switching
• Host B accepts connection from Host A, and sends back an ACK
• In ACK, everyone communicates its choice of VCI to its upstream neighbor
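(Figure: the same chain; ACK 4 travels from Host B to Switch 3, ACK 7 from Switch 3 to Switch 2, ACK 9 from Switch 2 to Switch 1, and ACK 5 from Switch 1 to Host A.)
VC table entries, one per switch in path order (outgoing VCI filled in from the ACK):
I/F in | VCI in | I/F out | VCI out
2 | 5 | 1 | 9
3 | 9 | 0 | 7
2 | 7 | 3 | 4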
36
Data Transfer in VC Switching
• Host A knows that everything is in place all the way to Host B
• In data packets, each node then puts the VCI of its downstream neighbor
(Figure: the same chain; the packet is labeled data 5 leaving Host A, then data 9, data 7, and data 4 on arrival at Host B.)
VC table entries, one per switch in path order:
I/F in | VCI in | I/F out | VCI out
2 | 5 | 1 | 9
3 | 9 | 0 | 7
2 | 7 | 3 | 4
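The VCI swapping in the data transfer phase can be sketched as a table lookup at each switch. The three table rows come from the slide; assigning them to Switch 1, 2 and 3 in this order is an inference from the VCI chain 5, 9, 7, 4, and the names `VC_TABLES`, `forward_vc` and `trace` are hypothetical.

```python
# (incoming interface, incoming VCI) -> (outgoing interface, outgoing VCI)
VC_TABLES = {
    "switch1": {(2, 5): (1, 9)},
    "switch2": {(3, 9): (0, 7)},
    "switch3": {(2, 7): (3, 4)},
}


def forward_vc(switch, in_if, vci):
    """Look up the connection and return (outgoing interface, new VCI)."""
    return VC_TABLES[switch][(in_if, vci)]


def trace(path, first_vci):
    """path lists (switch, incoming interface); returns the VCI seen on each link."""
    vcis = [first_vci]
    vci = first_vci
    for switch, in_if in path:
        _, vci = forward_vc(switch, in_if, vci)  # header VCI is rewritten per hop
        vcis.append(vci)
    return vcis
```

Tracing a packet that Host A sends with VCI 5 through the three switches yields the labels 5, 9, 7, 4, matching the data packets in the figure.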
37
Virtual Circuit Model
• Typically wait full RTT for connection setup before sending first data packet
– Cannot avoid failures dynamically; must re-establish connection (old one is torn down to free storage space)
38
Virtual Circuit Model
• Each data packet contains only a small identifier, making the per-packet header overhead small
– Global address to path information still necessary
• Connection setup provides an opportunity to know network conditions and/or reserve resources
– Avoiding congestion but under-utilizing switch
39
Review Lecture 14
• Direct to indirect networks: switches provide indirect connectivity
• Motivation, approach, challenges
• Heterogeneity, scale, QoS, congestion
• Limits: scale, area; solution: illusion of one network
• Packet switches: multi-input multi-output, star
• Challenges: forwarding, routing, contention
• Forwarding: decision process at intersections, selecting appropriate output port
40
Review Lecture 14
• 3 approaches: DG, VC, source
• Datagram: packet contains enough information, switch translates address to output port, packets forwarded independently, no connection setup phase, no RTT delay, destination may be unknown, full address overhead, can route around failures
• Virtual circuit: connection setup, packets follow established circuit, switch maintains VC table, signaling (forwarded like a datagram) + ACK, wait 1 RTT before data, cannot avoid failures, small overhead, can reserve resources
1
Lecture No. 15
3
Source Routing
• Packet header contains sequence of addresses/ports on path from source to destination
– One direction per switch: port, next switch; (absolute)
– Switches read, use, and then discard directions
4
Source Routing
• All forwarding/topology information required to switch a packet is provided by source host
• Used in some system area networks (SANs)
• Directions may be rotated instead of discarding
5
Data Transfer in Source Routing
• Analogous to following directions
(Figure: Host A, Switch 1, Switch 2, Switch 3 and Host B, each switch with ports 0 to 3. Header labels shown in the figure: data 0 1 3; data 3 0 1; data 1 3 0; data 3 0 1; data 1 0 3; data 2 3 0 1.)
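The rotation variant of source routing can be sketched as follows. This assumes the convention that each switch reads the rightmost entry of the header and then rotates it to the front; the slides do not state which end is read, so that choice and the function name `next_hop` are assumptions.

```python
def next_hop(header):
    """Read the rightmost port, then rotate it to the front of the header.

    Returns (output port, header to send to the next switch).
    """
    port = header[-1]
    return port, [port] + header[:-1]
```

Starting from the header `[3, 0, 1]`, three switches in turn would pick ports 1, 0 and 3 and send on headers `[1, 3, 0]`, `[0, 1, 3]` and `[3, 0, 1]`, so the original list is restored after a full rotation. Under this assumption the sequence lines up with several of the header labels in the figure.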
6
Source Routing Model
• Source host needs to know the correct and complete topology of the network
– Changes must propagate to all hosts
• Packet headers may be large and variable in size: the length is unpredictable
7
Source Routing Model
• Each switch needs to correctly and efficiently manipulate the header information
– Rotation or stripping of address
– Pointer to current address
• Can be used in datagram or virtual circuit networks
8
Forwarding Performance
• Assume switch is
– General-purpose workstation
– With DMA support
– Multiple network adapters (NICs)
• Switching process
– Packet arrives on NIC 1
– NIC 1 DMAs packet into memory
– CPU looks at header, decides to send on NIC 2
– NIC 2 DMAs packet into NIC 2 memory
– Packet leaves via NIC 2
9
Implementation and Performance
• Packet arriving at interface 1 has to go on interface 2
• Point of contention for packets: I/O and memory bus
(Figure: CPU and main memory connected over the I/O bus to Interface 1, Interface 2 and Interface 3.)
10
Implementation and Performance
• The cost of processing small packets (parsing headers, deciding output port) dominates other restrictions
– Throughput = packets/sec x bits/packet
• Moving data from inputs to outputs in parallel may increase the aggregate throughput
• Potential bottlenecks
– I/O bus bandwidth
– Memory bus bandwidth
– Processor computing power
11
Bridges and Extended LANs
12
Building Extended LANs
• Traditional LAN
– Shared medium (e.g., Ethernet)
– Cheap, easy to administer
– Supports broadcast traffic
• Problem
– Want to scale LAN concept
• Larger geographic area (> O(1 km))
• More hosts (> O(100))
– But retain LAN-like functionality
• Solution: bridges
13
Bridges
• Connect two or more LANs with a bridge
– Transparently extends a LAN over multiple networks
– Accept & forward strategy (in promiscuous mode)
– Level 2 connection (does not add packet header)
(Figure: hosts A, B, C on the LAN attached to bridge port 1; hosts X, Y, Z on the LAN attached to bridge port 2.)
14
Bridges vs. Switches
• Switch
– Receive frame on input port
– Translate address to output port
– Forward frame
• Bridge
– Connect shared media
– All ports bidirectional
– Repeat subset of traffic
• Receive frame on one port
• Send on all other ports
15
Uses and Limitations of Bridges
• Extend LAN concept
• Limited scalability
– To O(1,000) hosts
– Not to global networks
• Not heterogeneous
– Some use of address, but
– No translation between frame formats
16
Learning Bridges
• Trivial algorithm
– Forward all frames on all (other) LANs
– Potentially heavy traffic & processing overhead
• Optimize by using address information
– “Learn” which hosts live on which LAN
– Maintain forwarding table
– Only forward when necessary (dest. not on same LAN)
– Reduces bridge workload
17
Learning Bridges
• Learn table entries based on source address
– Timeout entries to allow movement of hosts
• Table is an optimization; need not be complete
• Always forward broadcast frames
• Uses datagram or connectionless forwarding
(Figure: hosts A, B, C on the LAN attached to bridge port 1; hosts X, Y, Z on the LAN attached to bridge port 2.)
Host | Port
A | 1
B | 1
C | 1
X | 2
Y | 2
Z | 2
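The learning behaviour described above can be sketched as a small class. Everything about its shape is an assumption made for illustration: the class and method names, passing the current time in as `now`, and the 300-second default timeout. It learns from source addresses, floods frames for unknown destinations, and stays silent when the destination is on the arrival port.

```python
class LearningBridge:
    def __init__(self, ports, timeout=300.0):
        self.ports = set(ports)
        self.timeout = timeout      # assumed value; the slides give no number
        self.table = {}             # host -> (port, time the entry was learned)

    def receive(self, src, dst, in_port, now=0.0):
        """Process one frame and return the set of ports it is forwarded on."""
        # Learn (or refresh) where the sender lives.
        self.table[src] = (in_port, now)
        entry = self.table.get(dst)
        # Expire stale entries so that hosts can move between LANs.
        if entry is not None and now - entry[1] > self.timeout:
            del self.table[dst]
            entry = None
        if entry is None:
            # Unknown destination (this also covers broadcast): flood.
            return self.ports - {in_port}
        out_port = entry[0]
        # Destination is on the same LAN the frame came from: do not forward.
        return set() if out_port == in_port else {out_port}
```

With ports 1 and 2 as in the figure, a first frame from A to X is flooded to port 2, X's reply is sent only to port 1, and a later frame from B to A is not forwarded at all.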
18
Learning Bridges
• Problem
– Redundancy (desirable to handle failures, but …)
– Makes extended LAN structure cyclic
– Frames may cycle forever
• Solution: spanning tree
(Figure: extended LAN with LANs A through K interconnected by bridges B1 through B7, containing cycles.)
19
Spanning Tree
• Subset of forwarding possibilities
• All LANs reachable, but
• Acyclic
• Bridges run a distributed algorithm to calculate the spanning tree
– Select which bridges actively forward
– Developed by Radia Perlman of DEC
– Now IEEE 802.1 specification
– Reconfigurable algorithm
20
Spanning Tree Concept
• LAN’s and bridges make a bipartite graph
• Ports are edges connecting LAN’s to bridges
• Spanning tree required
– Connect all LANs: all vertices of graph are covered
– Can leave out bridges: all edges may not be covered
21
Spanning Tree Algorithm
• Each bridge has a unique, totally-ordered identifier
• Select bridge with lowest ID as root bridge
22
Spanning Tree Algorithm
• Each bridge determines
– Direction of shortest path to root (preferred port)
– For each connected LAN, is it the designated bridge?
• Select bridge on each LAN closest to root as designated bridge
• Use ID (lowest) to break ties
– Ports connecting LANs to designated bridges called designated ports
23
Spanning Tree Algorithm
• All designated bridges forward frames
– On all designated ports
– On preferred port (path leading to root)
(Figure: the extended LAN with LANs A through K and bridges B1 through B7, with a legend marking LANs, designated ports, preferred ports and designated bridges.)
24
Distributed Spanning Tree Algorithm
• Bridges exchange configuration messages
– ID for bridge sending the message
– ID for what the sending bridge believes to be root bridge
– Distance (hops) from sending bridge to root bridge
25
Distributed Spanning Tree Algorithm
• Initially, each bridge believes it is the root
– Sends a configuration message, and checks if any received message is better than the current best message
• Each bridge records current best configuration message for each port
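The slides do not spell out what makes one configuration message "better" than another. A minimal sketch, assuming the usual textbook rule (smaller root ID first, then smaller distance to the root, then smaller sender ID), is shown below; the class and function names are hypothetical.

```python
from typing import NamedTuple


class ConfigMessage(NamedTuple):
    # Field order is chosen so that plain tuple comparison applies the
    # assumed rule: root ID, then distance to root, then sender ID.
    root_id: int
    distance: int
    sender_id: int


def is_better(new: ConfigMessage, current: ConfigMessage) -> bool:
    """True if new should replace current as the best message for a port."""
    return new < current
```

For example, a message naming root 1 beats any message naming root 2 regardless of distance, and between two messages for the same root at the same distance, the one from the lower-numbered sender wins.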
26
Distributed Spanning Tree Algorithm
• Bridges forward configuration messages outward from root bridge i.e., on all designated ports
• Bridge assumes it is designated bridge for a LAN until it learns otherwise
(Figure: the extended LAN with LANs A through K and bridges B1 through B7.)
27
Algorithm Details
• In steady state, only designated bridges forward configuration messages
– Outward from root bridge, to all designated ports
– Until they learn they are not designated bridge
28
Algorithm Details
• In steady state, only root generates configuration messages periodically
– Timeout restarts algorithm (claiming “I am root …”)
• Although algorithm is reconfigurable, it is not possible to forward frames over alternative paths
29
Broadcast and Multicast
• Forward all broadcast/multicast frames to all preferred and designated ports
– Current practice
– Lets hosts decide whether or not to accept frame
30
Broadcast and Multicast
• Alternative: extend learning to handle groups
– Learn when no group members downstream
– Group members periodically identify themselves
– Accomplished by having each group member send a frame to the bridge with group address in source field
31
Limitations of Bridges
• Do not scale– Broadcast does not scale
– Spanning tree algorithm does not scale
• Do not accommodate heterogeneity
– Only support networks with same address formats
32
Limitations of Bridges
• Caution: beware of transparency
– Frame drop because of bridge congestion
– Large and variable latency between two hosts
– Frames may reorder in extended LANs
1
Lecture No. 16
3
Where we are now …
• Understand different ways to move through network (forwarding)
– Read signs at each switch (datagram)
– Follow a known path (virtual circuit)
– Carry instructions (source routing)
• Bridge approach to extending LAN concept
4
Where we are now …
• Next
– Example of a real network (ATM)
–How switches are built and contention within switches
5
ATM (Asynchronous Transfer Mode)
• Defined by ATM Forum (formed in Oct. 1991)
– Telephone industry (link providers to build networks)
– Data network industry
• High speed switching technology: right thing at right place at right time ? ? ?
6
ATM (Asynchronous Transfer Mode)
• Common in WANs, can also be used in LANs
– Competing technology with Ethernet, but areas of application only partially overlap
• Connection-oriented packet-switched network
– Virtual-circuit routing
• Typically implemented on SONET (other physical layers possible)
7
ATM (Asynchronous Transfer Mode)
• Signaling (connection setup) Protocol: Q.2931
– Discovering routes and allocating resources at switches
• ATM address format
– E.164 and NSAP (Network Service Access Point)
– Different from MAC addresses
8
ATM Signaling
• Connection setup called signaling (standard Q.2931)
• Route discovery, resource reservation, QoS, ...
• Send through network
– Request setup circuit
– Send setup frame on setup circuit
• Establish locally
– No intermediate switch involvement
– Requires pre-established virtual path
9
Cell Switching (ATM)
• Fixed length (53 bytes) frames are called cells
– 5-byte header (including 1-byte CRC-8) + 48-byte payload
• Standard defines 3 layers (5 sublayers)
– Layers interface to physical media and to higher layers (e.g., encapsulating variable-length frames)
10
Cell Switching (ATM)
• 2-level connection hierarchy
– Virtual circuits
–Virtual paths
•Bundles of virtual circuits
•Travel along common route
•Reduces forwarding information
11
Why Hierarchical Connections ?
• Simpler ...
• Setup
– New virtual circuits follow virtual path route
• Forwarding
– Virtual path identifier (VPI) used between switches (smaller forwarding table)
– VCI used for last hop (to host)
• Rerouting around failures
– Change virtual path once vs. 64k virtual circuits
12
Variable-length Frames …
• Small data is sent in a minimum-sized packet
– No need for extraneous padding
• Large file is sent by breaking it into many maximum-sized packets
– Low overhead (header to data bytes ratio), increasing bandwidth efficiency
– Minimize total number of packets sent
• Minimize per-packet processing
13
Drawbacks of Fixed-length Frames
• No optimally good fixed length
• Higher overhead for large frames
– Must be fragmented
– One header per fragment (less efficient)
• Low utilization for small frames
– Must be padded
14
Then Why Fixed-length Frames ?
• Require simpler hardware …
– Facilitates the impl of hardware switches
– Helpful building fast, highly scalable switches
• Easier to build hardware doing simple job: to process known-length frame
• Parallelism in processing stages: lots of switches doing same thing in parallel
– Analogy: processor instruction pipeline with variable-length stages
15
Why Short Frames ?
• Better behavior of non-preemptive queues
• Reduced granularity of preemption
– High-priority frame may wait for max-size frame
• Long frame (4kB) admits long wait
• Short frame limits wait
– Limits end-to-end jitter, or variance in latency
16
Why Short Frames ?
• Shorter queues
– Switches typically store and forward packets
– Cannot send until full packet arrives
– Short frames (fragmentation) allow first part to be sent while remainder arrives
17
Queuing Behavior Examples
• Consider 4kB vs. 53B frames, 100 Mbps link
• Preemption: high-priority frame arrives just as switch starts sending low-priority frame
– 4kB: wait for 4096 x 8 / 100 = 327.68 microseconds
– 53B: wait for 53 x 8 / 100 = 4.24 us
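The two waiting times above are just the serialization time of one frame on the link. A minimal sketch of that arithmetic (the function name is illustrative, not from the slides):

```python
def transmission_time_us(frame_bytes: int, link_mbps: float) -> float:
    """Time to put one frame on the wire, in microseconds.

    bits / (megabits per second) gives microseconds directly.
    """
    return frame_bytes * 8 / link_mbps


for size in (4096, 53):
    # Worst case: the high-priority frame waits for one full low-priority frame.
    print(f"{size} B frame on 100 Mbps: {transmission_time_us(size, 100):.2f} us")
```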
18
Queuing Behavior Examples
• Shorter queues: two chunks (or frames) arrive simultaneously at time 0
– 4kB: link is idle until all data arrive at time 327.68 microseconds; 8 kB left to send
– 53B: link nearly fully utilized (waits 4.24 microseconds); at time 327.68 microseconds, roughly 4kB left to send
19
Why 53-byte Frames ?
• Telephone community wish: carry voice effectively
• Demands that ATM improve latency for audio data
– Voice encoded at 64 kbps: 8-bit samples at 8 kHz
– Need full cell’s worth of samples before sending cell
• 1 Kbyte cells -> 125 ms per cell (human detectable)
• 53-byte cells imply 6 ms of data
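The 125 ms and 6 ms figures follow from how long a 64 kbps voice source takes to fill one cell. A small sketch, assuming "1 Kbyte" means 1000 bytes and that a 53-byte cell carries 48 bytes of payload:

```python
VOICE_BYTES_PER_SECOND = 8000  # 8-bit samples at 8 kHz = 64 kbps


def fill_time_ms(payload_bytes: int) -> float:
    """Milliseconds of speech needed to fill a payload of the given size."""
    return payload_bytes * 1000 / VOICE_BYTES_PER_SECOND


print(fill_time_ms(1000))  # 1 Kbyte cell
print(fill_time_ms(48))    # payload of a 53-byte ATM cell
```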
20
Why 53-byte Frames ?
– Smaller latency implies no need for echo cancellers
– Audio reconstruction
• Expect low rate of cell loss; can interpolate loss (6 ms)
• Compromise: 48 bytes = (32 (Europe) + 64 (US)) / 2
21
ATM Cell Format
• User-Network Interface (UNI)
– Host-to-switch format
– GFC: Generic Flow Control (still being defined)
– VCI/VPI: Virtual Circuit/Path Identifier
– Type: management, congestion control, AAL5 (later)
– CLP: Cell Loss Priority
– HEC: Header Error Check (CRC-8)
• Network-Network Interface (NNI)
– Switch-to-switch format
– GFC becomes part of VPI field
Field:        GFC  VPI  VCI  Type  CLP  HEC (CRC-8)  payload
Width (bits): 4    8    16   3     1    8            384 (48 bytes)
22
Segmentation and Reassembly
• Variable-length packets passed to ATM are often larger than 48 bytes
– May not fit in the ATM payload
• Fragmentation is required at the source
– High-level message into low-level packets
23
Segmentation and Reassembly
• Destination reassembles the fragments
– Transforms fragments back into the message
• ATM calls this procedure Segmentation and Reassembly (SAR)
24
Segmentation and Reassembly
• ATM Adaptation Layer (AAL)
– Application to ATM cell mapping
– AAL header contains information for reassembly
– AAL1, AAL2 for applications needing guaranteed rate
– AAL3/4 designed for variable-length packet data
– AAL5 is an alternative standard for packet data
[Figure: AAL above ATM at each end of the connection]
25
ATM Layers
• ATM Adaptation Layer (AAL)
– Convergence Sublayer (CS) supports different application service models
– Segmentation and Reassembly (SAR) supports variable-length frames
• ATM Layer
– Handles virtual circuits, cell header generation, flow control
• Physical layer
– Transmission Convergence (TC) handles error detection, framing
– Physical medium dependent (PMD) sublayer handles encoding
[Figure: layer stack: AAL (CS, SAR), ATM, PHY (TC, PMD)]
26
AAL 3/4
• Provides information to allow variable size packets to be sent in fixed-size ATM cells
• Convergence Sublayer Protocol Data Unit (CS-PDU)
– CPI: Common Part Indicator (version field)
– Btag/Etag: beginning and ending tags (same)
– BAsize: hint on reassembly buffer space to allocate
– Length: size of whole PDU
• Segmented into cells: header/trailer + 44-byte data
Field:        CPI  Btag  BAsize  payload   Pad   0  Etag  Length
Width (bits): 8    8     16      < 64 KB   0-24  8  8     16
27
ATM Cell Format for AAL 3/4
• Type (is-start? and is-end? bits)
– BOM (10): beginning of message
– COM (00): continuation of message
– EOM (01): end of message
– SSM (11): single-segment message
• SEQ: sequence number (for cell loss/reordering)
• MID: multiplexing ID (mux onto virtual circuits)
• Length: number of bytes of PDU in this cell
Field:        ATM header  type  seq  MID  payload         length  CRC-10
Width (bits): 40          2     4    10   352 (44 bytes)  6       10
28
Encapsulation and Segmentation for AAL3/4
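[Figure: user data (< 64 KB) is wrapped in a CS-PDU header (4 bytes) and a CS-PDU trailer (4-7 bytes), then cut into 44-byte pieces; the last piece may be < 44 bytes and is padded. Each piece becomes a cell payload with an AAL header, an AAL trailer and an ATM header.]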
29
AAL 3/4 Summary
• Many factors limit the utilization to 83%
• Only 44 bytes of data in 53 bytes of cell
• CS-PDU encapsulation further reduces the efficiency (header-to-payload ratio)
• Partial filling of last cell also impacts the effective utilization
• Lot of overhead to perform simple function of segmentation and reassembly …
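To see how the three factors above combine, here is a rough sketch of AAL 3/4 efficiency for one packet. It assumes a 4-byte CS-PDU header, a 4-byte trailer and 0-3 bytes of padding that round the user data up to a 32-bit boundary; the function names are illustrative.

```python
import math

CELL_BYTES = 53          # 5-byte ATM header + 48-byte payload
DATA_PER_CELL = 44       # 48 bytes minus AAL 3/4 per-cell header/trailer
CS_PDU_HEADER = 4
CS_PDU_TRAILER = 4


def aal34_cells(user_bytes: int) -> int:
    """Number of cells needed to carry one packet over AAL 3/4."""
    pad = (-user_bytes) % 4  # assumption: pad aligns the data to 32 bits
    cs_pdu = CS_PDU_HEADER + user_bytes + pad + CS_PDU_TRAILER
    return math.ceil(cs_pdu / DATA_PER_CELL)


def aal34_efficiency(user_bytes: int) -> float:
    """Fraction of transmitted bytes that are user data."""
    return user_bytes / (aal34_cells(user_bytes) * CELL_BYTES)


print(f"upper bound: {DATA_PER_CELL / CELL_BYTES:.1%}")
print(f"1500-byte packet: {aal34_cells(1500)} cells, {aal34_efficiency(1500):.1%}")
```

The 83% figure is only the per-cell upper bound; any real packet size lands below it once the CS-PDU wrapper and the partly filled last cell are counted.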
30
AAL 5
• AAL 5 designed as replacement for AAL3/4
• No PDU information in ATM header
– Only uses 1 bit of ATM header to mark end of PDU
– Does not require additional per-cell headers/trailers
– No additional layer of multiplexing in a single VC
• CS-PDU contains data with only 8-byte trailer
– Uses stronger error detection at PDU level (CRC-32)
– Protection against lost, corrupt and misordered cells is provided by CS-PDU
31
AAL 5 CS-PDU
• CS-PDU Format
– Pad so trailer always falls at the end of ATM cell
– Length: size of PDU (data only)
– CRC-32 (detects missing or misordered cells)
• Cell Format
– End-of-PDU bit in Type field of ATM header
Field: data      pad           reserved  length   CRC-32
Width: < 64 KB   0 - 47 bytes  2 bytes   2 bytes  32 bits
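The padding rule ("pad so trailer always falls at the end of ATM cell") can be sketched as follows, assuming the 8-byte trailer and 48-byte cell payload given on the slides. The function name is illustrative.

```python
CELL_PAYLOAD = 48  # bytes of payload per ATM cell
TRAILER = 8        # reserved (2) + length (2) + CRC-32 (4)


def aal5_layout(data_bytes: int) -> tuple[int, int]:
    """Return (pad_bytes, cell_count) for one AAL 5 CS-PDU."""
    # Pad so that data + pad + trailer is a whole number of cell payloads,
    # which puts the trailer in the last 8 bytes of the last cell.
    pad = (-(data_bytes + TRAILER)) % CELL_PAYLOAD
    cells = (data_bytes + pad + TRAILER) // CELL_PAYLOAD
    return pad, cells


for size in (40, 41, 1500):
    pad, cells = aal5_layout(size)
    print(f"{size} data bytes: pad {pad}, {cells} cell(s)")
```

A 40-byte packet fits one cell exactly, while 41 bytes needs the maximum 47 bytes of padding and a second cell.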
32
Encapsulation and Segmentation for AAL 5
[Figure: user data followed by padding and the CS-PDU trailer, cut into 48-byte cell payloads, each preceded by an ATM header]
1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 17
3
Virtual Paths with ATM
• Two-level hierarchy of virtual connections: 8-bit VPI and 16-bit VCI
– Switches in the public network use 8-bit VPI
– Corporate sites use full 24-bit address (VPI + VCI)
– Much less connection-state info in switches
– Virtual path: fat pipe with bundle of virtual circuits
[Figure: Network A and Network B connected through the public network]
4
Physical Layers for ATM
• ATM may run over several physical media
• ATM was assumed to run over SONET but both are entirely separable entities
• ATM frame boundaries must be correctly identified
– Successive 53-byte ATM frames in payload
– SONET overhead byte points to the payload
– Another way is to calculate CRC (5th byte of the cell)
5
ATM and LANs
• ATM grew out of the telephone community and was later used for computer communication
• Significant advantage of performance and better scalability of switched over shared media
• No distance limitation in ATM making it a good choice for high-performance LAN backbone
• Point-to-point, long distance Gigabit Ethernet is a competing technology with ATM
6
ATM as a LAN Backbone
• Different from traditional LANs; no native support for broadcast or multicast
E1
H5
H6
H7
H1E3
H2
H4
H3E2
ATM linksEthernet links
Ethernet sw itch
ATM sw itchATM-attachedhost
7
ATM in a LAN
• How to broadcast to all nodes on an ATM LAN ?
– Without knowing all the addresses
– Without setting up VC to all of them
8
ATM in a LAN
• Two solutions
– Redesign protocols that assume LAN properties ATM does not provide (e.g. ATMARP)
– Make ATM behave like shared media, without losing performance advantage of switched media (e.g. LANE)
• ATM address is different from a unique 48-bit MAC address
9
Shared Ethernet Emulation with LANE
• All hosts think they are on the same Ethernet
LANE / EthernetAdaptor Card
HH
H
HH
EthernetSwitchATM Switch
LANE / EthernetAdaptor Card
HH
H
HH
EthernetSwitchATM Switch
10
LAN Emulation (LANE) with ATM
• Transparent shared media emulation of ATM
• Adds (not changes) functionality to ATM switches
• Each device needs a global MAC address, as well as an ATM address to establish a VC
11
LAN Emulation (LANE) with ATM
• Devices connect as LAN Emulation Clients (LEC)
• LANE provides Ethernet-like interface to LECs
• Similar solutions for other networks: VPNs on WANs, VLANs on large, switched Ethernets
12
ATM / LANE Protocol Layers
Higher-layerprotocols
(IP, ARP, . . .)
Signalling+ LANE
AAL5
ATM
PHY
ATM
PHY PHY
Higher-layerprotocols
(IP, ARP, . . .)
Signalling+ LANE
AAL5
ATM
Host Switch Host
PHY
Ethernet-likeinterface
13
Clients and Servers in LANE
• LAN Emulation Client (LEC)
– Host, bridge, router or switch
• LAN Emulation Server (LES)
– Maintains client’s MAC and ATM addresses
– Maintains ATM address of BUS
14
Clients and Servers in LANE
• LAN Emulation Configuration Server (LECS)
– High-level network management when LEC starts up
– Reachable by preset VC (recall known server port#)
– Maintains mapping of ATM address to LANE type
15
Clients and Servers in LANE
• Broadcast and Unknown Server (BUS)
– Emulates broadcast and multicast; critical to LANE
– Uses point-to-multipoint VC with all clients
• Servers physically located in one or more devices
[Figure: hosts H1 and H2, LECS, LES and BUS attached to the ATM network, linked by point-to-point VCs and a point-to-multipoint VC]
16
LANE Registration
1. Client contacts LECS on predefined VC, and sends ATM address to it
2. LECS returns LAN type, MTU and ATM address of LES
3. Client signals connection to LES, and registers MAC and ATM addresses with LES
4. LES returns ATM address of BUS
5. Client signals connection to BUS
6. BUS adds client to point-to-multipoint VC
ATM Network
LECS
LES BUS
H1 H2
H3
17
LANE Circuit Setup
1. Client (H1) knows destination MAC address of receiver (H2)
2. Client (H1) sends 1st packet to BUS
3. BUS sends address resolution request to LES
4. LES returns ATM address to client (H1)
5. Client (H1) signals connection to H2 for subsequent packets
ATM Network
LECS
LES BUS
H1 H2
H3
Switches: The Intersections
19
The Intersections
Design intersection to accommodate traffic flows
RawalpindiSaddar
Airport
Pir Wadhai
IslamabadZero Point
Rawal Dam
FaizabadFaizabad FlyoverAyub
Park
20
Contention in Switches
• Some packets destined for same output
– One goes first
– Others delayed or dropped
• Delaying packets requires buffering
– Finite capacity, some packets must still drop
– At inputs
• Increases/adds false contention
• Sometimes necessary
– At outputs
– Can also exert “backpressure”
21
Output Buffering
1x6 Switch
x
a
Standard check-in linesCustomer
service
trying to check-inyou Mr. X
writing complaint
letter
Mr. A waiting to
claim refund of Rs.100
22
Input Buffering: Head-of-line Blocking
1x6 Switch
x
a
Standard check-in linesCustomer
service
trying to check-in
you
Mr. X writing
complaint letter
Mr. A waiting to
claim refund of Rs.100
agents are standing by !
23
Backpressure
1x6 Switch
x
a
Standard check-in linesCustomer
service
trying to check-inyou i
“no more,
please”
propagation delay requires that switch exerts backpressure before buffer is full; thus used
in networks with small propagation delay
24
Backpressure
• Propagation delay requires that switch 2 exert backpressure at high-water mark rather than when buffer completely full
• It is thus typically only used in networks with small propagation delays (e.g., switch fabrics)
Switch 1 Switch 2
“no more, please”
25
Switching Hardware
• Multi-input multi-output device, getting packets from inputs to the outputs as fast as possible
• Performance of a switch is limited by I/O bus bandwidth (each packet traverses the bus twice)
– 1 Gbps I/O bus can support ten T3 (45 Mbps) links, three STS-3 (155 Mbps) links, and not even one STS-12 (622 Mbps) link
• Success or failure of a new protocol depends on whether it takes advantage of the switch’s capabilities
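The I/O bus example can be checked with a few lines. This is a sketch using the nominal link rates quoted on the slide; because every packet crosses the bus twice, only half of the bus bandwidth is usable for forwarding.

```python
def fits_on_bus(bus_mbps: float, link_mbps: float, n_links: int) -> bool:
    """True if n_links at link_mbps fit within the usable half of the bus."""
    usable = bus_mbps / 2  # each packet crosses the bus twice (in and out)
    return n_links * link_mbps <= usable


print(fits_on_bus(1000, 45, 10))   # ten T3 links: 450 Mbps
print(fits_on_bus(1000, 155, 3))   # three STS-3 links: 465 Mbps
print(fits_on_bus(1000, 622, 1))   # one STS-12 link: 622 Mbps
```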
26
Switching Fabric
• Special-purpose (switching) hardware
• General problem
– Connect N inputs to M outputs (NxM switch)
– Often N=M (bidirectional links)
• Design goals
– High throughput: want aggregate close to MIN (sum of inputs, sum of outputs)
– Avoid contention (fabric faster than ports)
– Good scalability: linear size/cost growth in N/M
27
Switching Fabric and Ports
Inputport
Inputport
Inputport
Inputport
Outputport
Outputport
Outputport
Outputport
Fabric
Switchfabric
SwitchFabric
Avoid contention
here
28
Switch: Fabric and Ports
Fabric has a job to deliver packets to the right output
Inputport
Inputport
Inputport
Inputport
Outputport
Outputport
Outputport
Outputport
FabricSwitchfabric
(with small internal
buffering)
29
Ports and Fabric
• Ports deal with the complexity of the real world
– Virtual circuit management is handled in ports
– Determine output port using forwarding tables
• Input port is the first in performance bottlenecks
– Header processing and handing the packet to the fabric
30
Ports and Fabric
• Buffering is required at ports– Buffer management has profound
impact on performance
– Internal (in fabric) or output buffering is normally used
• Fabric: simply move packets from inputs to outputs
31
Design Goals - Throughput
• An n x m switch can provide max ideal throughput of S = S1 + S2 + ……… + Sn
– Only possible if traffic at inputs is evenly distributed across all outputs
– Sustained throughput higher than link speed of output is not possible
32
Design Goals - Throughput• Variable size packets affect performance
– Some operations have constant overhead per packet
– Switch performs differently for different sizes of packets
– Packet per second (pps) rate is also important
• Most switches are subject to internal contention
– Determine performance under different traffic loads
33
Design Goals - Throughput
• Traffic models are important to throughput– Arrival time, output port, packet length
– Extremely difficult to achieve accurate models
– Traffic-modeling very successful in telephony
• Designers now expect high range of throughputs
– In order to handle a steady stream of 64-byte packets, a 40 Gbps switch needs a rate of 78M pps !!!
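The 78M pps figure is the link rate divided by the packet size in bits. A one-function sketch (the name is illustrative):

```python
def packets_per_second(link_bps: float, packet_bytes: int) -> float:
    """Packet rate needed to keep a link full of fixed-size packets."""
    return link_bps / (packet_bytes * 8)


pps = packets_per_second(40e9, 64)
print(f"{pps / 1e6:.1f} M packets per second")
```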
34
Design Goals - Scalability
• Cost of hardware rises fast as the number of ports n increases
– Adding ports increases hardware & design complexity
– Scalability in terms of rate of increase in cost
• Design complexity determines maximum switch size
– Switch designs run into problems at some maximum number of inputs and outputs
35
Switch Performance
• Avoid contention with buffering
– Use output buffering when possible
– Apply backpressure through fabric
– Input buffering with “peeking” (non-FIFO semantics) to reduce head-of-line blocking problems
– Drop packets if input buffer overflows
• Good scalability
– O(N) ports
– Port design complexity O(N) gives O(N^2) for switch
– Port design complexity O(1) gives O(N) for switch
36
Crossbar (“Perfect”) Switch
• Problem: hardware scales as O(N^2)
37
Knockout Switch: Pick L from N
• Problem: what if more than L arrive
1
2
3
4
OutputsInputs
2x2 random selector
delay unit
8-to-4 concentrator
38
Shared Memory Switch
Mux Buffer memory Demux
Writecontrol
Readcontrol
Inputs Outputs
… …
39
Self-Routing Fabrics
• Use source routing on “network” within switch
• Input port attaches output port number as header
• Fabric routes packet based on output port
• Types– Banyan network
– Batcher-Banyan network
– Sunshine switch
40
Banyan Network
• No contention if inputs are sorted and unique
[Figure: switching stages ordered from MSB to LSB; each element sends 0 bit up, sends 1 bit down]
41
Banyan Network
• Sends 0 bit up, 1 bit down
001
011
110
111
001
011
110
111
MSB LSB
42
Batcher (Merge Sort) Network
Routing packets through a Batcher network
• Batcher-Banyan Network
– Attach the two back-to-back
– Arbitrary unique permutations routed without contention
7 3
3 7
3 3
6 6
3 1
1 3
6 6
1 1
7 1
1 7
6 6
7 7
Sort Merge Merge
43
Batcher-Banyan Network
sends 1 bit upsends 0 bit down
sends 0 bit upsends 1 bit down
44
Sunshine Switch
• Like a Knockout switch, except
• Recirculates overflow packets i.e., when more than L arrive in one cycle
Delay
Inputs Batcher Trap SelectorOutputs
nnn
n
kk
n + kn + kl banyans
nnn(marks
overflow packets)
1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 18
3
Internet Protocol (IP)
• Network protocol for the Internet
• Operates on all hosts and routers (routers connect distinct networks into the Internet)
[Figure: protocol graph: FTP, HTTP, NV, TFTP, … over TCP and UDP; TCP and UDP over IP; IP over FDDI, Ethernet, ATM]
4
Internetwork
• Concatenation of networks
• Protocol stack
Network 1
Ethernet
Network 3
FDDI
Network 4
Ethernet
R1
R2
R3
H8H2 H3
H1
H4
H5
H6 H7
Network 2
Point-to-
point
R1
H1
TCP
IP
ETH ETH PPP
IP
R2
PPP FDDI
IP
R3
FDDI ETH
IP
H8
TCP
IP
ETH
5
IP Addresses
– 18.10.5.22: host in class A network (MIT)
– 130.126.143.254: host in class B network (UIUC)
– 192.12.70.111: host in class C network
• More recent classes
– Multicast (class D): starts with 1110
– Future expansions (class E): starts with 1111
Class  Leading bits  Network                   Host
A      0             7 bits (126 nets)         24 bits (16 million hosts)
B      1 0           14 bits (16k nets)        16 bits (64K hosts)
C      1 1 0         21 bits (2 million nets)  8 bits (256)
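Because the class is encoded in the leading bits, it can be read off the first octet alone. A minimal sketch (the function name is illustrative), checked against the example addresses on this slide:

```python
def address_class(dotted: str) -> str:
    """Classify a dotted-quad IPv4 address by its leading bits."""
    first_octet = int(dotted.split(".")[0])
    if first_octet < 128:   # 0xxxxxxx
        return "A"
    if first_octet < 192:   # 10xxxxxx
        return "B"
    if first_octet < 224:   # 110xxxxx
        return "C"
    if first_octet < 240:   # 1110xxxx
        return "D"
    return "E"              # 1111xxxx


for addr in ("18.10.5.22", "130.126.143.254", "192.12.70.111"):
    print(addr, "is class", address_class(addr))
```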
6
Datagram Format
• 4-bit version (4 for IPv4, 6 for IPv6)
• 4-bit header length (in words, minimum of 5)
• 8-bit type of service (TOS); more or less unused
• 16-bit datagram length (in bytes)
• 8-bit protocol (e.g., TCP=6 or UDP=17)
Version HLen TOS Length
Ident Flags Offset
TTL Protocol Checksum
SourceAddr
DestinationAddr
Options (variable) Pad(variable)
0 4 8 16 19 31
Data
7
Datagram Format
• Fragmentation support
– 16-bit packet ID (identifies packet fragments)
– 3-bit flags; one bit marks last fragment
– 13-bit fragment offset into packet (in 8-byte words)
• 8-bit “time-to-live” (TTL); a hop count until forced destruction of packet
8
Datagram Format
• 16-bit IP checksum on header
• 32-bit source IP address
• 32-bit destination IP address
• Options and padding (variable length)
– Source-based routing (typically disabled)
– Record route
– Several others
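The fixed 20-byte part of the header described on the last three slides can be unpacked with Python's `struct` module. This is a sketch only: the dictionary keys and the sample values are made up for illustration, and options and checksum verification are ignored.

```python
import socket
import struct

IPV4_FIXED_HEADER = "!BBHHHBBH4s4s"  # network byte order, 20 bytes


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed part of an IPv4 header into named fields."""
    (ver_hlen, tos, length, ident, flags_offset,
     ttl, protocol, checksum, src, dst) = struct.unpack(IPV4_FIXED_HEADER, raw[:20])
    return {
        "version": ver_hlen >> 4,            # upper 4 bits
        "hlen_words": ver_hlen & 0x0F,       # header length in 32-bit words
        "tos": tos,
        "length": length,                    # whole datagram, in bytes
        "ident": ident,
        "flags": flags_offset >> 13,         # top 3 bits
        "offset_bytes": (flags_offset & 0x1FFF) * 8,  # stored in 8-byte units
        "ttl": ttl,
        "protocol": protocol,                # 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Hypothetical header: a middle fragment with offset 64 (512 bytes), TCP payload.
sample = struct.pack(IPV4_FIXED_HEADER, 0x45, 0, 532, 7, (1 << 13) | 64, 64, 6, 0,
                     socket.inet_aton("18.10.5.22"),
                     socket.inet_aton("192.12.70.111"))
header = parse_ipv4_header(sample)
print(header)
```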
1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 19
3
Fragmentation and Reassembly Example
Path: H1, R1, R2, R3, H8
H1 to R1 (ETH):  IP (1400)
R1 to R2 (FDDI): IP (1400)
R2 to R3 (PPP):  IP (512), IP (512), IP (376)
R3 to H8 (ETH):  IP (512), IP (512), IP (376)

Packet        Ident  More-fragments flag  Offset  Data bytes
Unfragmented  x      0                    0       1400
Fragment 1    x      1                    0       512
Fragment 2    x      1                    64      512
Fragment 3    x      0                    128     376
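The fragment sizes and offsets in this example can be reproduced with a short sketch. It assumes a 20-byte header and a link MTU of 532 bytes (an assumption chosen so that each fragment carries 512 data bytes, as on the slide); the function name is illustrative.

```python
def fragment(data_len: int, mtu: int, header_len: int = 20) -> list[tuple[int, int, int]]:
    """Split a datagram's data into fragments.

    Returns (offset in 8-byte units, more-fragments flag, data bytes) tuples.
    """
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    position = 0
    while position < data_len:
        size = min(max_data, data_len - position)
        more = position + size < data_len
        fragments.append((position // 8, int(more), size))
        position += size
    return fragments


for offset, more, size in fragment(1400, mtu=532):
    print(f"Offset={offset:3d}  M={more}  {size} data bytes")
```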
Datagram Forwarding
4
Network Number Next Hop
1 R3
2 R1
3 Interface 1
4 Interface 0
5
Datagram Forwarding
Network # Netmask Next Hop/Port
18.0.0.0     255.0.0.0     1
128.32.0.0   255.255.0.0   2
0.0.0.0      0.0.0.0       3
dest: 18.26.10.0
– mask with 255.0.0.0: matched! send to port 1
dest: 128.16.14.0
– mask with 255.0.0.0: not matched
– mask with 255.255.0.0: not matched
– mask with 0.0.0.0: matched! send to port 3
6
ARP Packet Format
0             8             16                          31
Hardware type = 1           | Protocol Type = 0x0800
HLen = 48     | PLen = 32   | Operation
SourceHardwareAddr (bytes 0 – 3)
SourceHardwareAddr (bytes 4 – 5) | SourceProtocolAddr (bytes 0 – 1)
SourceProtocolAddr (bytes 2 – 3) | TargetHardwareAddr (bytes 0 – 1)
TargetHardwareAddr (bytes 2 – 5)
TargetProtocolAddr (bytes 0 – 3)
7
ARP in ATM
H2
R
H1
LIS 10
LIS 12
ATM netw ork
10.0.0.2
10.0.0.1
12.0.0.3
12.0.0.5
8
Internet Control Message Protocol (ICMP)
• IP companion protocol (not necessary)
• Handles error and control messages
[Figure: protocol graph: FTP, HTTP, NV, TFTP, … over TCP and UDP; TCP and UDP over IP, with ICMP alongside IP; IP over FDDI, Ethernet, ATM]
9
ICMP Message
• Sent to the source when a node is unable to process IP datagram successfully
• Error messages
– Destination unreachable (protocol, port, or host)
– Reassembly failed
– IP Checksum failed; or invalid header
– TTL exceeded (so datagrams don’t cycle forever)
– Cannot fragment
• Control messages
– Echo (ping) request and reply
– Redirect (from router to source host, to change route)
10
Dynamic Host Configuration Protocol - DHCP
• DHCP server is required to provide configuration information to each host
– Each host retrieves this information on bootup
• DHCP server can be configured manually, or it may allocate addresses on demand
– Addresses are “leased” for some period of time
• Hosts are not preconfigured with the DHCP server’s address; each performs DHCP server discovery
– A broadcast discovery message is sent by the host and a unicast reply is sent by the server
11
DHCP Server
DHCPrelay
DHCPserver
Other networks
Unicast to server
Broadcast
Host
1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 20
3
Virtual Private Networks - VPN
C
A
Corporation X private netw ork
B
K L
M
Corporation Y private netw ork
C
A B
K L
M
Physical links
Physical links
Virtual circuits
(a)
(b)
4
IP Tunneling
IP Multicast
IP Multicast
IP Multicast
VU UET
MAJU
IPv4
Data IP
Data IP IP
5
IP Tunnel in VPNs
• Virtual point-to-point link between a pair of nodes separated by many networks
IP header,Destination = 2.x
IP payload
IP header,Destination = 10.0.0.1
IP header,Destination = 2.x
IP payload
IP header,Destination = 2.x
IP payload
Network 1 R1 Internetwork Network 2R2
10.0.0.1
6
Routing
• A stranger appears and asks “Airport ?”
• Which way do you point ?
Rawalpindi
Airport
Pir WadhaiIslamabad
Rawal Dam
Faizabad
7
Routing Overview
4
3
6
21
9
1
1D
A
FE
B
C
8
Bellman-Ford Algorithm
• For every node N
– For each directed pair (B,C)
• Is the path B → N → … → C better than B → … → C ?
• Is cost B → N → dest smaller than previously known?
• For N nodes
– Uses an NxN matrix of (distance, successor) values
9
Bellman-Ford Algorithm
• After n iterations, nodes at distance n hops along the shortest path have correct information
6 2
2 5
1 1 1
Source
Destination
A B
C E
Infinity Infinity Infinity
InfinityInfinity
Infinity Infinity 1 Dest
5 DestInfinity
1 Dest3 BInfinity
5 Dest7 E
1 Dest3 B8 C
4 A7 E
8 C 3 B 1 Dest
4 A6 E
10
Dijkstra’s Algorithm
• From node N
– Start with S = {N} and one-hop paths from N
– Loop n-1 times
• Add closest outside node M to S
• For each node P not in S
– Is the path N ..... M ..... P better than N ..... P ?
11
Dijkstra’s Algorithm
1
1
1
2 22
223
3 3
3
4
4
6
6
77
810
9
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 21
Distance Vector Routing Example
Information in routing table of each node: Iteration 1
At     Distance to reach node
node   A  B  C  D  E  F  G
A      0  1  1  x  1  1  x
B      1  0  1  x  x  x  x
C      1  1  0  1  x  x  x
D      x  x  1  0  x  x  1
E      1  x  x  x  0  x  x
F      1  x  x  x  x  0  1
G      x  x  x  1  x  1  0
D
G
A
F
E
B
C
Distance Vector Routing Example
Information in routing table of each node: Iteration 2
At     Distance to reach node
node   A  B  C  D  E  F  G
A      0  1  1  2  1  1  2
B      1  0  1  2  2  2  x
C      1  1  0  1  2  2  2
D      2  2  1  0  x  2  1
E      1  2  2  x  0  2  x
F      1  2  2  2  2  0  1
G      2  x  2  1  x  1  0
D
G
A
F
E
B
C
Distance Vector Routing Example
Information in routing table of each node: Iteration 3
At     Distance to reach node
node   A  B  C  D  E  F  G
A      0  1  1  2  1  1  2
B      1  0  1  2  2  2  3
C      1  1  0  1  2  2  2
D      2  2  1  0  3  2  1
E      1  2  2  3  0  2  3
F      1  2  2  2  2  0  1
G      2  3  2  1  3  1  0
D
G
A
F
E
B
C
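The three tables can be reproduced by simulating synchronous exchanges of distance vectors between neighbors. This is a sketch: the link list is read off the iteration 1 table (every distance of 1 is a direct link), each link is assumed to cost 1, and all names are illustrative.

```python
INF = float("inf")  # printed as "x" in the tables
NODES = "ABCDEFG"
LINKS = [("A", "B"), ("A", "C"), ("A", "E"), ("A", "F"),
         ("B", "C"), ("C", "D"), ("D", "G"), ("F", "G")]


def neighbors(node):
    return [b if a == node else a for a, b in LINKS if node in (a, b)]


def initial_tables():
    """Iteration 1: each node knows only itself and its direct neighbors."""
    return {n: {d: 0 if d == n else (1 if d in neighbors(n) else INF)
                for d in NODES} for n in NODES}


def exchange(tables):
    """One round: every node hears each neighbor's previous vector."""
    new = {n: dict(row) for n, row in tables.items()}
    for n in NODES:
        for nb in neighbors(n):
            for d in NODES:
                new[n][d] = min(new[n][d], 1 + tables[nb][d])
    return new


it1 = initial_tables()
it2 = exchange(it1)
it3 = exchange(it2)
for name, table in (("Iteration 2", it2), ("Iteration 3", it3)):
    print(name)
    for n in NODES:
        print(n, " ".join("x" if table[n][d] == INF else str(table[n][d])
                          for d in NODES))
```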
Distance Vector Routing Table
D
G
A
F
E
B
C
Destination Cost NextHop
A 1 A
C 1 C
D 2 C
E 2 A
F 2 A
G 3 A
Distance Vector Routing: Link Failure
• F detects that link to G has failed
• F sets distance to G to infinity and sends update to A
• A sets distance to G to infinity since it uses F to reach G
• A receives periodic update from C with 2-hop path to G
• A sets distance to G to 3 and sends update to F
• F decides it can reach G in 4 hops via A
D
G
A
F
E
B
C
Count to Infinity Problem
• Link from A to E fails
• A advertises distance of infinity to E, but B and C advertise a distance of 2 to E !
• B decides it can reach E in 3 hops; advertises this to all
• A decides it can reach E in 4 hops; advertises this to all
• C decides that it can reach E in 5 hops…
• We are counting to infinity …
D
G
A
F
E
B
C
Split Horizon
A B C
D
C : 1 : C
C : 2 : B
C : ∞ : -C : 2 : B
Loop of > 2 nodes fails split horizon !!!
10
Reliable Flooding
X A
C B D
(a)
X A
C B D
(b)
X A
C B D
(c)
X A
C B D
(d)
11
Route Calculation: Dijkstra’s Shortest Path Algorithm
• Let
– N denote the set of nodes in the graph
– l(i, j) denote the non-negative cost (weight) for edge (i, j)
– s denote this node
– M denote the set of nodes incorporated so far
– C(n) denote the cost of the path from s to node n

M = {s}
for each n in N - {s}
    C(n) = l(s, n)    // calculate cost to each node
while (M != N)
    M = M union {w} such that C(w) is the minimum for all w in (N - M)
    for each n in (N - M)
        C(n) = MIN(C(n), C(w) + l(w, n))
12
Link State Algorithm
1. Initialize confirmed with entry for self (cost = 0)
2. For newly added node (next), select its LSP
3. For each neighbor of next, calculate cost to reach neighbor as the sum of cost from self to next and from next to neighbor
   1. If neighbor is currently in neither confirmed nor tentative, add <neighbor, cost, nexthop> to tentative, where nexthop is the direction to reach next
   2. If neighbor is currently in tentative and cost is less than current cost for neighbor, then replace current entry with <neighbor, cost, nexthop>, where nexthop is the direction to reach next
4. If tentative is empty, stop. Otherwise pick entry from tentative with the lowest cost, move it to confirmed and return to step 2.
Route Calculation
At node D
Confirmed list Tentative list
1. (D,0,-)
2. (D,0,-) (C,2,C), (B,11,B)
3. (D,0,-), (C,2,C) (B,11,B)
4. (D,0,-), (C,2,C) (B,5,C), (A,12,C)
5. (D,0,-), (C,2,C), (B,5,C) (A,12,C)
6. (D,0,-), (C,2,C), (B,5,C) (A,10,C)
7. (D,0,-), (C,2,C), (B,5,C), (A,10,C)
[Figure: network of nodes A, B, C, D with link costs A-B 5, A-C 10, B-C 3, B-D 11, C-D 2]
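The confirmed/tentative procedure can be traced in code. The sketch below runs it from node D; the link costs (A-B 5, A-C 10, B-C 3, B-D 11, C-D 2) are inferred from the route calculation table, and the function and variable names are illustrative.

```python
def forward_search(graph, source):
    """Build {node: (cost, nexthop)} using confirmed and tentative lists."""
    confirmed = {source: (0, None)}
    tentative = {}
    nxt = source
    while True:
        cost_to_next, hop_to_next = confirmed[nxt]
        for neighbor, link_cost in graph[nxt].items():
            if neighbor in confirmed:
                continue
            cost = cost_to_next + link_cost
            # The next hop is the first node on the path out of the source.
            nexthop = neighbor if nxt == source else hop_to_next
            if neighbor not in tentative or cost < tentative[neighbor][0]:
                tentative[neighbor] = (cost, nexthop)
        if not tentative:
            return confirmed
        # Move the cheapest tentative entry to confirmed and continue from it.
        nxt = min(tentative, key=lambda node: tentative[node][0])
        confirmed[nxt] = tentative.pop(nxt)


graph = {
    "A": {"B": 5, "C": 10},
    "B": {"A": 5, "C": 3, "D": 11},
    "C": {"A": 10, "B": 3, "D": 2},
    "D": {"B": 11, "C": 2},
}
routes = forward_search(graph, "D")
for node, (cost, nexthop) in routes.items():
    print(f"({node},{cost},{nexthop or '-'})")
```

The final confirmed list matches step 7 of the table: every destination is reached from D through C.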
© Copyright 2001-2006. Amir Qayyum. All rights reserved. 1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 22
OSPF Routing Protocol
• Authentication of routing messages
– Encrypted communication between routers
• Additional hierarchy
– Domains are split into areas
– Routers only need to know how to reach every node in their own area
– Routers need to know how to get to the right area
• Load balancing
– Allows traffic to be distributed over multiple routes
Original ARPANET Metric
• Uniform 56 kbps lines
– Bandwidth equal on every line (hence irrelevant)
– Latency relatively unimportant
• Use queue length as distance (number of packets waiting to use a link)
Problems
– Uniform bandwidth assumption became invalid
– Latency comparable to 1kB transmission delay on 1.544 Mbps link
How to Make Routing Scale
• Flat versus Hierarchical Addresses
• Inefficient use of hierarchical address space– Class C with 2 hosts (0.78% efficient)
– Class B with 256 hosts (0.39% efficient)
• Still Too Many Networks– Routing tables do not scale
– Route propagation protocols do not scale
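The efficiency percentages above are the number of hosts in use divided by the size of the host part of the address. A short sketch (the function name is illustrative):

```python
def utilization(hosts_used: int, host_bits: int) -> float:
    """Fraction of a network's host address space that is actually used."""
    return hosts_used / 2 ** host_bits


print(f"Class C with 2 hosts:   {utilization(2, 8):.2%}")
print(f"Class B with 256 hosts: {utilization(256, 16):.2%}")
```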
Mobile IP
Internetwork
Foreign agent(12.0.0.6)
Mobile host(10.0.0.9)
Home agent(10.0.0.3)
Home network(Network 10)
Sending host
IP tunnel
(Network 12)
Mobile IP - Challenges
• How does the MH get a local IP address ?
• How does the HA intercept a packet that is destined for the MH ?
• How does the HA then deliver the packet to the FA ?
• How does the FA deliver the packet to the MH ?
1
CS716
Advanced Computer Networks
By Dr. Amir Qayyum
Lecture No. 23
Where we are - What we understand
• Concepts of networking & network programming
– Elements of networks: nodes and links
– Building a packet abstraction on a link
• Issues related to bits/frames sent on a single link
– How to detect transmission errors in a frame after encoding and framing it
– How to simulate a reliable channel (sliding window)
– How to arbitrate access to shared media in any network
• Design issues of direct link networks
– Functionality of network adaptors
Where we are – What we understand
• How switches may provide indirect connectivity
– Different ways to move through a network (forwarding)
– Bridge approach to extending LAN concept
– Example of a real virtual circuit network (ATM)
– How switches are built and contention within switches
• How different networks work together
– (Common connecting protocol IP) and its service model
– Routing protocols in simple networks
• Next: large networks working together
Internetworking
• Reading: Peterson and Davie, Chapter 4
• Global internets (scale)
• Dealing with global scale– Virtual geography and addresses
– Hierarchical routing
– Multicast traffic
– MPLS
• Future internetworking: IPv6
How Should Hierarchy be Defined ?
• Hierarchy reduces information requirements for forwarding
Internetworking
• Main problems addressed by internetworking– Heterogeneity– Scale
• Bridges scale to level of medium organization
• IP addresses most end-to-end heterogeneity issues
• Today: issues of global scale
Global Internets
• Evolution of Internet structure
• Virtual geographies: networks, domains, and translations
• Problems of scale
• IP address hierarchy evolution
• Notion of autonomous systems (AS’s)
• Common intradomain routing protocols
• Standard interdomain routing protocol
Diameter of the Internet
• Connect all nodes of a network in a big ring
– Network diameter will be half the ring circumference
• Adding more links brings down network diameter
– Diameter can be quarter of the ring with one link …
How many links should be added to bring the diameter below some threshold ?
Internet Structure in Recent Past 1990
• Tree structure, centered around one backbone
• Interconnection of thousands of “organizations”
– End users and service providers
• Scalability of routing protocols in global Internet
– Network entities are administratively independent
NSFNET backboneStanford
BARRNETregional
BerkeleyPARC
NCAR
UA
UNM
Westnetregional
UNL KU
ISU
MidNetregional
Internet Structure Today
• Multiple backbone service providers
• Arbitrary graph structure
• Different QoS in different backbones (commercial)
Backbone service provider
Peeringpoint
Peeringpoint
Large corporation
Large corporation
Smallcorporation
“Consumer” ISP
“Consumer” ISP
“Consumer” ISP
Smallcorporation
Virtual Geographies
• Internet domain names (understood by people)
• IP network numbers (understood by routers)
• Internet domain name hierarchy
– ASCII strings separated by periods; typical formats:
• host.domain.domain_type
– xyz.carepvtltd.com
• host.subdomain.domain.domain_type
– abc.cs.berkeley.edu
• host.domain.domain_type.country
– www.case.edu.pk
Virtual Geographies and IP Address
• Routers use IP network numbers
• Original intention:
– One network number per physical network
• Closer to real geography (than domain names)
• Some jumps, e.g.,
– Divisions of one company
– Organizations with network research testbeds
IP address: Class | Network | Host
Virtual Geographies and DNS
• Translation from domain name to IP address
– Domain to IP: Domain Name Service (DNS)
– IP to domain: relies on host
• Not secure
• Use reverse lookup (domain to IP) for security
• DNS is a hierarchical, (mostly) distributed application (more detail later)
Scaling Problems for the Internet
• Inefficient address allocation
• Too many networks for routing
• Can trade off between these two
• Questions
– What network(s) should you allocate to a company with 1000 machines ?
– What about a company with 200 machines ?
– What about a company with 2 machines that plans to grow rapidly ?
Scaling Problems for the Internet
• Pressure primarily on class B networks
– Most companies plan to grow beyond 255 machines
– Renumbering is a hassle and can interrupt service
– Only around 16,000 class B networks available (14 bits)
• Class B networks aren’t very efficient
– Few organizations have O(10,000) machines
– More likely a network uses O(1,000) of 65,000 addresses
• Scaling problems with alternatives
– Multiple table entries if class C networks used instead
– Protocols do not scale beyond O(10,000) networks
IP Address Hierarchy Evolution
• Begin with class-based system
• Subnetting within an organization
– Network can be broken into smaller networks
– Recognized only within the organization
– Implemented by packet-switching
– Smaller networks called subnets
Class A: 0     | network (7 bits)  | host (24 bits)
Class B: 1 0   | network (14 bits) | host (16 bits)
Class C: 1 1 0 | network (21 bits) | host (8 bits)
Subnetting
• Another level to address/routing hierarchy: subnet
• Subnet masks define variable partition of host part
• Subnets visible only within site (close to each other)
IP address:           Class | Network | Host
Subnetted IP address: Class | Network | Subnet | Host
Subnet mask:          1111111111111111 11111111 00000000
[Figure: non-contiguous mask, whose 1 bits are not all adjacent]
Forwarding table at router R1
Subnet Number Subnet Mask Next Hop
128.96.34.0 255.255.255.128 interface 0
128.96.34.128 255.255.255.128 interface 1
128.96.33.0 255.255.255.0 R2
Subnetting Example
All hosts have address & mask = subnet address
Subnet 128.96.34.0, mask 255.255.255.128: H1 (128.96.34.15), R1 (128.96.34.1)
Subnet 128.96.34.128, mask 255.255.255.128: R1 (128.96.34.130), H2 (128.96.34.139), R2 (128.96.34.129)
Subnet 128.96.33.0, mask 255.255.255.0: R2 (128.96.33.1), H3 (128.96.33.14)
Forwarding Algorithm
D = destination IP address
for each entry (SubnetNum, SubnetMask, NextHop)
    D1 = SubnetMask & D
    if D1 = SubnetNum
        if NextHop is an interface
            deliver datagram directly to D
        else
            deliver datagram to NextHop
• Comparisons in parallel, taking the longest match
• Use a default router if nothing matches
• Not necessary for 1s in subnet mask to be contiguous
• Can put multiple subnets on one physical network
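A minimal sketch of this lookup, using the forwarding table of router R1 from the subnetting example. The entries are scanned in order and the first match wins, which is enough here because R1's subnets do not overlap; a real router would take the longest match. The `lookup` and `to_int` names and the `default` parameter are illustrative.

```python
import ipaddress

# (SubnetNum, SubnetMask, NextHop) entries for router R1.
R1_TABLE = [
    ("128.96.34.0", "255.255.255.128", "interface 0"),
    ("128.96.34.128", "255.255.255.128", "interface 1"),
    ("128.96.33.0", "255.255.255.0", "R2"),
]


def to_int(dotted: str) -> int:
    """Convert a dotted-quad address or mask to a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def lookup(dest: str, table=R1_TABLE, default=None):
    """Return the next hop for dest, or default if no entry matches."""
    d = to_int(dest)
    for subnet_num, subnet_mask, next_hop in table:
        if d & to_int(subnet_mask) == to_int(subnet_num):
            return next_hop
    return default


for dest in ("128.96.34.15", "128.96.34.139", "128.96.33.14", "10.0.0.1"):
    print(dest, "->", lookup(dest, default="default router"))
```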
Subnetting
• Subnets allow hierarchy within organizations
• But do not reduce class B address pressure