IT For Engineers
Application and Transport Layers
INFO 203, Week #6
Dr. Jennifer Booker

Application Layer
The Application Layer is the reason the rest of the network exists – to serve applications
Most of the software familiar to end users consists of applications
  Email, FTP, newsgroups, chat, the Web, streaming video, video conferencing, IPTV, etc.
We focus first on key concepts related to the Application Layer, then discuss some specific applications briefly

Application Layer
New applications designed for network implementation need to decide whether the application is based on
  Client-server architecture
  Peer to peer (P2P)
  Or some hybrid combination of the two

Client-server Architecture
In client-server architecture, the server
  Handles requests from many clients, and
  Is generally always available
  Often has a fixed IP address
Clients generally don’t communicate with each other, and may be on or off independently of each other and the server
Client-server applications include email, FTP, the Web, remote login

P2P Architecture
P2P architecture assumes the clients are on or off at will, and all are treated equally as potential servers and/or clients
  Apps include BitTorrent, Skype, and IPTV
Client-server and P2P combinations exist, called a Hybrid Architecture

Process Communication
Any network application (no matter which architecture) needs to communicate between hosts using processes
  In this sense, a process is a program running on a client, server, or peer host
Processes may communicate with other processes on the same host; this is controlled by the host’s operating system (OS)
We are interested in processes that communicate between hosts

Process Communication
Processes exchange messages
  The sending or client process creates a message and sends it into the network
  The receiving or server process gets the message from the network and might reply
Notice that "client process" and "server process" here describe only the relative roles in sending a message, not the client-server or other architectures mentioned earlier

Addressing Processes
For the server process to get the message, it has to be addressed correctly
The host address and receiving process are the key parts of the address
  The host address is its IP address (the 32- or 128-bit address of the host’s network interface)
  The receiving process is identified by its port number, since many processes can be running at once

Addressing Processes
[Figure: a client process and a server process, each reached through a socket and port at its host's IP address, connected across the Internet by TCP or UDP and the lower layers]
Sockets send packets; ports listen for them

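The addressing idea above can be sketched with Python's standard socket module. This is a minimal illustration, not part of the original slides: the loopback address 127.0.0.1, the use of port 0 (which asks the OS for any free port), and the helper name run_echo_server are all assumptions made for the example.

```python
import socket
import threading


def run_echo_server(server_sock: socket.socket) -> None:
    """Accept one connection and echo back whatever arrives first."""
    conn, _client_addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


# The server process binds its socket to an (IP address, port) pair.
# Port 0 lets the OS pick a free port; a real service would use a fixed one.
server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_sock.bind(("127.0.0.1", 0))
server_sock.listen(1)
server_ip, server_port = server_sock.getsockname()

worker = threading.Thread(target=run_echo_server, args=(server_sock,))
worker.start()

# The client process addresses the server with the same (IP address, port) pair.
with socket.create_connection((server_ip, server_port), timeout=5) as client_sock:
    client_sock.sendall(b"hello")
    reply = client_sock.recv(1024)

worker.join()
server_sock.close()
print(reply)
```

The client never names the server process directly; the IP address selects the host and the port number selects the process listening there.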
Port Number
Port numbers follow default values, set by the IANA, unless specified otherwise
  21 = FTP
  23 = Telnet
  25 = SMTP
  53 = DNS
  80 = HTTP, http://mine.com implies http://mine.com:80
  110 = POP3
  194 = IRC, and hundreds more

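The "http://mine.com implies http://mine.com:80" rule can be sketched in a few lines. The table below copies only the ports listed on the slide, and the function name effective_port is a hypothetical helper for illustration.

```python
from urllib.parse import urlsplit

# Default ports copied from the slide; a small subset of the IANA registry.
DEFAULT_PORTS = {
    "ftp": 21,
    "telnet": 23,
    "smtp": 25,
    "dns": 53,
    "http": 80,
    "pop3": 110,
    "irc": 194,
}


def effective_port(url: str) -> int:
    """Return the explicit port in the URL, or the default for its scheme."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("http://mine.com"))
print(effective_port("http://mine.com:8080"))
```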
More Protocols
Application-layer protocols define how a particular application’s processes are structured
  What types of messages are allowed
  The syntax of those messages
  The meaning of the fields in the syntax
  Rules for processing messages – when and how to send messages, how to reply, etc.

Application vs its protocols
A single application often needs to use several application-layer protocols
  A web browser might use HTTP, but also FTP, telnet, gopher, etc.
  An email application might use POP3, SMTP, IMAP, etc.
Many app protocols are defined in RFCs
But many application-layer protocols are proprietary

RFC Summary
The “Internet Official Protocol Standards” RFC used to identify the current standards (STD) for every protocol
  As a result of RFC 7100, that information is on a website: http://www.rfc-editor.org/search/standards.php
  For example, STD 9 is the standard for FTP

Application Services
The transport layer connects the application layer to everything else
You have a choice of two protocols, TCP and UDP, unless you want to write your own!
Key services include
  Reliable data transfer – how important is it? Or is your app loss-tolerant?

Application Services
  How much bandwidth or throughput does your app need?
    Does sending rate have to equal receiving rate?
    Some apps are elastic – can tolerate wide ranges of available bandwidth
  How sensitive is your app to timing?
    Games and telephony tend to be sensitive to slow or erratic transmission delays
  How important is security?

TCP Services
TCP provides a connection-oriented service, where the sockets of the client and server recognize a connection for the duration of the session
  Connection is duplex – messages can go both ways at once
  TCP is highly reliable – the bits leaving one side all get to the other side, and get put back in the original order

TCP Services
TCP also provides congestion control, for the benefit of the Internet
  This throttles the sending processes when the connection is congested, and can limit bandwidth
TCP does not guarantee any level of transmission rate, or provide delay guarantees
So you’ll get your data across, but we don’t know when

UDP Services
UDP is a lightweight protocol – meaning it doesn’t do much!
  UDP is connectionless
  UDP is unreliable – data may never get there
  UDP packets may arrive out of order, and UDP does not detect it
  There are no transmission rate guarantees

Services NOT Provided
TCP and UDP do not provide guarantees of throughput or timing
TCP does nothing for security per se, but SSL can be added on
  See Chapter 7 in INFO 331

Application Protocols
We’ll examine protocols for Internet-based applications
  HTTP
  FTP
  SMTP
  POP3
  IMAP
  DNS

HTTP
The HyperText Transfer Protocol (HTTP) is the heart of the Web
  Defined by RFCs 1945 (v1.0) and 2616 (v1.1)
  Has client and server programs which communicate via HTTP messages
Web pages contain objects – files of various sorts, such as a base HTML file, which cites JPG and/or GIF images, etc.
The app that uses HTTP is a browser

HTTP
A Web server houses the objects
  Apache and Microsoft Internet Information Services (IIS) are common Web server apps
HTTP defines the messages that pass between client and server
  Uses TCP for transport protocol
  HTTP has no memory of previous actions (a stateless protocol) – so if you ask for a file 126 times, it will send the file 126 times

HTTP vs HTML
Don’t confuse HTTP with HTML
  HTTP is the protocol used to define how files are requested and transferred between server and clients
  HTML is the format of web pages
So an HTML file might be the structure of an entity body transferred using HTTP

HTTP Messages
HTTP messages are of two types, request messages (from client) and response messages (from server)
  All HTTP messages are plain ASCII text
‘Both types of message consist of a start-line, zero or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and possibly a message-body.’ [RFC 2616, para 4.1]
  CRLF is a “carriage return and line feed”

HTTP Messages
There are many headers which could appear in requests or responses
  Cache-Control, Connection, Date, Pragma, Trailer, Transfer-Encoding, Upgrade, Via, and/or Warning [RFC 2616, para 4.5]
Disclaimer: RFC 2616 is 176 pages long – so we’re just providing a summary!

HTTP Requests
Request messages have a variable number of lines, depending on the method called
General request syntax is
  Method Request-URI HTTP-Version
Methods are OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, or CONNECT [RFC 2616, para 5.1.1]
  Most commonly used is GET
Request-URI is the desired Uniform Resource Identifier (URI, commonly called a URL)

HTTP Requests
HTTP-Version is what it sounds like, e.g. HTTP/1.1
There are many possible request headers
  Accept, Accept-Charset, Accept-Encoding, Accept-Language, Authorization, Expect, From, Host, If-Match, If-Modified-Since, If-None-Match, If-Range, If-Unmodified-Since, Max-Forwards, Proxy-Authorization, Range, Referer, TE (extension transfer-codings), and/or User-Agent [RFC 2616, para 5.3]

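As a rough sketch of the request syntax just described (a request line, header lines, then an empty line), the following builds a GET request as bytes. The function name build_request, the path /index.html, and the User-Agent value are illustrative assumptions, not anything defined by the slides or by RFC 2616.

```python
def build_request(method: str, request_uri: str, headers: dict[str, str]) -> bytes:
    """Assemble 'Method Request-URI HTTP-Version', headers, and the blank line."""
    lines = [f"{method} {request_uri} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Each line ends with CRLF; an extra CRLF marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request(
    "GET",
    "/index.html",
    {"Host": "mine.com", "User-Agent": "example-client"},
)
print(request.decode("ascii"))
```

A GET request normally carries no entity, so the message ends right after the empty line.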
HTTP Responses
HTTP responses go from server to client
General syntax starts with
  HTTP-Version Status-Code Reason-Phrase [RFC 2616, para 6.1]
The Status-Code could be dozens of values
  "200" OK
  "403" Forbidden
  "404" Not Found
The Reason-Phrase is any text phrase assigned

HTTP Responses
Response headers can include
  Accept-Ranges, Age, ETag, Location, Proxy-Authenticate, Retry-After, Server, Vary, and/or WWW-Authenticate [RFC 2616, para 6.2]
Responses usually include entities, unless the HEAD method was used

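A response can be taken apart along the same lines: status line, headers, empty line, entity. The sketch below is a simplified parser for illustration only; parse_response and the sample message are made up for the example, and a real client would use a library such as http.client instead.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into version, status, reason, headers, body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the empty line ends the headers
    lines = head.decode("ascii").split("\r\n")
    version, status_code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, int(status_code), reason, headers, body


raw = (
    b"HTTP/1.1 404 Not Found\r\n"
    b"Server: example\r\n"
    b"Content-Length: 9\r\n"
    b"\r\n"
    b"Not here."
)
version, status, reason, headers, body = parse_response(raw)
print(version, status, reason, headers, body)
```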
HTTP Entities
An entity is the object sent or returned with an HTTP message
  Entities can be with requests or responses
Entity headers include Allow, Content-Encoding, Content-Language, Content-Length (bytes), Content-Location, Content-MD5, Content-Range, Content-Type, Expires, Last-Modified, and/or extension-header [RFC 2616, para 7.1]
  Where extension-header is any allowable message-header for that kind of message

HTTP
So HTTP describes request and response message formats
Both types typically have a first line which tells its purpose (the request or status line)
  There can be many header lines
  There might be an entity attached

FTP
The File Transfer Protocol is one of the oldest Internet applications (now RFC 959, but started as RFC 114 in 1971)
While HTTP and FTP both send files
  FTP uses two connections – one for control, one for data (control information is out-of-band)
    User login and commands are on the control connection, files move on the data connection
  HTTP uses one connection for both purposes (control information is in-band)

FTP
FTP uses TCP, and usually connects to the server on ports 20 (data) and 21 (control)
The client sends user ID and password
  Some sites allow FTP with a generic ID, known as anonymous FTP
Once logged in, the user may navigate and view directories, and upload (STOR or PUT) or download (RETR or GET) files

Electronic Mail
E-mail is another ancient Internet application, with origins in RFC 772 in 1980
It provides asynchronous text communication and allows files to be attached to messages
  Even voice and video messages
Main elements are users (sender and recipient), mail servers, and the Simple Mail Transfer Protocol (SMTP, RFC 5321)
  Careful, there’s also an SNTP for network time

Electronic Mail
Email is composed in a client, which sends it to a mail queue in the sender’s mail server
The sending mail server uses SMTP to send the message to the recipient’s mail server
  If mail can’t be sent successfully, the sender’s mail server will put the message in a queue, and keep trying (typically for 3 days)
The recipient is notified that the message is present, which they read with their client

Electronic Mail
Each user has a mailbox on the mail server
  Access to the mailbox is controlled with user name and password
SMTP is the main protocol to get email from one mail server to another
  It uses TCP, not surprisingly
  Defined in draft standard RFC 5321
  Only uses 7-bit ASCII for message AND body
    Forces binary files to be converted to ASCII & back

Mail Message Formats
Email contains header information defined by RFC 822, now RFC 5322 “Internet Message Format”
  The sender headers can include: FROM, SENDER, REPLY-TO, RESENT-FROM, RESENT-SENDER, and RESENT-REPLY-TO
  Receiver headers can be: TO, CC, and BCC
  Reference headers can be: MESSAGE-ID, IN-REPLY-TO, REFERENCES and KEYWORDS

MIME
Multipurpose Internet Mail Extensions (MIME) are used for handling non-ASCII contents in email, e.g. non-Latin character sets, binary files, images, audio, video, etc.
MIME (RFC 2045) adds the ability to handle
  (1) textual message bodies in character sets other than US-ASCII,
  (2) an extensible set of different formats for non-textual message bodies,
  (3) multi-part message bodies, and
  (4) textual header information in character sets other than US-ASCII.

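To make the MIME idea concrete, here is a small sketch using Python's standard email package: a text body plus a binary attachment becomes a multi-part message whose binary part is encoded as ASCII (base64) for transport. The addresses, subject, and file name are placeholder assumptions.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.com"       # placeholder address
msg["To"] = "recipient@example.com"      # placeholder address
msg["Subject"] = "Report attached"
msg.set_content("The report is attached.")

# Arbitrary binary data standing in for a file; MIME encodes it as ASCII.
binary_data = bytes(range(256))
msg.add_attachment(
    binary_data,
    maintype="application",
    subtype="octet-stream",
    filename="report.bin",
)

wire_text = msg.as_string()
attachment = list(msg.iter_attachments())[0]
print(msg.get_content_type())
print(attachment["Content-Transfer-Encoding"])
```

Printing wire_text shows the Content-Type headers and boundary lines that separate the parts, which is what travels over 7-bit SMTP.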
MIME
The received message also includes a Received: header added to the top of the message
  This is familiar in email if you look at the full headers

Mail Access Protocols
If you log directly into your email server, SMTP is all you need to handle email
But if you wish to access email from a local host, you need to use a mail access protocol
The biggies at present are
  Post Office Protocol version 3 (POP3) and
  Internet Message Access Protocol (IMAP)

POP3
POP3 is defined in RFC 1939
  It’s a pretty simple protocol compared to many
SMTP sends mail between mail servers, and from the user agent (email app) to their mail server
POP3 transfers mail from your mail server to your user agent
From a user’s view, SMTP handles outgoing email, and POP3 handles incoming email

IMAP
IMAP, defined in RFC 3501, allows folders to be defined on the mail server to organize email there
  Messages are associated with a folder – first the generic INBOX, then moved by the user
  Hence state information about the folder for each message must be saved across sessions
IMAP also provides search capability within the mailbox

DNS
A key need, once the Internet grew beyond a few thousand hosts, was to automate converting human-readable* addresses or hostnames (www.microsoft.com) to IP addresses (207.46.198.60)
That is the purpose of the Domain Name System (DNS)
  Before DNS, really big lookup tables were used!
* Humans who read English, at least!

Host vs Domain Names
A hostname is the name of a particular host computer, such as banner.drexel.edu
  May really represent multiple computers, but logically they are all the same host
A domain name is the top level domain and the specific domain name, like drexel.edu
Top level domains are com, edu, gov, mil, org, net, etc. and the country codes uk, de, fr, etc.

IP Addresses
IPv4 addresses have four bytes, each from 0 to 255, separated by periods
  Why called bytes? Each value from 0 to 255 corresponds to a value from 0 to (2^8 - 1), and a byte is eight bits
IP addresses are typically static (fixed) for servers and other semi-permanent Internet connections, and dynamic for temporary connections (e.g. dial-up, wireless)

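The "four bytes" view can be checked directly: shifting each group in as one byte yields the 32-bit number that the dotted notation stands for. The helper name dotted_quad_to_int is an assumption for this sketch; the standard ipaddress module is used only to cross-check the result.

```python
import ipaddress


def dotted_quad_to_int(address: str) -> int:
    """Pack the four 0-255 groups of an IPv4 address into one 32-bit integer."""
    groups = address.split(".")
    if len(groups) != 4:
        raise ValueError("an IPv4 address has exactly four groups")
    value = 0
    for group in groups:
        byte = int(group)
        if not 0 <= byte <= 255:  # each group must fit in one byte (2**8 - 1)
            raise ValueError(f"group out of range: {group}")
        value = (value << 8) | byte
    return value


as_int = dotted_quad_to_int("207.46.198.60")
print(as_int, as_int == int(ipaddress.IPv4Address("207.46.198.60")))
```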
DNS
DNS runs over UDP, port 53 (something uses UDP!)
DNS is managed by DNS servers, typically running Berkeley Internet Name Domain (BIND) software
DNS is used by other applications (HTTP, SMTP, FTP) to translate host names to IP addresses
  You can also do a reverse DNS lookup (convert 205.188.97.2 to www-vd03.evip.aol.com)

DNS
DNS also provides other key services
  Host aliasing allows the true or canonical hostname to have aliases
    When blah.com works to get to www.blah.com, it’s because blah.com is a host alias of www.blah.com
  Mail server aliasing – same concept, but for mail server names
  Load distribution across many servers for the same hostname – so everyone in the world doesn’t use one IP address for microsoft.com

DNS Lookup
This would be terribly tedious without caching
  Common queries are stored on each level of DNS server, so they don’t have to be looked up constantly
  Cached values are cleared typically every two days or less, in case the data changes

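The caching behavior can be sketched as a small wrapper that remembers answers until they expire. Everything here is hypothetical scaffolding for illustration: CachingResolver, the injected lookup function, and the lookup table are not a real DNS API, and real DNS servers take the lifetime from each record rather than one fixed value.

```python
import time
from typing import Callable


class CachingResolver:
    """Remember lookup results and reuse them until they expire."""

    def __init__(
        self,
        lookup: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup      # the slow, authoritative lookup
        self._ttl = ttl_seconds    # how long a cached answer stays valid
        self._clock = clock
        self._cache: dict[str, tuple[str, float]] = {}

    def resolve(self, hostname: str) -> str:
        now = self._clock()
        cached = self._cache.get(hostname)
        if cached is not None and cached[1] > now:
            return cached[0]       # still fresh: no lookup needed
        address = self._lookup(hostname)
        self._cache[hostname] = (address, now + self._ttl)
        return address


# Hypothetical lookup table standing in for a real DNS query.
TABLE = {"snip.net": "216.83.103.123"}
two_days = 2 * 24 * 60 * 60
demo_resolver = CachingResolver(TABLE.__getitem__, ttl_seconds=two_days)
print(demo_resolver.resolve("snip.net"))
```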
nslookup
The command nslookup provides basic IP data for a hostname or domain
  nslookup snip.net
  Server: ns2.snip.net
  Address: 209.204.64.3
  Name: snip.net
  Address: 216.83.103.123
A registrar makes changes to the DNS database
  The list of registrars is at http://www.internic.net/

Transport Layer
The Transport Layer handles logical communication between processes
  It’s the last layer not used between processes for routing, so it’s the last thing a client process and the first thing a server process sees of a packet
By logical communication, we recognize that the means used to get between processes, and the distance covered, are irrelevant

Transport vs Network
Notice we didn’t say ‘hosts’ in the previous slide…that’s because
  The network layer provides logical communication between hosts

Two Choices
Here we choose between TCP and UDP
  In the transport layer, a packet is a segment
  In the network layer, a packet is a datagram
The network layer is home to the Internet Protocol (IP)
  IP provides logical communication between hosts
  IP makes a “best effort” to get segments where they belong – no guarantees of delivery, or delivery sequence, or delivery integrity

IP
Each host has an IP address
The common purpose of UDP and TCP is to extend delivery of IP data to the host’s processes
  This is called transport-layer multiplexing and demultiplexing
Both UDP and TCP also provide error checking
  That’s it for UDP – data delivery and error checking!

TCP
TCP also provides reliable data transfer (not just data delivery)
  Uses flow control, sequence numbers, acknowledgements, and timers to ensure data is delivered correctly and in order
TCP also provides congestion control
  TCP applications share the available bandwidth (they watched Sesame Street!)
  UDP takes whatever it can get (greedy little protocol)

Segment Header
Hence the segment header starts with the source and destination port numbers
  Each port number is a 16-bit (2 byte) value (0 to 65,535)
  Well known port numbers are from 0 to 1023 (2^10 - 1)
After the port numbers are other headers, specific to TCP or UDP, then the message

UDP
The most minimal transport layer has to do multiplexing and demultiplexing
UDP does this and a little error checking and, well, um, that’s about it!
  UDP was defined in RFC 768
  An app that uses UDP almost talks directly to IP
  Adds only two small data fields to the header, after the requisite source/destination port numbers
  There’s no handshaking; UDP is connectionless

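The whole UDP header fits in one struct call: two port numbers plus the two small fields mentioned above, which RFC 768 defines as length and checksum. This is a sketch only; build_udp_header is a hypothetical helper, the client port 50000 is arbitrary, and the checksum is left at 0 (which in IPv4 means "not computed") rather than calculated.

```python
import struct


def build_udp_header(source_port: int, dest_port: int, payload: bytes,
                     checksum: int = 0) -> bytes:
    """Pack the four 16-bit UDP header fields in network byte order."""
    length = 8 + len(payload)  # header (8 bytes) plus data
    return struct.pack("!HHHH", source_port, dest_port, length, checksum)


payload = b"example query"
header = build_udp_header(50000, 53, payload)  # 50000 is an arbitrary client port
segment = header + payload
print(len(header), struct.unpack("!HHHH", header))
```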
UDP for DNS
DNS uses UDP
A DNS query is packaged into a segment, and is passed to the network layer
  The DNS app waits for a response; if it doesn’t get one soon enough (times out), it tries another server or reports no reply
Hence the app must allow for the unreliability of UDP, by planning what to do if no response comes back

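The "time out, then try another server" plan might look like the sketch below. It does not speak the real DNS message format; query_with_fallback, the 512-byte receive size, and the half-second timeout are assumptions chosen only to show how an application copes with UDP's lack of delivery guarantees.

```python
import socket


def query_with_fallback(servers, payload: bytes, timeout: float = 0.5):
    """Send payload to each (host, port) in turn; return the first reply or None."""
    for server in servers:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            try:
                sock.sendto(payload, server)
                reply, _sender = sock.recvfrom(512)
                return reply
            except OSError:
                # Timed out or destination unreachable: UDP will not retry
                # for us, so the application moves on to the next server.
                continue
    return None  # report "no reply" to the caller
```

The caller decides what a None result means, for example showing an error or retrying later.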
UDP Advantages
Still UDP is good when:
  You want the app to have detailed control over what is sent across the network; UDP changes it little
  No connection establishment delay
  No connection state data in the end hosts; hence a server can support more UDP clients than TCP
  Small packet header overhead per segment
    TCP uses 20 bytes of header data, UDP only 8 bytes

UDP Apps
Other than DNS, UDP is also used for
  Network management (SNMP)
  Routing (RIP)
  Multimedia & telephony (proprietary protocols)
  Remote file server (NFS)
The lack of congestion control in UDP can be a problem when lots of large UDP messages are being sent – can crowd out TCP apps

Checksum
Noise in the transmission lines can lose bits of data or rearrange them in transit
Checksums are a common method to detect errors (RFC 1071)
To create a checksum:
  Find the sum of the 16-bit words of the message, using 1s complement addition
  The checksum is the 1s (ones) complement of the sum
  If message is uncorrupted, sum of message plus checksum is all ones 1111111111111…

1s Complement?
The 1s complement is a mirror image of a binary number – change all the zeros to ones, and ones to zeros
  So the 1s complement of 00101110101 is 11010001010
UDP does error checking because not all lower layer protocols do error checking
  This provides end-to-end error checking, since it’s more efficient than every step along the way

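The checksum steps can be sketched as follows, in the style of RFC 1071: add the message as 16-bit words with any carry wrapped back in, then flip the bits. The function names and the sample message are assumptions for illustration; the message is kept to an even number of bytes so that appending the checksum keeps the words aligned.

```python
def ones_complement(bits: str) -> str:
    """Flip every bit of a binary string, e.g. '0010' -> '1101'."""
    return "".join("1" if b == "0" else "0" for b in bits)


def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit words, wrapping any carry back into the sum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def internet_checksum(data: bytes) -> int:
    """The checksum is the 1s complement of the 1s complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


message = b"INFO 203"  # sample data, even length
checksum = internet_checksum(message)
received = message + checksum.to_bytes(2, "big")
# An uncorrupted message plus its checksum sums to sixteen 1 bits.
print(format(ones_complement_sum16(received), "016b"))
```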
Reliable Data Transfer Mechanisms
Checksum, to detect bit errors in a packet
Timer, to know when a packet or its ACK was lost
Sequence number, to detect lost or duplicate packets
Acknowledgement, to know the packet got to the receiver correctly
Negative acknowledgement, to tell the sender a packet was received but corrupted
Window, to pipeline many packets at once before an ACK is received for any of them

TCP Intro
Now see how all this applies to TCP
  Defined in RFC 793, with congestion control covered in RFC 2581
  Invented circa 1974 by Vint Cerf and Robert Kahn
TCP starts with a handshake protocol, which defines many connection variables
  Connection only at hosts, not in between
  Routers are oblivious to whether TCP is used!
TCP is a full duplex service – data can flow both directions at once, and is connection-oriented

TCP Segment Structure
A TCP segment consists of header fields and a data field
  The data field size is limited by the MSS
Typical header size is 20 bytes
  The header is 32 bits wide (4 bytes), so it has five lines at a minimum

TCP Header Structure
The header lines are
  Source and destination port numbers (16 bit ea.)
  Sequence number (32 bit)
  ACK number (32 bit)
  A bunch of little stuff (header length, URG, ACK, PSH, RST, SYN, and FIN bits), then the receive window (16 bit)
  Internet checksum, urgent data pointer (16 bit ea.)
  And possibly several options

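The five 32-bit header lines can be laid out with one struct format string. This is a sketch, not a working TCP implementation: build_tcp_header is a hypothetical helper, the checksum is left at 0 instead of being computed, no options are included, and the port, sequence, and window values are arbitrary.

```python
import struct

# One-bit flags in the flag field.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def build_tcp_header(source_port: int, dest_port: int, seq: int, ack: int,
                     flags: int, window: int,
                     checksum: int = 0, urgent_pointer: int = 0) -> bytes:
    """Pack a 20-byte TCP header (five 32-bit lines, no options)."""
    header_words = 5                  # header length counted in 32-bit words
    offset_byte = header_words << 4   # header length sits in the top 4 bits
    return struct.pack(
        "!HHIIBBHHH",
        source_port, dest_port,       # line 1: port numbers
        seq,                          # line 2: sequence number
        ack,                          # line 3: ACK number
        offset_byte, flags, window,   # line 4: little stuff + receive window
        checksum, urgent_pointer,     # line 5: checksum + urgent data pointer
    )


syn_header = build_tcp_header(50000, 80, seq=1000, ack=0, flags=SYN, window=65535)
print(len(syn_header))
```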
TCP Segment Structure
We’ve seen the port numbers (16 bits each)
Sequence and ACK numbers (32 bits each) keep track of pieces of a file
The ‘bunch of little stuff’ includes
  Header length (4 bits)
  A flag field includes six one-bit fields: ACK, RST, SYN, FIN, PSH, and URG
    The URG bit marks urgent data later on that line
The receive window is used for flow control

TCP Segment Structure
The checksum is used for bit error detection, as with UDP
The urgent data pointer tells where the urgent data is located
The options include negotiating the MSS, scaling the window size, or time stamping

Telnet Example
Telnet (RFC 854) is an old app for remote login via TCP
Telnet interactively echoes whatever was typed to show it got to the other side

Timeout Calculation
We want the timeout interval larger than EstimatedRTT, but not huge; use
  TimeoutInterval = EstimatedRTT + 4*DevRTT
  EstimatedRTT is a running average RTT
  DevRTT is a running standard deviation for RTT
Timeout interval is constantly being calculated, with frequent measurement of SampleRTT to find current values for: EstimatedRTT, DevRTT, & TimeoutInterval

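A running calculation of the timeout might be sketched like this. The slide gives only the TimeoutInterval formula; the smoothing weights (0.125 and 0.25), the first-sample initialization, and the order of the two updates are assumptions borrowed from RFC 6298, and RttEstimator is a made-up class name.

```python
class RttEstimator:
    """Track EstimatedRTT and DevRTT from a stream of SampleRTT values."""

    ALPHA = 0.125  # weight of a new sample in EstimatedRTT (assumed, per RFC 6298)
    BETA = 0.25    # weight of a new sample in DevRTT (assumed, per RFC 6298)

    def __init__(self) -> None:
        self.estimated_rtt = None
        self.dev_rtt = None

    def update(self, sample_rtt: float) -> float:
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        if self.estimated_rtt is None:
            # First measurement: start from the sample itself.
            self.estimated_rtt = sample_rtt
            self.dev_rtt = sample_rtt / 2
        else:
            deviation = abs(sample_rtt - self.estimated_rtt)
            self.dev_rtt = (1 - self.BETA) * self.dev_rtt + self.BETA * deviation
            self.estimated_rtt = (
                (1 - self.ALPHA) * self.estimated_rtt + self.ALPHA * sample_rtt
            )
        return self.timeout_interval

    @property
    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator()
for sample in (0.100, 0.120, 0.090, 0.300, 0.110):  # made-up samples, in seconds
    print(round(estimator.update(sample), 4))
```

A single slow sample raises DevRTT and therefore widens the margin above EstimatedRTT, which is the point of the 4*DevRTT term.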
Flow Control
TCP connection hosts maintain a receive buffer, for bytes received correctly and in order
  Apps might not read from the buffer for a while, so it can overflow
Flow control focuses on preventing overflow of the receive buffer
  So it also depends on how fast the receiving app is reading the data!

Flow Control
Hence the sender in TCP maintains a receive window (RcvWindow) variable – how much room is left in the receive buffer
  The amount of room in RcvWindow is returned to the sender in the receive window field of every segment
If the RcvWindow goes to zero, the sender can’t send more data to the receiver ever!
  To prevent this, TCP makes the sender transmit one byte messages when RcvWindow is zero, so the acknowledgements it gets back can report when room opens up

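The receive-window bookkeeping can be modeled with a toy buffer. This is a simplified sketch under stated assumptions: ReceiveBuffer and bytes_sender_may_send are invented names, the buffer counts bytes rather than storing them, and real TCP tracks the window through sequence numbers.

```python
class ReceiveBuffer:
    """Toy model of a TCP receive buffer that reports its free space."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.buffered = 0  # bytes received but not yet read by the app

    @property
    def rcv_window(self) -> int:
        """Room left in the buffer; advertised back to the sender."""
        return self.capacity - self.buffered

    def receive(self, nbytes: int) -> int:
        accepted = min(nbytes, self.rcv_window)  # anything beyond is dropped
        self.buffered += accepted
        return accepted

    def app_read(self, nbytes: int) -> int:
        taken = min(nbytes, self.buffered)
        self.buffered -= taken
        return taken


def bytes_sender_may_send(advertised_window: int, wanted: int) -> int:
    """Limit the sender to the advertised window, probing with 1 byte at zero."""
    if advertised_window == 0:
        return min(1, wanted)  # one-byte probe keeps window updates flowing
    return min(wanted, advertised_window)


buf = ReceiveBuffer(capacity=4096)
buf.receive(bytes_sender_may_send(buf.rcv_window, 10000))
print(buf.rcv_window)  # the app has read nothing yet
buf.app_read(1000)
print(buf.rcv_window)
```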
UDP Flow Control
There ain’t none (sic!)
UDP adds newly arrived segments to a buffer in front of the receiving socket
  If the buffer gets full, segments are dropped
  Bye-bye data!

Congestion Control
Now address congestion control issues
  Congestion is a traffic jam in the middle of the network somewhere
  Most common cause is too many sources sending data too fast into the network
Key lessons are:
  A congested network forces retransmissions for packets lost due to buffer overflow, which adds to the congestion

Congestion Control
And:
  A congested network can waste its bandwidth by sending duplicate packets which weren’t lost in the first place
  Dropping a packet wastes the transmission capacity of every upstream link that packet saw
If loss and transmission delay are small, CongWin bytes of data can be sent every RTT, for a send rate of CongWin/RTT

Fairness
Unequal connections are less fair
  Lower RTT gets more bandwidth (CongWin increases faster)
  UDP traffic can force out the more polite TCP traffic
  Multiple TCP connections from a single host (e.g. from downloading many parts of a Web page at once) get more bandwidth

Are We Done Yet?
So we’ve covered transport layer protocols from the terribly simple UDP to a seemingly exhaustive study of TCP
Key features along the way include
  multiplexing/demultiplexing, error detection, acknowledgements, timers, retransmissions, sequence numbers, connection management, flow control, end-to-end congestion control
So much for the “edge” of the Internet; next is the network layer, to start looking at the core