7. TCP - :: Network Convergence and Security Laboratory...
2009 Yanghee Choi 2
TCP Basics
Connection-oriented (virtual circuit)
Reliable Transfer
Buffered Transfer
Unstructured Stream
Full Duplex
Point-to-point Connection
End-to-end service
TCP Mechanisms
Addressing: application to application addressing
Reliable delivery: the receiver application should receive the same data stream the source puts on the network
Segment order maintenance: data segments should reach the application in the same order they left the sender
Flow control: the data sending speed should adapt itself to the receiver’s speed
Congestion control: the transmission speed cannot be faster than the speed of the slowest link traversed on the connection path
Segmentation: data is sent in segments that provide the highest throughput
Reliable Transmission
[Figure: message sequence between sender and receiver, with no loss]
Sender: Send Packet 1
Receiver: Receive Packet 1, Send ACK 1
Sender: Receive ACK 1, Send Packet 2
Receiver: Receive Packet 2, Send ACK 2
Sender: Receive ACK 2
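Timeout and Retransmission
[Figure: message sequence between sender and receiver when a packet is lost]
Sender: Send Packet 1, Start Timer
Network: Packet lost
Sender: Timer Expires
Sender: Retransmit Packet 1, Start Timer
Receiver: Receive Packet 1, Send ACK 1
Sender: Receive ACK 1, Cancel Timer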
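The timer behaviour can be sketched in a few lines of Python. This is only a minimal model under stated assumptions: the function name `stop_and_wait` and the `lost` argument (a set of `(packet number, attempt)` pairs that the network drops) are hypothetical, and only lost data packets are modelled, not lost ACKs.

```python
def stop_and_wait(packets, lost):
    """Send packets one at a time, retransmitting after each timeout.

    `lost` is a set of (packet_number, attempt) pairs that the simulated
    network drops. This is an illustrative assumption, not a real API.
    """
    delivered = []
    log = []
    for number, payload in enumerate(packets, start=1):
        attempt = 1
        while True:
            log.append(f"send packet {number} (attempt {attempt}), start timer")
            if (number, attempt) in lost:
                # No ACK comes back, so the timer expires and we retransmit.
                log.append(f"timer expires for packet {number}")
                attempt += 1
                continue
            delivered.append(payload)
            log.append(f"receive ACK {number}, cancel timer")
            break
    return delivered, log


delivered, log = stop_and_wait(["a", "b"], lost={(1, 1)})
for line in log:
    print(line)
```

The first copy of packet 1 is dropped, so the log shows one timer expiry followed by a retransmission before packet 2 is sent.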
Adaptive Retransmission
[Figure: two timelines, each showing a lost packet, the resulting timeout, and two round trip estimates (estimation 1, estimation 2)]
Sliding Window
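[Figure: packets 1 2 3 4 5 6 7 8 9 10 . . . with the initial window marked, then the same packets after the window slides]
[Figure: message sequence between sender and receiver, with three packets sent before any ACK arrives]
Sender: Send Packet 1, Send Packet 2, Send Packet 3
Receiver: Receive Packet 1, Send ACK 1
Receiver: Receive Packet 2, Send ACK 2
Receiver: Receive Packet 3, Send ACK 3
Sender: Receive ACK 1, Receive ACK 2, Receive ACK 3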
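A rough sketch of how the window slides, assuming a loss-free network where ACKs return in order. The function name `sliding_window_trace` and the trace strings are hypothetical and chosen only for illustration.

```python
from collections import deque


def sliding_window_trace(num_packets, window):
    """Return the order of send/ack events for a loss-free sliding window."""
    trace = []
    base = 1        # oldest packet not yet acknowledged
    next_seq = 1    # next packet to send
    in_flight = deque()
    while base <= num_packets:
        # Send as long as the window [base, base + window) has room.
        while next_seq < base + window and next_seq <= num_packets:
            trace.append(f"send {next_seq}")
            in_flight.append(next_seq)
            next_seq += 1
        # The oldest outstanding packet is acknowledged; the window slides.
        acked = in_flight.popleft()
        trace.append(f"ack {acked}")
        base = acked + 1
    return trace


print(sliding_window_trace(5, 3))
```

With a window of 1 the same function degenerates to the send-and-wait exchange shown earlier.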
Transmission Control Protocol
TCP is connection oriented and full duplex
The maximum segment size (MSS) is set during connection establishment
Reliability is achieved using acknowledgments, round trip delay estimations and data retransmission
TCP uses a variable window mechanism for flow control
Congestion control and avoidance are achieved using slow start and congestion avoidance schemes
Ports, Connections, and Endpoints
[Figure: Conceptual layering of UDP and TCP. From top to bottom: Application; Reliable Stream (TCP) and User Datagram (UDP) side by side; Internet (IP); Network Interface]
Ports
Endpoints: (host, port)
• e.g., (147.46.114.112, 21)
Connections: identified by a pair of endpoints
• e.g., (147.46.114.112, 21) and (147.46.114.128, 1500)
TCP uses the connection, not the protocol port, as its fundamental abstraction
Because TCP identifies a connection by a pair of endpoints, a given TCP port number can be shared by multiple connections on the same machine
An application can provide concurrent service to multiple connections simultaneously without needing a unique local port for each connection
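As an illustration of why one port can serve many connections, the sketch below keys a table on the pair of endpoints rather than on the local port. The names `deliver` and `table` are hypothetical, and the second client port (1501) is an assumed value added for the example.

```python
from typing import Dict, List, Tuple

Endpoint = Tuple[str, int]               # (host, port)
Connection = Tuple[Endpoint, Endpoint]   # (local endpoint, remote endpoint)


def deliver(table: Dict[Connection, List[bytes]],
            local: Endpoint, remote: Endpoint, data: bytes) -> None:
    """Queue data on the connection identified by the pair of endpoints."""
    table.setdefault((local, remote), []).append(data)


server = ("147.46.114.112", 21)
table: Dict[Connection, List[bytes]] = {}
deliver(table, server, ("147.46.114.128", 1500), b"USER a")
deliver(table, server, ("147.46.114.128", 1501), b"USER b")  # assumed second client
deliver(table, server, ("147.46.114.128", 1500), b"PASS x")

# Two distinct connections share local port 21.
print(len(table))
```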
Flow Control in TCP
TCP views the data stream as a sequence of octets that it divides into segments for transmission
TCP uses a sliding window mechanism to adjust the sender’s transmission speed to that of the receiver
The sliding window permits the sending of multiple segments before waiting for an ACK -> efficient transmission
ACK segments indicate the last correctly received byte and the number of bytes the receiver is still willing to accept
A sender keeps three pointers associated with every connection
[Figure: octets 1 2 3 4 5 6 7 8 9 10 11 . . . of the stream, with the current window marked]
Flow Control in TCP (cont.)
TCP allows the window size to vary over time
ACK contains a window advertisement that specifies how many additional octets of data the receiver is prepared to accept (receiver’s buffer size)
In response to an increased (decreased) window advertisement, the sender increases (decreases) the size of its sliding window
Variable size window provides flow control as well as reliable transfer
Flow control mechanism is essential in Internet environment, where machines of various speeds and sizes communicate through networks and routers of various speeds and capacities
• End-to-end flow control: sliding window scheme
• Congestion control: no explicit mechanism, implementation dependent
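The effect of a window advertisement on the sender can be sketched as simple arithmetic. This is a simplified model, not taken from the slides: the function name `usable_window` and its byte-counter arguments are hypothetical, and sequence number wraparound is ignored.

```python
def usable_window(last_byte_acked: int, last_byte_sent: int,
                  advertised_window: int) -> int:
    """Octets the sender may still transmit before it must wait for an ACK."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, advertised_window - in_flight)


# 3000 octets acknowledged, 4000 sent, receiver advertises 2500 octets.
print(usable_window(3000, 4000, 2500))
# A smaller advertisement closes the window until more ACKs arrive.
print(usable_window(3000, 4000, 800))
```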
TCP Segment Format
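Bit positions: 0 4 10 16 24 31
SOURCE PORT (bits 0-15) | DESTINATION PORT (bits 16-31)
SEQUENCE NUMBER
ACKNOWLEDGEMENT NUMBER
HLEN (bits 0-3) | RESERVED (bits 4-9) | CODE BITS (bits 10-15) | WINDOW (bits 16-31)
CHECKSUM (bits 0-15) | URGENT POINTER (bits 16-31)
OPTIONS (IF ANY) | PADDING
DATA
. . .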
TCP Segment Format (cont.)
Segments are exchanged to
• establish connections
• transfer data
• send ACK
• advertise window
• close connections
CODE BITS: determines the purpose and contents of the segment
Bit (left to right)   Meaning if bit set to 1
URG                   Urgent pointer field is valid
ACK                   Acknowledgement field is valid
PSH                   This segment requests a push
RST                   Reset the connection
SYN                   Synchronize sequence numbers
FIN                   Sender has reached end of its byte stream
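The fixed 20-octet part of the segment format can be decoded with Python's standard `struct` module. The sketch below assumes the layout shown in the format diagram (6 code bits in the low bits of the 14th octet); the function name `parse_tcp_header` and the returned dictionary keys are hypothetical.

```python
import struct

# Code bits from left to right, as listed in the table.
FLAG_NAMES = ["URG", "ACK", "PSH", "RST", "SYN", "FIN"]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-octet TCP header (options are not parsed)."""
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", segment[:20])
    code_bits = flag_byte & 0x3F
    flags = [name for i, name in enumerate(FLAG_NAMES)
             if code_bits & (0x20 >> i)]
    return {
        "source_port": src,
        "destination_port": dst,
        "sequence_number": seq,
        "acknowledgement_number": ack,
        "header_length": (offset_byte >> 4) * 4,  # HLEN counts 32-bit words
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# A hand-built SYN segment from port 1500 to port 21.
syn = struct.pack("!HHIIBBHHH", 1500, 21, 100, 0, 5 << 4, 0x02, 4096, 0, 0)
print(parse_tcp_header(syn))
```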
Out of Band Data
It is sometimes important for the program at one end of a connection to send data out of band, without waiting for the program at the other end of the connection to consume octets already in the stream
• e.g., in a remote login session, an interrupt or abort keyboard sequence
TCP allows the sender to specify data as urgent, meaning that the receiving program should be notified of its arrival as quickly as possible, regardless of its position in the stream
Urgent mode vs. normal mode
When the URG code bit is set, the Urgent Pointer specifies the position in the segment where urgent data ends
Maximum Segment Size(MSS) Option
Most common option in TCP segment
To support heterogeneous buffer capacities
To make good use of the bandwidth in high speed LAN.
• MSS == minimum MTU
In general internet environment, choosing a good MSS can be difficult because performance can be poor for either extremely large segment sizes or extremely small sizes
• Extremely small MSS: makes network utilization low
• Extremely large MSS: decreases throughput because of fragmentation
Optimum MSS occurs when the IP datagrams carrying the segments are as large as possible without requiring fragmentation anywhere along the path from the source to the destination. => But, difficult problem for several reasons
Default MSS (536 bytes) = default size of IP datagram (576 bytes) - 40
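The "minus 40" in the default MSS is the 20-octet IP header plus the 20-octet TCP header, both without options. A small sketch of that arithmetic follows; the function names are hypothetical, and the 1500-octet MTU is an assumed example value (a typical Ethernet MTU).

```python
IP_HEADER_BYTES = 20    # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IP datagram."""
    return mtu - IP_HEADER_BYTES - TCP_HEADER_BYTES


def payload_fraction(mss: int) -> float:
    """Share of each datagram that carries data rather than headers."""
    return mss / (mss + IP_HEADER_BYTES + TCP_HEADER_BYTES)


print(mss_for_mtu(576))    # default IP datagram size
print(mss_for_mtu(1500))   # assumed Ethernet MTU
print(payload_fraction(10), payload_fraction(536))
```

The last line shows why an extremely small MSS wastes bandwidth: most of each datagram is header.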
TCP Checksum Computation
16-bit integer checksum used to verify the integrity of the data as well as the TCP header
TCP prepends a pseudo header to the segment, appends enough zero bits to make the segment a multiple of 16 bits, and computes the 16-bit checksum over the entire result
TCP does not count the pseudo header or padding in the segment length, nor does it transmit them
Pseudo header allows the receiver to verify that the segment has reached its correct destination
At the receiver, the IP must pass to TCP the source and destination IP addresses from the datagram as well as the segment itself
Pseudo header
Bit positions: 0 8 16 31
SOURCE IP ADDRESS
DESTINATION IP ADDRESS
ZERO (bits 0-7) | PROTOCOL (bits 8-15) | TCP LENGTH (bits 16-31)
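The checksum steps can be sketched for IPv4 as follows. The function names are hypothetical; the protocol value 6 is the IP protocol number for TCP, and the example addresses are the ones used earlier in the slides.

```python
import socket
import struct


def ones_complement_sum16(data: bytes) -> int:
    """16-bit one's complement sum, padding with a zero octet if needed."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over pseudo header + segment (checksum field must be zero)."""
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 6, len(segment)))
    return ~ones_complement_sum16(pseudo + segment) & 0xFFFF


header = struct.pack("!HHIIBBHHH", 1500, 21, 1, 0, 5 << 4, 0x02, 4096, 0, 0)
segment = header + b"abc"
csum = tcp_checksum("147.46.114.112", "147.46.114.128", segment)

# Receiver check: with the checksum filled in, the result is zero.
filled = segment[:16] + struct.pack("!H", csum) + segment[18:]
print(tcp_checksum("147.46.114.112", "147.46.114.128", filled))
```

Neither the pseudo header nor the padding octet is part of `segment`, which matches the rule that they are not transmitted.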
Acknowledgments, Timeout, and Retransmission
A TCP receiver always acknowledges the last correctly received byte -> cumulative ACK
After sending a segment the sender starts a timer
If the timer expires before receiving an ACK for the sent segment, the segment is considered lost and must be retransmitted
In an internet environment, it is impossible to know a priori how quickly ACKs will return to the source
The timeout value is calculated dynamically according to the measured round trip time (RTT) - adaptive retransmission algorithm
• Estimated round trip time (RTT)
$$ RTT = (\alpha \cdot Old\_RTT) + ((1 - \alpha) \cdot New\_Round\_Trip\_Sample), \quad 0 \le \alpha < 1 $$
• Timeout value
$$ Timeout = \beta \cdot RTT, \quad \beta \ge 1 $$
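A worked example of the weighted average and the timeout. The function names are hypothetical, and the default values alpha = 0.875 and beta = 2 are assumptions picked for illustration, not values stated on the slide.

```python
def update_rtt(old_rtt: float, sample: float, alpha: float = 0.875) -> float:
    """Weighted average of the old estimate and the new round trip sample."""
    return alpha * old_rtt + (1 - alpha) * sample


def timeout_for(rtt: float, beta: float = 2.0) -> float:
    """Retransmission timeout as a constant multiple of the estimate."""
    return beta * rtt


rtt = update_rtt(100.0, 200.0)   # estimate was 100 ms, new sample is 200 ms
print(rtt, timeout_for(rtt))
```

An alpha close to 1 makes the estimate react slowly to a single long sample; an alpha close to 0 makes it follow each sample closely.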
Round Trip Time Measurement
Acknowledgment ambiguity
• Because both datagrams carry exactly the same data, the sender has no way of knowing whether an ACK corresponds to the original or retransmitted datagram.
• The original transmission and the most recent transmission both fail to provide accurate round trip time
[Figure: timeline with the original transmission at t1, the timeout and retransmission at t2, and the ACK arriving at t3]
Round_Trip_Sample = t3 - t2 or t3 - t1 ?
Modified Algorithm for RTT
Karn’s Algorithm
• When computing the round trip estimate, ignore samples that correspond to retransmitted segments, but use a backoff strategy, and retain the timeout value from a retransmitted packet for subsequent packets until a valid sample is obtained
• Timer backoff strategy: If the timer expires and causes a retransmission, TCP increases the timeout
$$ new\_timeout = \gamma \cdot timeout, \quad \text{typically } \gamma = 2 $$
• When an internet misbehaves, Karn’s algorithm separates computation of the timeout value from the current round trip estimate
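A minimal sketch of Karn's algorithm combined with timer backoff. The class name `KarnTimer`, its method names, and the default constants (alpha = 0.875, beta = 2, gamma = 2) are assumptions made for this example.

```python
class KarnTimer:
    """Round trip estimate that ignores ambiguous samples and backs off."""

    def __init__(self, rtt, alpha=0.875, beta=2.0, gamma=2.0):
        self.rtt = rtt
        self.alpha = alpha
        self.beta = beta
        self.gamma = gamma
        self.timeout = beta * rtt

    def on_timeout(self):
        # Timer backoff: new_timeout = gamma * timeout.
        self.timeout *= self.gamma

    def on_ack(self, sample, was_retransmitted):
        if was_retransmitted:
            # Ambiguous sample: keep the estimate and the backed-off timeout.
            return
        self.rtt = self.alpha * self.rtt + (1 - self.alpha) * sample
        self.timeout = self.beta * self.rtt


timer = KarnTimer(rtt=100.0)
timer.on_timeout()                               # segment lost, back off
timer.on_ack(50.0, was_retransmitted=True)       # ignored
print(timer.rtt, timer.timeout)
timer.on_ack(200.0, was_retransmitted=False)     # valid sample
print(timer.rtt, timer.timeout)
```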
Responding to High Delay Variance
To adapt to a wide range of variation in delay.
Queueing theory suggests that the variation in RTT, σ, varies proportional to 1/(1-L), where L is the current network load, 0 ≤ L ≤ 1.
The 1989 spec for TCP requires implementations to estimate both the average round trip time and the variance, and to use the estimated variance in place of the constant β
$$ DIFF = SAMPLE - Old\_RTT $$
$$ Smoothed\_RTT = Old\_RTT + \delta \cdot DIFF $$
$$ DEV = Old\_DEV + \rho \, (|DIFF| - Old\_DEV) $$
$$ Timeout = Smoothed\_RTT + \eta \cdot DEV $$
DEV: the estimated mean deviation
δ: controls how quickly the new sample affects the weighted average
ρ: controls how quickly the new sample affects the mean deviation
η: controls how much the deviation affects the round trip timeout
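The four update equations translate directly into code. In this sketch the function name is hypothetical and the default constants (delta = 1/8, rho = 1/4, eta = 4) are assumed illustrative values, not figures given on the slide.

```python
def update_estimates(old_rtt, old_dev, sample,
                     delta=0.125, rho=0.25, eta=4.0):
    """Return (smoothed_rtt, dev, timeout) after one round trip sample."""
    diff = sample - old_rtt
    smoothed_rtt = old_rtt + delta * diff
    dev = old_dev + rho * (abs(diff) - old_dev)
    timeout = smoothed_rtt + eta * dev
    return smoothed_rtt, dev, timeout


# Estimate 100 ms, deviation 10 ms, then a 180 ms sample arrives.
print(update_estimates(100.0, 10.0, 180.0))
```

A single late sample raises the timeout mainly through the deviation term, which is what lets the timer tolerate high delay variance.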
Connection Establishment
Three way handshake
A sends a SYN segment with an initial sequence number(ISN) and the maximum segment size(MSS) it is willing to receive
B replies with a SYN segment acknowledging ISN and announcing its MSS
MSS can be at most as large as the interface segment size minus 40
[Figure: message sequence between Site 1 and Site 2]
Site 1: Send SYN seq=x
Site 2: Receive SYN segment, Send SYN seq=y, ACK x+1
Site 1: Receive SYN + ACK segment, Send ACK y+1
Site 2: Receive ACK segment
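The sequence number bookkeeping in the handshake can be sketched as three messages. The function name and the dictionary layout are hypothetical; sequence numbers are treated as 32-bit values that wrap around.

```python
MODULUS = 2 ** 32   # sequence numbers are 32-bit


def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments exchanged, given initial sequence numbers."""
    syn = {"flags": {"SYN"}, "seq": x}
    syn_ack = {"flags": {"SYN", "ACK"}, "seq": y,
               "ack": (syn["seq"] + 1) % MODULUS}
    ack = {"flags": {"ACK"}, "seq": syn_ack["ack"],
           "ack": (syn_ack["seq"] + 1) % MODULUS}
    return [syn, syn_ack, ack]


for message in three_way_handshake(x=1000, y=5000):
    print(message)
```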
Connection Termination
Modified three way handshake
A sender terminates its part of the connection by sending a FIN segment
After acknowledging the FIN the receiver can still send data on its part of the connection(half close)
A connection can be aborted with an RST segment if abnormal conditions arise
[Figure: message sequence between Site 1 and Site 2]
Site 1: (application closes connection) Send FIN seq=x
Site 2: Receive FIN segment, Send ACK x+1, (inform application)
Site 1: Receive ACK segment
Site 2: (application closes connection) Send FIN seq=y, ACK x+1
Site 1: Receive FIN + ACK segment, Send ACK y+1
Site 2: Receive ACK segment
TCP State Machine
[Figure: TCP finite state machine. States: CLOSED, LISTEN, SYN RECVD, SYN SENT, ESTABLISHED, FIN WAIT-1, FIN WAIT-2, CLOSING, TIMED WAIT, CLOSE WAIT, LAST ACK. Transitions are labeled input / output, e.g. passive open, active open / syn, syn / syn + ack, syn + ack / ack, close / fin, fin / ack, fin-ack / ack, and timeout after 2 segment lifetimes]
TCP State Machine (cont.)
TCP states
• CLOSED: No connection is active or pending
• LISTEN: The server is waiting for an incoming call
• SYN RCVD: A connection request has arrived; wait for ACK
• SYN SENT: The application has started to open a connection
• ESTABLISHED: The normal data transfer state
• FIN WAIT-1: The application has said it is finished
• FIN WAIT-2: The other side has agreed to release
• TIMED WAIT: Wait for all packets to die off
• CLOSING: Both sides have tried to close simultaneously
• CLOSE WAIT: The other side has initiated a release
• LAST ACK: Wait for all packets to die off
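The states can be tied together with a small transition table. This sketch covers only the common transitions of the standard TCP state machine (no resets and no simultaneous open); the names `TRANSITIONS` and `run` and the event strings are hypothetical.

```python
# (state, event) -> (next state, segment sent in response or None)
TRANSITIONS = {
    ("CLOSED", "passive open"): ("LISTEN", None),
    ("CLOSED", "active open"): ("SYN SENT", "syn"),
    ("LISTEN", "syn"): ("SYN RCVD", "syn+ack"),
    ("SYN SENT", "syn+ack"): ("ESTABLISHED", "ack"),
    ("SYN RCVD", "ack"): ("ESTABLISHED", None),
    ("ESTABLISHED", "close"): ("FIN WAIT-1", "fin"),
    ("ESTABLISHED", "fin"): ("CLOSE WAIT", "ack"),
    ("FIN WAIT-1", "ack"): ("FIN WAIT-2", None),
    ("FIN WAIT-1", "fin"): ("CLOSING", "ack"),
    ("FIN WAIT-2", "fin"): ("TIMED WAIT", "ack"),
    ("CLOSING", "ack"): ("TIMED WAIT", None),
    ("TIMED WAIT", "timeout"): ("CLOSED", None),
    ("CLOSE WAIT", "close"): ("LAST ACK", "fin"),
    ("LAST ACK", "ack"): ("CLOSED", None),
}


def run(state, events):
    """Apply events in order; return the final state and segments sent."""
    sent = []
    for event in events:
        state, output = TRANSITIONS[(state, event)]
        if output:
            sent.append(output)
    return state, sent


# A client that opens a connection and later closes it first.
print(run("CLOSED", ["active open", "syn+ack", "close", "ack", "fin", "timeout"]))
```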
Reserved TCP Port Numbers
Port  Keyword     UNIX keyword  Description
0                               Reserved
1     TCPMUX      -             TCP Multiplexer
5     RJE         -             Remote Job Entry
7     ECHO        echo          Echo
9     DISCARD     discard       Discard
11    USERS       systat        Active Users
13    DAYTIME     daytime       Daytime
15    -           netstat       Network status program
17    QUOTE       qotd          Quote of the day
19    CHARGEN     chargen       Character Generator
20    FTP-DATA    ftp-data      File Transfer Protocol
21    FTP         ftp           File Transfer Protocol
23    TELNET      telnet        Terminal Connection
25    SMTP        smtp          Simple Mail Transport Protocol
37    TIME        time          Time
42    NAMESERVER  name          Host Name Server
43    NICNAME     whois         Who Is
53    DOMAIN      nameserver    Domain Name Server
77    -           rje           any private RJE service
79    FINGER      finger        Finger
93    DCP         -             Device Control Protocol
95    SUPDUP      supdup        SUPDUP Protocol