http://condor.depaul.edu/jkristof/technotes/tcp.html
The Transmission Control Protocol
Abstract
It is important to understand TCP if one is to understand the historic, current and future architecture of the Internet protocols. Most applications on the Internet make use of TCP, relying upon its mechanisms that ensure safe delivery of data across an unreliable IP layer below. In this paper we explore the fundamental concepts behind TCP and how it is used to transport data between two endpoints.
1. Introduction
The Transmission Control Protocol (TCP) standard is defined in the Request For Comment (RFC) standards document number 793 [10] by the Internet Engineering Task Force (IETF). The original specification written in 1981 was based on earlier research and experimentation in the original ARPANET. The design of TCP was heavily influenced by what has come to be known as the "end-to-end argument" [3].
As it applies to the Internet, the end-to-end argument says that by putting excessive intelligence in physical and link layers to handle error control, encryption or flow control you unnecessarily complicate the system. This is because these functions will usually need to be done at the endpoints anyway, so why duplicate the effort along the way? The result of an end-to-end network, then, is to provide minimal functionality on a hop-by-hop basis and maximal control between end-to-end communicating systems.
The end-to-end argument helped determine how two characteristics of TCP operate: performance and error handling. TCP performance is often dependent on a subset of algorithms and techniques such as flow control and congestion control. Flow control determines the rate at which data is transmitted between a sender and receiver. Congestion control defines the methods for implicitly interpreting signals from the network in order for a sender to adjust its rate of transmission.
The term congestion control is a bit of a misnomer; congestion avoidance would be a better term, since TCP cannot control congestion per se. Ultimately, only intermediate devices such as IP routers are in a position to control congestion.
Congestion control is currently a large area of research and concern in the network community. A companion study on congestion control examines the current state of activity in that area [9].
Timeouts and retransmissions handle error control in TCP. Although the delay can be substantial, particularly for real-time applications, the combination of the two techniques offers both error detection and error correction, thereby guaranteeing that data will eventually be delivered successfully.
The nature of TCP and the underlying packet switched network provide formidable challenges for managers, designers and researchers of networks. Once relegated to low speed data communication applications, the Internet, and in part TCP, are now being used to support very high speed communications of voice, video and data. It is unlikely that the Internet protocols will remain static as applications change and expand. Understanding the current state of affairs will assist us in understanding protocol changes made to support future applications.
1.1 Transmission Control Protocol
TCP is often described as a byte stream, connection-oriented, reliable delivery transport layer protocol. We discuss the meaning of each of these descriptive terms in turn.
1.1.1 Byte Stream Delivery
TCP interfaces between the application layer above and the network layer below. When an application sends data to TCP, it does so as a stream of 8-bit bytes. It is then up to the sending TCP to segment, or delineate, the byte stream in order to transmit data in manageable pieces to the receiver.1 It is this lack of "record boundaries" that gives TCP the name "byte stream delivery service".
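The absence of record boundaries is easy to observe with ordinary sockets. The sketch below, assuming nothing beyond the Python standard library and a loopback connection, sends two application "records" with separate send() calls; the receiver may well see them merged into a single recv(), because TCP only promises the byte stream, not the boundaries.

```python
import socket
import threading

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))            # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]

def server():
    conn, _ = srv.accept()
    data = b""
    while len(data) < 10:
        data += conn.recv(1024)       # may return both "records" in one call
    print("server received:", data)   # often b'helloworld', with no boundary
    conn.close()

t = threading.Thread(target=server)
t.start()

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("127.0.0.1", port))
cli.send(b"hello")                    # first application "record"
cli.send(b"world")                    # second "record"; TCP keeps no boundary
cli.close()
t.join()
srv.close()
```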
1.1.2 Connection-Oriented
Before two communicating TCPs can exchange data, they must first agree upon the willingness to communicate. Analogous to a telephone call, a connection must first be made before two parties exchange information.
1.1.3 Reliability
A number of mechanisms help provide the reliability TCP guarantees. Each of these is described briefly below.
Checksums. All TCP segments carry a checksum, which is used by the receiver to detect errors with either the TCP header or data.
Duplicate data detection. It is possible for packets to be duplicated in a packet switched network; therefore TCP keeps track of bytes received in order to discard duplicate copies of data that has already been received.2
Retransmissions. In order to guarantee delivery of data, TCP must implement retransmission schemes for data that may be lost or damaged. The use of positive acknowledgements by the receiver to the sender confirms successful reception of data. The lack of positive acknowledgements, coupled with a timeout period (see timers below) calls for a retransmission.
Sequencing. In packet switched networks, it is possible for packets to be delivered out of order. It is TCP's job to properly sequence segments it receives so it can deliver the byte stream data to an application in order.
Timers. TCP maintains various static and dynamic timers on data sent. The sending TCP waits for the receiver to reply with an acknowledgement within a bounded length of time. If the timer expires before an acknowledgement is received, the sender can retransmit the segment (a simple sketch of this behavior follows this list).
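The interplay of positive acknowledgements, timers and retransmissions can be sketched in a few lines of Python. This is purely illustrative, not real TCP internals; send_segment and wait_for_ack are hypothetical placeholders for whatever transmits a segment and blocks waiting for its acknowledgement.

```python
import time

def reliable_send(segment, send_segment, wait_for_ack, timeout=1.0, max_tries=5):
    """Stop-and-wait style sketch of timeout-driven retransmission."""
    for attempt in range(max_tries):
        send_segment(segment)             # transmit (or retransmit) the segment
        deadline = time.time() + timeout
        ack = wait_for_ack(deadline)      # block until an ACK arrives or the timer expires
        if ack is not None:
            return True                   # positive acknowledgement: done
        # timer expired with no acknowledgement; loop and retransmit
    return False                          # give up after max_tries attempts
```

Real TCP keeps such timers per connection and adapts the timeout to measured round-trip times, but the basic retransmit-on-silence idea is the same.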
1.2 TCP Header Format
Remember that the combination of a TCP header and TCP data in one packet is called a TCP segment. Figure 1 depicts the format of all valid TCP segments. The size of the header without options is 20 bytes. We will briefly define each field of the TCP header below.
Figure 1 - TCP Header Format
1.2.1 Source Port
A 16-bit number identifying the application the TCP segment originated from within the sending host. The port numbers are divided into three ranges, well-known ports (0 through 1023), registered ports (1024 through 49151) and private ports (49152 through 65535). Port assignments are used by TCP as an interface to the application layer. For example, the TELNET server is always assigned to the well-known port 23 by default on TCP hosts. A complete pair of IP addresses (source and destination) plus a complete pair of TCP ports (source and destination) define a single TCP connection that is globally unique. See [5] for further details.
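A small sketch may help illustrate how the address/port 4-tuple identifies a connection. The dictionary and field names below are hypothetical; the point is simply that two clients may use the same destination port (80 here) and still map to distinct connections because the rest of the tuple differs.

```python
connections = {}

def demux(src_ip, src_port, dst_ip, dst_port):
    """Look up (or create) per-connection state keyed by the full 4-tuple."""
    key = (src_ip, src_port, dst_ip, dst_port)
    if key not in connections:
        connections[key] = {"state": "LISTEN", "rcv_buf": b""}
    return connections[key]

# Two clients talking to the same server port map to distinct connections.
demux("10.0.0.1", 51000, "192.0.2.7", 80)
demux("10.0.0.2", 51000, "192.0.2.7", 80)
print(len(connections))   # -> 2
```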
1.2.2 Destination Port
A 16-bit number identifying the application the TCP segment is destined for on a receiving host. Destination ports use the same port number assignments as those set aside for source ports [5].
1.2.3 Sequence Number
A 32-bit number identifying the current position of the first data byte in the segment within the entire byte stream for the TCP connection. After reaching 2^32 - 1, this number wraps around to 0.
1.2.4 Acknowledgement Number
A 32-bit number identifying the next data byte the sender expects from the receiver. Therefore, the number will be one greater than the most recently received data byte. This field is only used when the ACK control bit is turned on (see below).
1.2.5 Header Length
A 4-bit field that specifies the total TCP header length in 32-bit words (or in multiples of 4 bytes if you prefer). Without options, a TCP header is always 20 bytes in length. The largest a TCP header may be is 60 bytes. This field is required because the size of the options field(s) cannot be determined in advance. Note that this field is called "data offset" in the official TCP standard, but header length is more commonly used.
1.2.6 Reserved
A 6-bit field currently unused and reserved for future use.
1.2.7 Control Bits
Urgent Pointer (URG). If this bit field is set, the receiving TCP should interpret the urgent pointer field (see below).
Acknowledgement (ACK). If this bit field is set, the acknowledgement field described earlier is valid.
Push Function (PSH). If this bit field is set, the receiver should deliver this segment to the receiving application as soon as possible. An example of its use may be to send a Control-BREAK request to an application, which can jump ahead of queued data.
Reset the Connection (RST). If this bit is present, it signals the receiver that the sender is aborting the connection and all queued data and allocated buffers for the connection can be freely relinquished.
Synchronize (SYN). When present, this bit field signifies that the sender is attempting to "synchronize" sequence numbers. This bit is used during the initial stages of connection establishment between a sender and receiver.
No More Data from Sender (FIN). If set, this bit field tells the receiver that the sender has reached the end of its byte stream for the current TCP connection.
1.2.8 Window
A 16-bit integer used by TCP for flow control in the form of a data transmission window size. This number tells the sender how much data the receiver is willing to accept. The maximum value for this field would limit the window size to 65,535 bytes, however a "window scale" option can be used to make use of even larger windows.
1.2.9 Checksum
A TCP sender computes a value based on the contents of the TCP header and data fields. This 16-bit value will be compared with the value the receiver generates using the same computation. If the values match, the receiver can be very confident that the segment arrived intact.
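The computation is the standard Internet checksum: the 16-bit one's complement of the one's complement sum of the data taken as 16-bit words, which for TCP also covers a pseudo-header of IP addresses, the protocol number and the segment length [10]. The sketch below assumes IPv4 and that the checksum field inside tcp_segment has been zeroed before the computation.

```python
import socket
import struct

def ones_complement_sum16(data: bytes) -> int:
    """Sum the data as 16-bit words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"                              # pad odd-length data with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)     # fold any carry back in
    return total

def tcp_checksum(src_ip: str, dst_ip: str, tcp_segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus the TCP header and data."""
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) +
              struct.pack("!BBH", 0, 6, len(tcp_segment)))  # zero, protocol 6, TCP length
    return (~ones_complement_sum16(pseudo + tcp_segment)) & 0xFFFF
```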
1.2.10 Urgent Pointer
In certain circumstances, it may be necessary for a TCP sender to notify the receiver of urgent data that should be processed by the receiving application as soon as possible. This 16-bit field tells the receiver where the last byte of urgent data in the segment ends.
1.2.11 Options
In order to provide additional functionality, several optional parameters may be used between a TCP sender and receiver. Depending on the option(s) used, the length of this field will vary in size, but it cannot be larger than 40 bytes due to the size of the header length field (4 bits). The most common option is the maximum segment size (MSS) option. A TCP receiver tells the TCP sender the maximum segment size it is willing to accept through the use of this option. Other options are often used for various flow control and congestion control techniques.
1.2.12 Padding
Because options may vary in size, it may be necessary to "pad" the TCP header with zeroes so that the segment ends on a 32-bit word boundary as defined by the standard [10].
1.2.13 Data
Although not used in some circumstances (e.g. acknowledgement segments with no data in the reverse direction), this variable length field carries the application data from TCP sender to receiver. This field coupled with the TCP header fields constitutes a TCP segment.
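Putting the fields together, a minimal sketch of parsing the fixed 20-byte header might look like the following. It assumes the segment is supplied as raw bytes in network byte order and does not decode individual options.

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Unpack the 20-byte fixed TCP header described above."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4       # header length is counted in 32-bit words
    flags = offset_flags & 0x3F                 # low six bits: URG ACK PSH RST SYN FIN
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack, "header_len": header_len,
        "URG": bool(flags & 0x20), "ACK": bool(flags & 0x10),
        "PSH": bool(flags & 0x08), "RST": bool(flags & 0x04),
        "SYN": bool(flags & 0x02), "FIN": bool(flags & 0x01),
        "window": window, "checksum": checksum, "urgent_ptr": urgent,
        "options": segment[20:header_len],      # anything between byte 20 and header_len
        "data": segment[header_len:],           # application data follows the header
    }
```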
2. Connection Establishment and Termination
TCP provides a connection-oriented service over packet switched networks. Connection-oriented implies that there is a virtual connection between two endpoints.3 There are three phases in any virtual connection. These are the connection establishment, data transfer and connection termination phases.
2.1 Three-Way Handshake
In order for two hosts to communicate using TCP they must first establish a connection by exchanging messages in what is known as the three-way handshake. The diagram below depicts the process of the three-way handshake.
Figure 2 - TCP Connection Establishment
From figure 2, it can be seen that there are three TCP segments exchanged between two hosts, Host A and Host B. Reading down the diagram depicts events in time.
To start, Host A initiates the connection by sending a TCP segment with the SYN control bit set and an initial sequence number (ISN) we represent as the variable x in the sequence number field.
At some moment later in time, Host B receives this SYN segment, processes it and responds with a TCP segment of its own. The response from Host B has both the SYN and ACK control bits set and carries Host B's own ISN, represented as the variable y, in the sequence number field. By placing x+1 in the acknowledgement number field, Host B indicates that the next byte it expects from Host A is x+1.
When Host A receives Host B's ISN and ACK, it finishes the connection establishment phase by sending a final acknowledgement segment to Host B. In this case, Host A sets the ACK control bit and indicates the next expected byte from Host B by placing acknowledgement number y+1 in the acknowledgement field.
In addition to the information shown in the diagram above, the source and destination ports to use for this connection are also included in each sender's segments.4
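A small trace, with assumed 32-bit ISNs chosen at random, shows how the sequence and acknowledgement numbers of the three segments relate to one another.

```python
import random

x = random.randrange(2**32)      # Host A's initial sequence number (ISN)
y = random.randrange(2**32)      # Host B's initial sequence number (ISN)

handshake = [
    ("A -> B", {"SYN": 1, "ACK": 0, "seq": x,               "ack": None}),
    ("B -> A", {"SYN": 1, "ACK": 1, "seq": y,               "ack": (x + 1) % 2**32}),
    ("A -> B", {"SYN": 0, "ACK": 1, "seq": (x + 1) % 2**32, "ack": (y + 1) % 2**32}),
]
for direction, segment in handshake:
    print(direction, segment)    # each SYN consumes one sequence number
```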
2.2 Data Transfer
Once ISNs have been exchanged, communicating applications can transmit data between each other. Most of the discussion surrounding data transfer requires a look at flow control and congestion control techniques, which we discuss later in this document and in other texts [9]. A few key ideas are briefly presented here, leaving the technical details aside.
A simple TCP implementation will place segments into the network for a receiver as long as there is data to send and as long as the sender does not exceed the window advertised by the receiver. As the receiver accepts and processes TCP segments, it sends back positive acknowledgements, indicating where in the byte stream it is. These acknowledgements also contain the "window" which determines how many bytes the receiver is currently willing to accept. If data is duplicated or lost, a "hole" may exist in the byte stream. A receiver will continue to acknowledge the most current contiguous place in the byte stream it has accepted.
If there is no data to send, the sending TCP will simply sit idly by waiting for the application to put data into the byte stream or to receive data from the other end of the connection.
If data queued by the sender reaches a point where data sent will exceed the receiver's advertised window size, the sender must halt transmission and wait for further acknowledgements and an advertised window size that is greater than zero before resuming.
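The sender-side bookkeeping in the preceding paragraphs reduces to a simple calculation, sketched below with variable names borrowed from the usual sender state: snd_una (oldest unacknowledged byte) and snd_nxt (next byte to send).

```python
def usable_window(snd_una, snd_nxt, advertised_window):
    """Bytes the sender may still put on the network right now."""
    in_flight = snd_nxt - snd_una                 # data sent but not yet acknowledged
    return max(0, advertised_window - in_flight)

print(usable_window(snd_una=1000, snd_nxt=3000, advertised_window=4096))  # 2096: keep sending
print(usable_window(snd_una=1000, snd_nxt=5096, advertised_window=4096))  # 0: halt until ACKs arrive
```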
Timers are used to avoid deadlock and unresponsive connections. Delayed transmissions are used to make more efficient use of network bandwidth by sending larger "chunks" of data at once rather than in smaller individual pieces.5
2.3 Connection Termination
In order for a connection to be released, four segments are required to completely close it. Four segments are necessary because TCP is a full-duplex protocol, meaning that each end must shut down its direction of the connection independently.6 The connection termination phase is shown in figure 3 below.
Figure 3 - TCP Connection Termination
Notice that instead of SYN control bit fields, the connection termination phase uses the FIN control bit fields to signal the close of a connection.
To terminate the connection in our example, the application running on Host A signals TCP to close the connection. This generates the first FIN segment from Host A to Host B. When Host B receives the initial FIN segment, it immediately acknowledges the segment and notifies its destination application of the termination request. Once the application on Host B also decides to shut down the connection, it then sends its own FIN segment, which Host A will process and respond with an acknowledgement.
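From an application's point of view the FIN segments are triggered by closing the socket rather than sent explicitly. The Python sketch below, assuming an already connected socket, uses shutdown() to send our FIN while continuing to read until the peer's FIN arrives (seen as recv() returning an empty byte string); this is the half-close mentioned in the footnotes.

```python
import socket

def finish_sending(sock: socket.socket) -> bytes:
    """Half-close: send our FIN, then drain whatever the peer still has to say."""
    sock.shutdown(socket.SHUT_WR)      # no more data from us; TCP sends the FIN
    remaining = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:                  # empty read: the peer has sent its FIN
            break
        remaining += chunk
    sock.close()                       # release the connection's resources
    return remaining
```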
3. Sliding Window and Flow Control
Flow control is a technique whose primary purpose is to properly match the transmission rate of the sender to that of the receiver and the network. It is important for the transmission to be at a high enough rate to ensure good performance, but also to protect against overwhelming the network or the receiving host.
In [8], we note that flow control is not the same as congestion control. Congestion control is primarily concerned with a sustained overload of network intermediate devices such as IP routers.
TCP uses the window field, briefly described previously, as the primary means for flow control. During the data transfer phase, the window field is used to adjust the rate of flow of the byte stream between communicating TCPs.
Figure 4 below illustrates the concept of the sliding window.
Figure 4 - Sliding Window
In this simple example, there is a 4-byte sliding window. Moving from left to right, the window "slides" as bytes in the stream are sent and acknowledged.7 How large the window should be, and how quickly it should grow or shrink, are areas of considerable research. We again refer to other documents for further detail [9].
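A toy model of figure 4, with an assumed 10-byte stream and made-up cumulative acknowledgement values, shows the left edge of the window advancing as ACKs arrive.

```python
stream = b"ABCDEFGHIJ"
window_size = 4
snd_una = 0                                   # left edge: oldest unacknowledged byte

for ack in (2, 4, 7):                         # cumulative ACKs from the receiver
    window = stream[snd_una:snd_una + window_size]
    print("may send:", window)                # bytes currently inside the window
    snd_una = ack                             # each ACK slides the window to the right
print("may send:", stream[snd_una:snd_una + window_size])
```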
4. Congestion Control
TCP congestion control, and Internet traffic management in general, is an active area of research and experimentation. This final section is a very brief summary of the standard congestion control algorithms widely used in TCP implementations today. These algorithms are defined in [6] and [7]. Their use with TCP was standardized in [1].
4.1 Slow Start
Slow Start, a requirement for TCP software implementations, is a mechanism used by the sender to control the transmission rate, otherwise known as sender-based flow control. This is accomplished through the return rate of acknowledgements from the receiver. In other words, the rate of acknowledgements returned by the receiver determines the rate at which the sender can transmit data.
When a TCP connection first begins, the Slow Start algorithm initializes a congestion window to one segment, which is the maximum segment size (MSS) announced by the receiver during the connection establishment phase. As acknowledgements are returned by the receiver, the congestion window increases by one segment for each acknowledgement. The sender can then transmit up to the minimum of the congestion window and the advertised window of the receiver, which is simply called the transmission window.
Slow Start is actually not very slow when the network is not congested and network response time is good. For example, the first successful transmission and acknowledgement of a TCP segment increases the window to two segments. After successful transmission of these two segments and acknowledgements completes, the window is increased to four segments. Then eight segments, then sixteen segments and so on, doubling from there on out up to the maximum window size advertised by the receiver or until congestion finally does occur.
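The doubling behavior can be seen in a few lines. The sketch below counts the congestion window in whole segments, assumes one acknowledgement per segment (i.e., no delayed ACKs), and caps growth at an assumed advertised window of 32 segments.

```python
def slow_start_rounds(advertised_window, rounds):
    """Congestion window (in segments) at the start of each round trip."""
    cwnd = 1                                  # start at one segment (one MSS)
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        acks = cwnd                           # one ACK comes back per segment sent
        cwnd = min(cwnd + acks, advertised_window)
    return history

print(slow_start_rounds(advertised_window=32, rounds=7))
# [1, 2, 4, 8, 16, 32, 32]
```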
4.2 Congestion Avoidance
During the initial data transfer phase of a TCP connection the Slow Start algorithm is used. However, there may be a point during Slow Start that the network is forced to drop one or more packets due to overload or congestion. If this happens, Congestion Avoidance is used to slow the transmission rate. However, Slow Start is used in conjunction with Congestion Avoidance as the means to get the data transfer going again so it doesn't slow down and stay slow.
In the Congestion Avoidance algorithm a retransmission timer expiring or the reception of duplicate ACKs can implicitly signal the sender that a network congestion situation is occurring. The sender immediately sets its transmission window to one half of the current window size (the minimum of the congestion window and the receiver's advertised window size), but to at least two segments. If congestion was indicated by a timeout, the congestion window is reset to one segment, which automatically puts the sender into Slow Start mode. If congestion was indicated by duplicate ACKs, the Fast Retransmit and Fast Recovery algorithms are invoked (see below).
As data is received during Congestion Avoidance, the congestion window is increased. However, Slow Start is only used up to the halfway point where congestion originally occurred. This halfway point was recorded earlier as the new transmission window. After this halfway point, the congestion window is increased by one segment for all segments in the transmission window that are acknowledged. This mechanism will force the sender to more slowly grow its transmission rate, as it will approach the point where congestion had previously been detected.
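The combined policy of sections 4.1 and 4.2 can be condensed into two small functions. This is a simplified sketch in units of whole segments: the recorded "halfway point" is commonly called ssthresh, growth below it is exponential (one segment per ACK), and growth above it is roughly one segment per round trip (1/cwnd per ACK).

```python
def on_timeout(cwnd, advertised_window):
    """Loss signaled by a retransmission timeout."""
    ssthresh = max(min(cwnd, advertised_window) // 2, 2)   # half the window, at least 2 segments
    return 1, ssthresh                                      # back to one segment: Slow Start

def on_ack(cwnd, ssthresh):
    """Window growth on each new acknowledgement."""
    if cwnd < ssthresh:
        return cwnd + 1              # Slow Start region: exponential growth per round trip
    return cwnd + 1.0 / cwnd         # Congestion Avoidance: about one segment per round trip

cwnd, ssthresh = 1.0, 16
for _ in range(60):                  # grow through Slow Start and into Congestion Avoidance
    cwnd = on_ack(cwnd, ssthresh)
cwnd, ssthresh = on_timeout(cwnd, advertised_window=64)
print(cwnd, ssthresh)                # -> 1, and half the window at the time of loss
```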
4.3 Fast Retransmit
When a duplicate ACK is received, the sender does not know if it is because a TCP segment was lost or simply that a segment was delayed and received out of order at the receiver. If the receiver can re-order segments, it should not be long before the receiver sends the latest expected acknowledgement. Typically no more than one or two duplicate ACKs should be received when simple out of order conditions exist. If, however, more than two duplicate ACKs are received by the sender, it is a strong indication that at least one segment has been lost. The TCP sender assumes that enough time has elapsed for all segments to be properly re-ordered, since the receiver has had enough time to send three duplicate ACKs.
When three or more duplicate ACKs are received, the sender does not even wait for a retransmission timer to expire before retransmitting the segment (as indicated by the position of the duplicate ACK in the byte stream). This process is called the Fast Retransmit algorithm and was first defined in [7]. Immediately following Fast Retransmit is the Fast Recovery algorithm.
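The trigger condition is easy to express. In the sketch below the threshold of three duplicate ACKs is the value used by the standard algorithm; the state dictionary and the retransmit callback are hypothetical stand-ins for the sender's real bookkeeping.

```python
DUP_ACK_THRESHOLD = 3

def ack_received(ack, state, retransmit):
    """Count duplicate ACKs and fire Fast Retransmit on the third one."""
    if ack == state["last_ack"]:
        state["dup_count"] += 1
        if state["dup_count"] == DUP_ACK_THRESHOLD:
            retransmit(ack)                     # resend the segment starting at this byte
    else:
        state["last_ack"] = ack                 # new data acknowledged; reset the counter
        state["dup_count"] = 0

state = {"last_ack": 1000, "dup_count": 0}
for ack in (2000, 2000, 2000, 2000):            # one new ACK followed by three duplicates
    ack_received(ack, state, retransmit=lambda seq: print("fast retransmit at", seq))
```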
4.4 Fast Recovery
Since the Fast Retransmit algorithm is used when duplicate ACKs are being received, the TCP sender has implicit knowledge that there is data still flowing to the receiver. Why? The reason is that duplicate ACKs can only be generated when a segment is received. This is a strong indication that serious network congestion may not exist and that the lost segment was a rare event. So instead of reducing the flow of data abruptly by going all the way into Slow Start, the sender only enters Congestion Avoidance mode.
Rather than start at a window of one segment as in Slow Start mode, the sender resumes transmission with a larger window, incrementing as if in Congestion Avoidance mode. This allows for higher throughput under the condition of only moderate congestion [7].
5. Conclusions
TCP is a fairly complex protocol that handles the brunt of the functionality in a packet switched network such as the Internet. Supporting the reliable delivery of data on a packet switched network is not a trivial task. This document only scratches the surface of TCP internals, but hopefully it provides the reader with an appreciation of, and a starting point for, further interest in TCP. Even after almost 20 years of standardization, the amount of work that goes into supporting and designing reliable packet switched networks has not slowed. It is an area of great activity and there are many problems to be solved. As the Internet continues to grow, our reliance on TCP will become increasingly important. It is therefore imperative for network engineers, designers and researchers to be as well versed in the technology as possible.
Footnotes
1. The word "segment" is the term used to describe TCP's unit of data transmitted to a receiver. TCP determines the appropriate segment size rather than leaving it up to higher layer protocols and applications.
2. Duplicate packets are typically caused by retransmissions, where the first packet may have been delayed and the second sent due to the lack of an acknowledgement. The receiver may then receive two identical packets.
3. As opposed to a connectionless protocol such as the User Datagram Protocol (UDP).
4. There are additional details of the connection establishment, data transfer and termination phases that are beyond the scope of this document. For curious readers, I recommend consulting a more complete reference such as [4], [11] and of course the official standard RFC 793 [10].
5. It was discovered early on that some implementations of TCP performed poorly due to this scenario. It has been termed the silly window syndrome and is documented in [2].
6. Although it is possible, it is not very common for TCP to be operating in the "half-close state". See [11] for further details.
7. We assume in this example that bytes are immediately acknowledged so that the window can move forward. In practice the sender's window shrinks and grows dynamically as acknowledgements arrive over time.
Abbreviations
ACK | Acknowledgement
bit | binary digit
IETF | Internet Engineering Task Force
IP | Internet Protocol
ISN | Initial Sequence Number
RFC | Request For Comments
TCP | Transmission Control Protocol
TCP/IP | Transmission Control Protocol/Internet Protocol
UDP | User Datagram Protocol
References
[1] Robert Braden. Requirements for Internet Hosts - Communication Layers, October 1989, RFC 1122.
[2] David D. Clark. Window and Acknowledgement Strategy in TCP, July 1982, RFC 813.
[3] David D. Clark. The Design Philosophy of the DARPA Internet Protocols. In Proceedings SIGCOMM '88, Computer Communications Review Vol. 18, No. 4, August 1988, pp. 106-114.
[4] Douglas E. Comer. Internetworking with TCP/IP, Volume I: Principles, Protocols and Architecture. Prentice Hall, ISBN: 0-13-216987-8. March 24, 1995.
[5] Internet Assigned Numbers Authority. Port Number Assignment, February 2000.
[6] Van Jacobson. Congestion Avoidance and Control. Computer Communications Review, Volume 18, Number 4, pp. 314-329, August 1988.
[7] Van Jacobson. Modified TCP Congestion Avoidance Algorithm. end-2-end-interest mailing list, April 30, 1990.
[8] S. Keshav. An Engineering Approach to Computer Networking: ATM Networks, the Internet, and the Telephone Network. Addison Wesley, ISBN: 0-201-63442-2. July 1997.
[9] John Kristoff. TCP Congestion Control, March 2000.
[10] Jon Postel. Transmission Control Protocol, September 1981, RFC 793.
[11] W. Richard Stevens. TCP/IP Illustrated, Volume 1: The Protocols. Addison Wesley, ISBN: 0-201-63346-9. January 1994.
Last updated: April 24, 2000