The TIME_WAIT State in Sockets

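On a server that handles many concurrent short-lived connections, if the server calls closesocket immediately after processing each client request, the closed connections enter the TIME_WAIT state. If the client then opens another 2000 concurrent connections, some of them fail to connect. Forcing an abortive close with linger solves this, but linger can cause data loss. With a linger value of 0 the close is abortive, and connections succeed no matter how many are opened concurrently; with a non-zero value some connections still fail. (Call setsockopt to turn on the socket's linger flag and set the linger time to 0.)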
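See the TCP/IP RFC documents. TIME_WAIT is a state that necessarily appears when a TCP connection is torn down. It cannot be avoided, because it is part of the TCP protocol. On Windows, the registry can be edited to shorten it. TIME_WAIT lasts 2MSL, which defaults to 4 minutes. You can change this through the following registry value:
TcpTimedWaitDelay
for example, shortening it to 30 seconds.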

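TCP must ensure that all data is delivered in every circumstance where delivery is possible. When you close a socket, the side that closes actively enters TIME_WAIT, while the side that closes passively moves to CLOSED; this does ensure that all data gets through. A socket is closed through a four-way handshake in which the two ends exchange messages. When one end calls close(), it is saying that it has no more data to send. It might seem that once the handshake completes, both sockets should simply be in the CLOSED state. There are two problems with that. First, nothing guarantees that the final ACK is delivered. Second, leftover packets (wandering duplicates) may still be in the network, and they must be handled correctly too.
The TCP state machine defines how the two sides close:
[Figure: close sequence of the two sides in the TCP state machine]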

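Suppose the final ACK is lost. The server then retransmits its last FIN, so the client must keep state about the connection in order to resend the ACK. If it kept no state, the client would answer the FIN with an RST, and the server would treat the RST as an error. For TCP to finish its work and terminate the data flow in both directions cleanly, all four segments of the four-way handshake must be delivered, with none lost. That is why a socket stays in TIME_WAIT after it is closed: it is waiting in case it has to resend the ACK.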
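The lost-ACK argument above can be sketched as a small model. This is only an illustration of the reasoning, not a TCP implementation; the function name `passive_close_outcome` and the outcome strings are hypothetical.

```python
def passive_close_outcome(active_side_keeps_time_wait: bool, final_ack_lost: bool) -> str:
    """Outcome seen by the passive closer (waiting in LAST_ACK) in a simplified model."""
    if not final_ack_lost:
        # The final ACK arrived, so the passive side simply moves to CLOSED.
        return "closed cleanly"
    # The final ACK was lost, so the passive side retransmits its FIN.
    if active_side_keeps_time_wait:
        # The active side still remembers the connection and resends the ACK.
        reply = "ACK"
    else:
        # The active side is already CLOSED and has no state left, so it answers with RST.
        reply = "RST"
    return "closed cleanly" if reply == "ACK" else "reset reported as an error"


print(passive_close_outcome(active_side_keeps_time_wait=True, final_ack_lost=True))
print(passive_close_outcome(active_side_keeps_time_wait=False, final_ack_lost=True))
```

Only the case where the ACK is lost and the active side has already forgotten the connection produces a spurious error, which is the case TIME_WAIT exists to prevent.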
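Suppose both ends of a connection have called close() and both reach CLOSED with no TIME_WAIT state. The following could then happen. A new connection is established using exactly the same IP addresses and ports as the previous one; this later connection is called an incarnation of the earlier one. Suppose also that datagrams from the original connection are still in the network. The new connection could then receive datagrams that belong to the old one. To prevent this, TCP does not allow a new connection to be established from a socket that is in TIME_WAIT. A socket in TIME_WAIT moves to CLOSED after waiting twice the MSL. (It is twice the MSL because the MSL is the time from when a datagram is sent in one direction until it is considered lost. A datagram can become a leftover either while it is in transit or while its response is in transit, so being sure that both a datagram and its response have been discarded takes 2*MSL.) It follows that by the time a new connection is successfully established, every leftover datagram from the earlier connection has disappeared from the network.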
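To see why a leftover datagram is dangerous to a new incarnation, here is a minimal sketch of a receive-window check. It is a simplification that tests only the first sequence number of a segment, and the sequence numbers and window size are made-up values for illustration.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers wrap around modulo 2**32


def in_receive_window(seg_seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """True if seg_seq lies in [rcv_nxt, rcv_nxt + rcv_wnd) modulo 2**32."""
    return (seg_seq - rcv_nxt) % SEQ_SPACE < rcv_wnd


# Hypothetical stale segment left over from the previous incarnation.
stale_seq = 1_000_500

# If the new incarnation happens to expect nearby sequence numbers,
# the stale segment looks like valid data.
print(in_receive_window(stale_seq, rcv_nxt=1_000_000, rcv_wnd=65_535))      # True

# If the new incarnation's sequence space is far away, it is rejected.
print(in_receive_window(stale_seq, rcv_nxt=3_000_000_000, rcv_wnd=65_535))  # False
```

Nothing in the segment itself says which incarnation it belongs to, so waiting out 2*MSL is what keeps the first case from happening.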
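Because of the problems associated with TIME_WAIT, we can keep a socket from entering it by setting the SO_LINGER option, which sends an RST in place of the normal four-way TCP termination. This is not a good idea, though: TIME_WAIT usually works in our favor.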
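For reference, this is roughly what setting SO_LINGER with a zero timeout looks like through Python's `socket` module. It is a sketch under assumptions: the helper names are hypothetical, and the `"ii"` layout for `struct linger` (two C ints) holds on Linux and most Unix systems, while Windows uses two unsigned shorts. As the text says, this discards unsent data and skips TIME_WAIT, so it is shown to explain the mechanism, not as a recommendation.

```python
import socket
import struct

LINGER_FORMAT = "ii"  # assumption: struct linger is two C ints (Linux, most Unix)


def enable_abortive_close(sock: socket.socket) -> None:
    """Set l_onoff=1, l_linger=0 so that close() sends an RST instead of a FIN."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack(LINGER_FORMAT, 1, 0))


def linger_setting(sock: socket.socket) -> tuple:
    """Read back (l_onoff, l_linger) from the socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.calcsize(LINGER_FORMAT))
    return struct.unpack(LINGER_FORMAT, raw)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_abortive_close(sock)
onoff, seconds = linger_setting(sock)
print("linger enabled:", onoff != 0, "timeout:", seconds)
sock.close()
```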

After a client and a server establish a TCP/IP connection and the socket is closed, the server-side port of the connection is in the TIME_WAIT state.

Does every socket that performs an active close enter TIME_WAIT?
Is there any case in which an actively closed socket goes straight to CLOSED?

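The side that closes actively enters TIME_WAIT after sending the final ACK and stays there for 2MSL (maximum segment lifetime). This is an essential part of TCP/IP, so it cannot be "solved" away; it is how the designers of TCP/IP intended it to work.
There are two main reasons:
1. To prevent packets from the previous connection that went astray and later reappear from affecting a new connection. (After 2MSL, all duplicate packets from the previous connection have disappeared.)
2. To close the TCP connection reliably. The final ACK sent by the active closer (acknowledging the peer's FIN) may be lost, in which case the passive side resends its FIN. If the active side were already in CLOSED, it would answer with an RST instead of an ACK. So the active side has to be in TIME_WAIT, not CLOSED.
TIME_WAIT does not consume many resources unless the host is under attack. Also, a connection that is aborted rather than closed in an orderly way, for example when a send or recv on one side times out, goes directly to CLOSED.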

The following passage from the socket FAQ also explains this well, so it is quoted here:
2.7. Please explain the TIME_WAIT state.

Remember that TCP guarantees all data transmitted will be delivered, if at all possible. When you close a socket, the server goes into a TIME_WAIT state, just to be really really sure that all the data has gone through. When a socket is closed, both sides agree by sending messages to each other that they will send no more data. This, it seemed to me was good enough, and after the handshaking is done, the socket should be closed. The problem is two-fold. First, there is no way to be sure that the last ack was communicated successfully. Second, there may be "wandering duplicates" left on the net that must be dealt with if they are delivered.

Andrew Gierth (andrew@erlenstar.demon.co.uk) helped to explain the closing sequence in the following usenet posting:
Assume that a connection is in ESTABLISHED state, and the client is
about to do an orderly release. The client's sequence no. is Sc, and
the server's is Ss.

    Client                                                   Server
    ======                                                   ======
    ESTABLISHED                                              ESTABLISHED
    (client closes)
    ESTABLISHED                                              ESTABLISHED
                 <CTL=FIN+ACK><SEQ=Sc><ACK=Ss> ------->>
    FIN_WAIT_1
                 <<-------- <CTL=ACK><SEQ=Ss><ACK=Sc+1>
    FIN_WAIT_2                                               CLOSE_WAIT
                 <<-------- <CTL=FIN+ACK><SEQ=Ss><ACK=Sc+1>  (server closes)
                                                             LAST_ACK
                 <CTL=ACK>,<SEQ=Sc+1><ACK=Ss+1> ------->>
    TIME_WAIT                                                CLOSED
    (2*msl elapses...)
    CLOSED

Note: the +1 on the sequence numbers is because the FIN counts as one
byte of data. (The above diagram is equivalent to fig. 13 from RFC793).
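The orderly close described here (RFC 793, fig. 13) can also be written as a transition table. The sketch below is a simplified model: the event names are hypothetical labels, and it leaves out simultaneous close and other paths through the full TCP state machine.

```python
# (current state, event) -> next state, for an orderly close only
CLOSE_TRANSITIONS = {
    ("ESTABLISHED", "close"): "FIN_WAIT_1",      # active side sends its FIN
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",    # peer acknowledged the FIN
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",     # peer's FIN arrives, final ACK is sent
    ("TIME_WAIT", "2msl_timeout"): "CLOSED",     # 2*MSL elapses
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",   # passive side acknowledges the FIN
    ("CLOSE_WAIT", "close"): "LAST_ACK",         # passive side sends its own FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",          # final ACK received
}


def walk(events, state="ESTABLISHED"):
    """Return the list of states visited while applying events in order."""
    path = [state]
    for event in events:
        state = CLOSE_TRANSITIONS[(state, event)]
        path.append(state)
    return path


client_path = walk(["close", "recv_ack", "recv_fin", "2msl_timeout"])
server_path = walk(["recv_fin", "close", "recv_ack"])
print("client:", " -> ".join(client_path))
print("server:", " -> ".join(server_path))
```

Only the side that calls close first passes through TIME_WAIT; the passive side goes from LAST_ACK straight to CLOSED.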
Now consider what happens if the last of those packets is dropped in the network. The client has done with the connection; it has no more data or control info to send, and never will have. But the server does not know whether the client received all the data correctly; that's what the last ACK segment is for. Now the server may or may not care whether the client got the data, but that is not an issue for TCP; TCP is a reliable protocol, and must distinguish between an orderly connection close where all data is transferred, and a connection abort where data may or may not have been lost.

So, if that last packet is dropped, the server will retransmit it (it is, after all, an unacknowledged segment) and will expect to see a suitable ACK segment in reply. If the client went straight to CLOSED, the only possible response to that retransmit would be a RST, which would indicate to the server that data had been lost, when in fact it had not been. (Bear in mind that the server's FIN segment may, additionally, contain data.)

DISCLAIMER: This is my interpretation of the RFCs (I have read all the TCP-related ones I could find), but I have not attempted to examine implementation source code or trace actual connections in order to verify it. I am satisfied that the logic is correct, though.

More commentary from Vic: The second issue was addressed by Richard Stevens (rstevens@noao.edu, author of "Unix Network Programming", see ``1.5 Where can I get source code for the book [book title]?''). I have put together quotes from some of his postings and email which explain this. I have brought together paragraphs from different postings, and have made as few changes as possible.
From Richard Stevens (rstevens@noao.edu):

If the duration of the TIME_WAIT state were just to handle TCP's full-duplex close, then the time would be much smaller, and it would be some function of the current RTO (retransmission timeout), not the MSL (the packet lifetime).

A couple of points about the TIME_WAIT state.

o The end that sends the first FIN goes into the TIME_WAIT state, because that is the end that sends the final ACK. If the other end's FIN is lost, or if the final ACK is lost, having the end that sends the first FIN maintain state about the connection guarantees that it has enough information to retransmit the final ACK.
o Realize that TCP sequence numbers wrap around after 2**32 bytes have been transferred. Assume a connection between A.1500 (host A, port 1500) and B.2000. During the connection one segment is lost and retransmitted. But the segment is not really lost, it is held by some intermediate router and then re-injected into the network. (This is called a "wandering duplicate".) But in the time between the packet being lost & retransmitted, and then reappearing, the connection is closed (without any problems) and then another connection is established between the same host, same port (that is, A.1500 and B.2000; this is called another "incarnation" of the connection). But the sequence numbers chosen for the new incarnation just happen to overlap with the sequence number of the wandering duplicate that is about to reappear. (This is indeed possible, given the way sequence numbers are chosen for TCP connections.) Bingo, you are about to deliver the data from the wandering duplicate (the previous incarnation of the connection) to the new incarnation of the connection. To avoid this, you do not allow the same incarnation of the connection to be reestablished until the TIME_WAIT state terminates.
Even the TIME_WAIT state doesn't completely solve the second problem,
given what is called TIME_WAIT assassination. RFC 1337 has more
details.
o The reason that the duration of the TIME_WAIT state is 2*MSL is
that the maximum amount of time a packet can wander around a
network is assumed to be MSL seconds. The factor of 2 is for the
round-trip. The recommended value for MSL is 120 seconds, but
Berkeley-derived implementations normally use 30 seconds instead.
This means a TIME_WAIT delay between 1 and 4 minutes. Solaris 2.x
does indeed use the recommended MSL of 120 seconds.
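The figures quoted above follow directly from the 2*MSL rule. A tiny worked example, where the function name `time_wait_seconds` is just an illustrative helper:

```python
def time_wait_seconds(msl_seconds: float) -> float:
    """TIME_WAIT lasts twice the maximum segment lifetime (one MSL per direction)."""
    return 2 * msl_seconds


for label, msl in [("recommended MSL (120 s)", 120), ("Berkeley-derived MSL (30 s)", 30)]:
    print(f"{label}: TIME_WAIT = {time_wait_seconds(msl) / 60:g} min")
```

This gives the 4-minute and 1-minute ends of the range mentioned in the text.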
A wandering duplicate is a packet that appeared to be lost and was
retransmitted. But it wasn't really lost ... some router had
problems, held on to the packet for a while (order of seconds, could
be a minute if the TTL is large enough) and then re-injects the packet
back into the network. But by the time it reappears, the application
that sent it originally has already retransmitted the data contained
in that packet.
Because of these potential problems with TIME_WAIT assassinations, one
should not avoid the TIME_WAIT state by setting the SO_LINGER option
to send an RST instead of the normal TCP connection termination
(FIN/ACK/FIN/ACK). The TIME_WAIT state is there for a reason; it's
your friend and it's there to help you :-)
I have a long discussion of just this topic in my just-released
"TCP/IP Illustrated, Volume 3". The TIME_WAIT state is indeed, one of
the most misunderstood features of TCP.
I'm currently rewriting "Unix Network Programming" (see ``1.5 Where
can I get source code for the book [book title]?''). and will include
lots more on this topic, as it is often confusing and misunderstood.
An additional note from Andrew:
Closing a socket: if SO_LINGER has not been called on a socket, then
close() is not supposed to discard data. This is true on SVR4.2 (and,
apparently, on all non-SVR4 systems) but apparently not on SVR4; the
use of either shutdown() or SO_LINGER seems to be required to
guarantee delivery of all data.
