socket send call getting blocked for so long


Problem Description





I send 2 bytes of app data on the socket (blocking) every 10 seconds, but in the last instance in the log below the send call got blocked for more than 40 seconds.

  • 2012-06-13 12:02:46.653417|INFO|before send
  • 2012-06-13 12:02:46.653457|INFO|after send (2)
  • 2012-06-13 12:02:57.566898|INFO|before send
  • 2012-06-13 12:02:57.566962|INFO|after send (2)
  • 2012-06-13 12:03:08.234060|INFO|before send
  • 2012-06-13 12:03:08.234101|INFO|after send (2)
  • **2012-06-13 12:03:19.010743|INFO|before send
  • 2012-06-13 12:04:00.969162|INFO|after send (2)**

The default TCP send buffer size on the machine (Linux) is 65536.

The 2-byte payload is a heartbeat to a server, and the server expects the client to send a heartbeat at least once every 15 seconds.

Also, I did not disable Nagle's algorithm.

The question is - can the send call block for as long as 40 seconds? And it happens only sporadically; this instance occurred after close to 12 hours of running.

As far as I know, the send call should just copy the data into the TCP send buffer.

publish is called every 10 seconds. It is not a gradual slowdown of the send call; it happens suddenly, once, and then the app exits because the socket on the other side gets closed.

    int publish(char* buff, int size) const {
        /* Adds the 0x0A to the end */
        buff[size] = _eolchar;

        if (_debugMode)
        {
            ACE_DEBUG((MY_INFO "before send\n"));
        }

        int ret = _socket.send((void*)buff, size + 1);

        if (_debugMode)
        {
            ACE_DEBUG((MY_INFO "after send (%d)\n", ret));
            //std::cout << "after send " << ret << std::endl;
        }

        if (ret < 1)
        {
            ACE_DEBUG((MY_ERROR "Socket error, FH going down\n"));
            ACE_OS::sleep(1);
            abort();
        }
        return ret;
    }

Solution

When using the blocking send() call, from the viewpoint of your application you can think of the remote TCP buffer, the network, and the local sending TCP buffer as one big buffer.

That is, if the remote application gets delayed in reading new bytes from its TCP buffer, eventually your local TCP buffer will become (nearly) full. If you try to send() a new payload that overflows the TCP buffer, the send() implementation (the kernel system call) won't return control to your application until the TCP buffer has enough room to store that payload.

The only way to reach that state is for the remote application to not read enough bytes. A typical scenario in a test environment is when the remote application is paused on a breakpoint ... :-)

This is what we call a SLOW CONSUMER issue. If you share that diagnosis, then there are multiple ways of getting rid of that issue:

  1. If you have control over the remote application, make it fast enough that the local application won't get blocked.
  2. If you don't have control of the remote application, there are multiple possible answers:
    • It may be acceptable for your own needs to block for up to 40 seconds.
    • If not, you need to use a non-blocking version of the send() system call. From there, there are multiple possible policies; I describe one below. (Hold on please! :-) )

You can try to use a dynamic array which acts as a fake sending TCP FIFO and grows when the send call returns EWOULDBLOCK. In that case you will likely have to use the select() system call to detect when the remote application catches up with the pace, and send it the queued data first.

It can be a little trickier than the simple publish() function you have here (while quite common in most network applications). You also have to know that nothing prevents the dynamic buffer from growing to the point where you no longer have any free memory, at which point your local application could crash. A typical policy in "real-time" network applications is to choose an arbitrary maximum size for the buffer and close the TCP connection when it is reached, thus preventing your local application from running out of free memory. Choose that maximum wisely, since it depends on the number of potential slow-consumer connections.
