Can't seem to get a timeout working when connecting to a socket

Problem description

I'm trying to supply a timeout for connect(). I've searched around and found several articles related to this. I've coded up what I believe should work but unfortunately I get no error reported from getsockopt(). But then when I come to the write() it fails with an errno of 107 - ENOTCONN.

A couple of points. I'm running on Fedora 23. The docs for connect() say it should fail with an errno of EINPROGRESS for a connect that has not completed yet; however, I was seeing EAGAIN, so I added that to my check. Currently my socket server sets the backlog to zero in the listen() call. Many of the calls succeed, but the ones that fail all fail in the write() call with the 107 (ENOTCONN) I mentioned.

I'm hoping I'm just…

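#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/un.h>
#include <unistd.h>

// timeout is in microseconds; 0 means a plain blocking connect().
int domain_socket_send(const char* socket_name, unsigned char* buffer,
        unsigned int length, unsigned int timeout)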
{
    struct sockaddr_un addr;
    int fd = -1;
    int result = 0;

    // Create socket.

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        {
        result = -1;
        goto done;
        }

    if (timeout != 0)
        {

        // Enabled non-blocking.

        int flags;
        flags = fcntl(fd, F_GETFL);
        fcntl(fd, F_SETFL, flags | O_NONBLOCK);
        }

    // Set socket name.

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, socket_name, sizeof(addr.sun_path) - 1);

    // Connect.

    result = connect(fd, (struct sockaddr*) &addr, sizeof(addr));
    if (result == -1)
        {

        // If some error then we're done.

        if ((errno != EINPROGRESS) && (errno != EAGAIN))
            goto done;

        fd_set write_set;
        struct timeval tv;

        // Set timeout.

        tv.tv_sec = timeout / 1000000;
        tv.tv_usec = timeout % 1000000;

        unsigned int iterations = 0;
        while (1)
            {
            FD_ZERO(&write_set);
            FD_SET(fd, &write_set);

            result = select(fd + 1, NULL, &write_set, NULL, &tv);
            if (result == -1)
                goto done;
            else if (result == 0)
                {
                result = -1;
                errno = ETIMEDOUT;
                goto done;
                }
            else
                {
                if (FD_ISSET(fd, &write_set))
                    {
                    socklen_t len;
                    int socket_error;
                    len = sizeof(socket_error);

                    // Get the result of the connect() call.

                    result = getsockopt(fd, SOL_SOCKET, SO_ERROR,
                            &socket_error, &len);
                    if (result == -1)
                        goto done;

                    // I think SO_ERROR will be zero for a successful
                    // result and errno otherwise.

                    if (socket_error != 0)
                        {
                        result = -1;
                        errno = socket_error;
                        goto done;
                        }

                    // Now that the socket is writable issue another connect.

                    result = connect(fd, (struct sockaddr*) &addr,
                            sizeof(addr));
                    if (result == 0)
                        {
                        if (iterations > 1)
                            {
                            printf("connect() succeeded on iteration %u\n",
                                    iterations);
                            }
                        break;
                        }
                    else
                        {
                        if ((errno != EAGAIN) && (errno != EINPROGRESS))
                            {
                            int err = errno;
                            printf("second connect() failed, errno = %d\n",
                                    errno);
                            errno = err;
                            goto done;
                            }
                        iterations++;
                        }
                    }
                }
            }
        }

    // If we put the socket in non-blocking mode then put it back
    // to blocking mode.

    if (timeout != 0)
        {

        // Turn off non-blocking.

        int flags;
        flags = fcntl(fd, F_GETFL);
        fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
        }

    // Write buffer.

    result = write(fd, buffer, length);
    if (result == -1)
        {
        int err = errno;
        printf("write() failed, errno = %d\n", err);
        errno = err;
        goto done;
        }

done:
    if (result == -1)
        result = errno;
    else
        result = 0;
    if (fd != -1)
        {
        shutdown(fd, SHUT_RDWR);
        close(fd);
        }
    return result;
}

UPDATE 04/05/2016:

It dawned on me that maybe I need to call connect() multiple times until successful, after all this is non-blocking io not async io. Just like I have to call read() again when there is data to read after encountering an EAGAIN on a read(). In addition, I found the following SO question:

Using select() for non-blocking sockets to connect always returns 1

in which EJP's answer says you need to issue multiple connect()'s. Also, from the book EJP references:

https://books.google.com/books?id=6H9AxyFd0v0C&pg=PT681&lpg=PT681&dq=stevens+and+wright+tcp/ip+illustrated+non-blocking+connect&source=bl&ots=b6kQar6SdM&sig=kt5xZubPZ2atVxs2VQU4mu7NGUI&hl=en

it seems to indicate you need to issue multiple connect()'s. I've modified the code snippet in this question to call connect() until it succeeds. I probably still need to make changes around possibly updating the timeout value passed to select(), but that's not my immediate question.

Calling connect() multiple times appears to have fixed my original problem, which was that I was getting ENOTCONN when calling write(), I guess because the socket was not connected. However, you can see from the code that I'm tracking how many times through the select loop until connect() succeeds. I've seen the number go into the thousands. This gets me worried that I'm in a busy wait loop. Why is the socket writable even though it's not in a state that connect() will succeed? Is calling connect() clearing that writable state and it's getting set again by the OS for some reason, or am I really in a busy wait loop?

Thanks,
Nick

Recommended answer

From http://lxr.free-electrons.com/source/net/unix/af_unix.c

static int unix_writable(const struct sock *sk)
{
        return sk->sk_state != TCP_LISTEN &&
               (atomic_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
}

I'm not sure what these buffers are that are being compared, but it looks obvious that the connected state of the socket is not being checked. So unless these buffers are modified when the socket becomes connected it would appear my unix socket will always be marked as writable and thus I can't use select() to determine when the non-blocking connect() has finished.
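
A small way to check this reading of the kernel source is to probe a socket that has never been connected. The following is only a sketch: `fd_is_writable` and `unconnected_unix_socket_is_writable` are hypothetical names, and the expectation that the probe reports "writable" is an inference from the excerpt above for Linux, not something guaranteed on other systems.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask select() whether fd is writable, without
 * blocking. Returns 1 if writable, 0 if not, -1 on error.
 */
static int fd_is_writable(int fd)
{
    fd_set write_set;
    struct timeval tv = { 0, 0 };   /* zero timeout: return immediately */

    FD_ZERO(&write_set);
    FD_SET(fd, &write_set);

    int n = select(fd + 1, NULL, &write_set, NULL, &tv);
    if (n == -1)
        return -1;
    return (n > 0 && FD_ISSET(fd, &write_set)) ? 1 : 0;
}

/* Create a fresh, never-connected AF_UNIX stream socket and probe it. */
static int unconnected_unix_socket_is_writable(void)
{
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;
    int writable = fd_is_writable(fd);
    close(fd);
    return writable;
}
```

If `unconnected_unix_socket_is_writable()` returns 1, writability from select() says nothing about whether connect() has completed for this socket type, which would be consistent with the loop in the question spinning thousands of times.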

And based on http://lxr.free-electrons.com/source/net/unix/af_unix.c

static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
                               int addr_len, int flags)
.
.
.
        timeo = sock_sndtimeo(sk, flags & O_NONBLOCK);
.
.
.
        if (unix_recvq_full(other)) {
                err = -EAGAIN;
                if (!timeo)
                        goto out_unlock;

                timeo = unix_wait_for_peer(other, timeo);
.
.
.

it appears that setting the send timeout might be capable of timing out the connect, which also matches the documentation for SO_SNDTIMEO at http://man7.org/linux/man-pages/man7/socket.7.html.
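
A minimal sketch of that idea follows, assuming Linux and a blocking socket: set SO_SNDTIMEO before calling connect() and let the kernel enforce the timeout, instead of using O_NONBLOCK with select(). The function name `connect_with_send_timeout` is hypothetical, and this has not been verified against the server in the question. Judging by the excerpt above, a connect() that times out because the peer's queue is full would be expected to fail with EAGAIN.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: connect a blocking AF_UNIX stream socket to
 * socket_name with SO_SNDTIMEO set to timeout_usec microseconds.
 * Returns the connected fd, or -1 with errno set.
 */
static int connect_with_send_timeout(const char* socket_name,
        unsigned int timeout_usec)
{
    struct sockaddr_un addr;
    struct timeval tv;
    int err;

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    // A zero timeval means "never time out", per socket(7).
    tv.tv_sec = timeout_usec / 1000000;
    tv.tv_usec = timeout_usec % 1000000;
    if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) == -1)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, socket_name, sizeof(addr.sun_path) - 1);

    if (connect(fd, (struct sockaddr*) &addr, sizeof(addr)) == -1)
        goto fail;
    return fd;

fail:
    // Preserve the original errno across close().
    err = errno;
    close(fd);
    errno = err;
    return -1;
}
```

Because the socket stays in blocking mode, a successful return means the connection is established, so no second connect() or SO_ERROR check is needed. The same timeout also applies to later write() calls on the fd unless SO_SNDTIMEO is reset.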

Thanks,
Nick
