vmsplice() and TCP

Question

In the original vmsplice() implementation, it was suggested that if you had a user-land buffer 2x the maximum number of pages that could fit in a pipe, a successful vmsplice() on the second half of the buffer would guarantee that the kernel was done using the first half of the buffer.
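
For illustration, a minimal sketch of that suggested double-buffering scheme might look like the following (the names and the 64 KiB pipe capacity are assumptions for the example; the real capacity can be queried with fcntl(fd, F_GETPIPE_SZ)):

// Hedged sketch of the double-buffering idea (hypothetical names): the user
// buffer is 2x the pipe capacity, and the claim was that a successful
// vmsplice() of one half implies the kernel is done with the other half.
#include <fcntl.h>
#include <sys/uio.h>

static const size_t kPipeCapacity = 64 * 1024;    // assumed pipe size in bytes
static char gBuffer[2 * kPipeCapacity];           // user-land buffer, 2x pipe size

static void produceLoop(int pipeWriteFd)
{
    int half = 0;
    for (;;) {
        char* p = gBuffer + half * kPipeCapacity;
        // ... fill p[0 .. kPipeCapacity-1] with fresh data here ...
        struct iovec iov = { p, kPipeCapacity };
        while (iov.iov_len > 0) {                 // handle partial vmsplice()
            ssize_t n = ::vmsplice(pipeWriteFd, &iov, 1, 0);
            if (n < 0)
                return;                           // vmsplice() failed
            iov.iov_base = static_cast<char*>(iov.iov_base) + n;
            iov.iov_len -= n;
        }
        half ^= 1;                                // switch to the other half
        // As discussed below, the "other half is now free" assumption does
        // not hold once the pipe contents are spliced to a TCP socket.
    }
}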

But that was not true after all, and particularly for TCP, the kernel pages would be kept until receiving ACK from the other side. Fixing this was left as future work, and thus for TCP, the kernel would still have to copy the pages from the pipe.

vmsplice() has the SPLICE_F_GIFT option that sort-of deals with this, but the problem is that this exposes two other problems - how to efficiently get fresh pages from the kernel, and how to reduce cache thrashing. The first issue is that mmap requires the kernel to clear the pages, and the second issue is that although mmap might use the fancy kscrubd feature in the kernel, that increases the working set of the process (cache thrashing).
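
To make that concrete, a hedged sketch of the SPLICE_F_GIFT variant could look like this (names are hypothetical, and len is assumed to be a multiple of the page size, since gifted memory must be page-aligned):

// Sketch of gifting freshly mmap()ed pages into the pipe. Gifted pages belong
// to the kernel afterwards, so every iteration needs new pages - the mmap()
// below is exactly where the kernel has to hand out cleared pages.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/uio.h>

static ssize_t giftToPipe(int pipeWriteFd, size_t len)
{
    // Fresh, zeroed, page-aligned memory from the kernel.
    void* p = ::mmap(nullptr, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    // ... fill p with the payload ...

    struct iovec iov = { p, len };
    ssize_t n = ::vmsplice(pipeWriteFd, &iov, 1, SPLICE_F_GIFT);

    // The pages are now the kernel's to keep; drop our mapping and never
    // touch them again.
    ::munmap(p, len);
    return n;
}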

Based on this, I have these questions:

  • What is the current state for notifying userland about the safe re-use of pages? I am especially interested in pages splice()d onto a socket (TCP). Did anything happen during the last 5 years?
  • Is mmap / vmsplice / splice / munmap the current best practice for zero-copy in a TCP server, or do we have better options today?

Answer

Yes, because the TCP socket holds on to the pages for an indeterminate time, you cannot use the double-buffering scheme mentioned in the example code. Also, in my use case the pages come from a circular buffer, so I cannot gift the pages to the kernel and allocate fresh pages. I can verify that I am seeing data corruption in the received data.

I resorted to polling the level of the TCP socket's send queue until it drains to 0 (for TCP, SIOCOUTQ reports the amount of data in the send queue that has not yet been acknowledged, so a reading of 0 means the peer has ACKed everything and the kernel no longer needs the spliced pages). This fixes the data corruption, but it is suboptimal because waiting for the send queue to drain to 0 hurts throughput.

// vmsplice the user pages into the pipe, then splice the pipe into the socket
ssize_t n = ::vmsplice(mVmsplicePipe.fd.w, &iov, 1, 0);
while (n > 0) {
    // splice pipe to socket
    ssize_t m = ::splice(mVmsplicePipe.fd.r, NULL, mFd, NULL, n, 0);
    if (m < 0)
        break;  // splice failed; bail out
    n -= m;
}

// Poll the socket's send queue until it drains to 0 (all data ACKed by the peer)
while (1) {
    int outsize = 0;
    int result;

    usleep(20000);

    result = ::ioctl(mFd, SIOCOUTQ, &outsize);
    if (result == 0) {
        LOG_NOISE("outsize %d", outsize);
    } else {
        LOG_ERR_PERROR("SIOCOUTQ");
        break;
    }
    //if (outsize <= (bufLen >> 1)) {
    if (outsize == 0) {
        LOG("outsize %d <= %u", outsize, bufLen >> 1);
        break;
    }
}
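
Putting the two fragments together, a self-contained sketch of the workaround might look like this (pipeWriteFd, pipeReadFd, sockFd and the helper name are hypothetical; the 20 ms poll interval is taken from the snippet above):

// Consolidated sketch: vmsplice the buffer into the pipe, splice the pipe into
// the TCP socket, then poll SIOCOUTQ until the peer has ACKed everything so
// the caller can safely reuse the buffer.
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/uio.h>
#include <linux/sockios.h>   // SIOCOUTQ
#include <unistd.h>

static ssize_t sendAndWaitForAck(int pipeWriteFd, int pipeReadFd, int sockFd,
                                 void* buf, size_t len)
{
    struct iovec iov = { buf, len };

    // Map the user pages into the pipe (no copy).
    ssize_t n = ::vmsplice(pipeWriteFd, &iov, 1, 0);
    if (n < 0)
        return -1;

    // Drain the pipe into the socket.
    ssize_t remaining = n;
    while (remaining > 0) {
        ssize_t m = ::splice(pipeReadFd, NULL, sockFd, NULL, remaining, 0);
        if (m < 0)
            return -1;
        remaining -= m;
    }

    // SIOCOUTQ == 0 means no un-ACKed data is left in the send queue, so the
    // kernel no longer references the spliced pages.
    for (;;) {
        int outsize = 0;
        if (::ioctl(sockFd, SIOCOUTQ, &outsize) != 0)
            return -1;
        if (outsize == 0)
            break;
        ::usleep(20000);   // 20 ms between polls, as in the snippet above
    }
    return n;
}

The fixed sleep keeps the sketch simple; it is also where the throughput cost mentioned above comes from, since the sender sits idle until the last ACK arrives.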
