Low latency isochronous out on OSX and Windows 10

Question

I'm trying to output isochronous data (generated programmatically) over High Speed USB 2 with very low latency, ideally around 1-2 ms. On Windows I'm using WinUsb, and on OSX I'm using IOKit.

There are two approaches I have thought of. I'm wondering which is best.

WinUsb is quite restrictive in what it allows, and requires each isochronous transfer to be a whole number of frames (1 frame = 1 ms). Therefore, to minimise latency, I submit transfers of one frame each in a loop, something like this:

for (;;)
{
    // Submit a 1-frame transfer ASAP.
    WinUsb_WriteIsochPipeAsap(..., &overlapped[i]);

    // Wait for the transfer from 2 frames ago to complete, for timing purposes. This
    // keeps the loop in sync with the USB frames.
    WinUsb_GetOverlappedResult(..., &overlapped[i-2], block=true);
}

This works fairly well and gives a latency of 2 ms. On OSX I can do a similar thing, though it is quite a bit more complicated. This is the gist of the code - the full code is too long to post here:

uint64_t frame = ...->GetBusFrameNumber(...) + 1;
for (;;)
{
    // Submit at the next available frame.
    for (a few attempts)
    {
        kr = ...->LowLatencyWriteIsochPipeAsync(...
                                            frame, // Start on this frame.
                                            &transfer[i]); // Callback
        if (kr == kIOReturnIsoTooOld)
            frame++; // Try the next frame.
        else if (kr == kIOReturnSuccess)
            break;
        else
            abort();
    }

    // Above, I pass a callback with a reference to a condition_variable. When
    // the transfer completes the condition_variable is triggered and wakes this up:
    transfer[i-5].waitForResult();

    // I have to wait for 5 frames ago on OSX, otherwise it skips frames.
}

Again this kind of works and gives a latency of around 3.5 ms. But it's not super-reliable.

OSX's low latency isochronous functions allow you to submit long transfers (e.g. 64 frames), and then regularly (max once per millisecond) update the frame list which says where the kernel has got to in reading the write buffer.

I think the idea is that you somehow wake up every N milliseconds (or microseconds), read the frame list, work out where you need to write to and do that. I haven't written code for this yet but I'm not entirely sure how to proceed, and there are no examples I can find.
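The "work out where you need to write to" step can be sketched as plain index arithmetic. This is only a sketch under assumptions: the `IsoRing` struct and both functions are hypothetical names, not IOKit API, and it assumes the long transfer is resubmitted back to back so that the shared buffer behaves like a ring of one-frame slots.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of one long isochronous transfer whose
// buffer is shared with the kernel. None of these names are IOKit API.
struct IsoRing {
    std::uint64_t startFrame;    // bus frame in which slot 0 is sent
    std::uint32_t frameCount;    // frames per transfer, e.g. 64
    std::uint32_t bytesPerFrame; // payload bytes per 1 ms frame
};

// Slot that is on the wire during busFrame, assuming the transfer is
// resubmitted back to back so the slots repeat every frameCount frames.
inline std::uint32_t slotForFrame(const IsoRing& r, std::uint64_t busFrame) {
    assert(r.frameCount > 0 && busFrame >= r.startFrame);
    return static_cast<std::uint32_t>((busFrame - r.startFrame) % r.frameCount);
}

// Byte offset at which fresh data should be written so that it goes
// out leadFrames frames after currentFrame.
inline std::size_t writeOffset(const IsoRing& r,
                               std::uint64_t currentFrame,
                               std::uint32_t leadFrames) {
    return static_cast<std::size_t>(slotForFrame(r, currentFrame + leadFrames)) *
           r.bytesPerFrame;
}
```

Here `currentFrame` would come from whatever reports the bus position (the frame list or the bus frame number), and `leadFrames` is the safety margin that sets the achievable latency.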

It doesn't seem to provide a callback when the frame list is updated so I suppose you have to use your own timer - CFRunLoopTimerCreate() and read the frame list from that callback?

Also I'm wondering if WinUsb allows a similar thing, because it also forces you to register a buffer so it can be simultaneously accessed by the kernel and user-space. I can't find any examples that explicitly say you can write to the buffer while the kernel is reading it though. Are you meant to use WinUsb_GetCurrentFrameNumber in a regular callback to work out where the kernel has got to in a transfer?

That would require getting a regular callback on Windows, which seems a bit tricky. The only way I've seen is to use multimedia timers, which have a minimum period of 1 millisecond (unless you use the undocumented NtSetTimerResolution?).
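For illustration, a fixed-rate wake-up can be sketched portably with standard C++ rather than a platform timer. This is an assumption-laden sketch: `tickDeadline` and `runPeriodic` are hypothetical helpers, it uses neither multimedia timers nor CFRunLoopTimerCreate(), and the real wake-up accuracy still depends on the OS scheduler and timer resolution. The one design point it shows is sleeping until absolute deadlines, so that a late wake-up does not accumulate as drift.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

using Clock = std::chrono::steady_clock;

// Absolute deadline of the n-th tick. Computing it from the start time
// (instead of "now + period") keeps late wake-ups from accumulating.
inline Clock::time_point tickDeadline(Clock::time_point start,
                                      Clock::duration period,
                                      Clock::rep n) {
    return start + period * n;
}

// Calls onTick(n) for n = 1..ticks, once per period, on this thread.
inline void runPeriodic(Clock::duration period, Clock::rep ticks,
                        const std::function<void(Clock::rep)>& onTick) {
    assert(period > Clock::duration::zero());
    const Clock::time_point start = Clock::now();
    for (Clock::rep n = 1; n <= ticks; ++n) {
        std::this_thread::sleep_until(tickDeadline(start, period, n));
        onTick(n); // e.g. read the frame position and refill the buffer
    }
}
```

A call such as `runPeriodic(std::chrono::milliseconds(1), 1000, callback)` would aim for a 1 kHz callback for one second. Whether the thread actually wakes within the frame is exactly the open question here.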

So my question is: Can I improve the "1-frame transfers" approach, or should I switch to a 1 kHz callback that tries to race the kernel? Example code would be much appreciated!

Recommended answer

(Too long for a comment, so…)

I can only address the OS X side of things. This part of the question:

I think the idea is that you somehow wake up every N milliseconds (or microseconds), read the frame list, work out where you need to write to and do that. I haven't written code for this yet but I'm not entirely sure how to proceed, and there are no examples I can find.

It doesn't seem to provide a callback when the frame list is updated so I suppose you have to use your own timer - CFRunLoopTimerCreate() and read the frame list from that callback?

Has me scratching my head over what you're trying to do. Where is your data coming from, where latency is critical but the data source does not already notify you when data is ready?

The idea is that your data is being streamed from some source, and as soon as any data becomes available, presumably when some completion for that data source gets called, you write all available data into the user/kernel shared data buffer at the appropriate location.
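As a rough sketch of that write step, assuming the shared buffer is treated as a circular byte buffer: `copyIntoRing` below is a hypothetical helper, not part of IOKit, and it would be called from the data source's completion with whatever bytes just became available.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies len bytes from src into a circular buffer, starting at offset
// and wrapping at ringSize. Returns the offset just past the new data,
// which is where the next write should begin.
inline std::size_t copyIntoRing(unsigned char* ring, std::size_t ringSize,
                                std::size_t offset,
                                const unsigned char* src, std::size_t len) {
    assert(ringSize > 0 && offset < ringSize && len <= ringSize);
    // Part that fits before the end of the buffer.
    const std::size_t first = std::min(len, ringSize - offset);
    std::memcpy(ring + offset, src, first);
    // Remainder (possibly zero bytes) wraps to the start.
    std::memcpy(ring, src + first, len - first);
    return (offset + len) % ringSize;
}
```

The sketch does not check whether the write overtakes the position the kernel is currently reading. That check needs the frame list and is left out here.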

So maybe you could explain in a little more detail what you're trying to do and I might be able to help.
