Should I try to use as many queues as possible?

Question

On my machine I have two queue families, one that supports everything and one that only supports transfer.

The queue family that supports everything has a queueCount of 16.

Now the spec states:

Command buffers submitted to different queues may execute in parallel or even out of order with respect to one another

Does that mean I should try to use all available queues for maximal performance?

Recommended answer

Yes, if your workload is highly independent, use separate queues.

If the queues need a lot of synchronization among themselves, that may kill any potential benefit you would otherwise get.

Basically, in the case of queues from the same family, what you are doing is supplying the GPU with some alternative work it can do: work it can use to fill stalls, bubbles, and idle time, with the choice left to the GPU. There is also some potential to use the CPU better (e.g. single-threaded submission vs. one queue per thread).

Using separate transfer queues (or another specialized family) even seems to be the recommended approach.
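As a minimal sketch of what "a separate transfer queue" means in practice, the function below scans a list of queue families for one that supports transfer but neither graphics nor compute. To keep it compilable without the Vulkan headers, `QueueFamily` is a hypothetical, trimmed-down stand-in for `VkQueueFamilyProperties`, and the `QUEUE_*_BIT` constants mirror the values of the corresponding `VK_QUEUE_*_BIT` flags. In a real application the array would come from `vkGetPhysicalDeviceQueueFamilyProperties`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins mirroring the VkQueueFlagBits values, so the sketch
   builds without vulkan.h. */
enum {
    QUEUE_GRAPHICS_BIT = 0x1, /* VK_QUEUE_GRAPHICS_BIT */
    QUEUE_COMPUTE_BIT  = 0x2, /* VK_QUEUE_COMPUTE_BIT  */
    QUEUE_TRANSFER_BIT = 0x4  /* VK_QUEUE_TRANSFER_BIT */
};

/* Hypothetical, trimmed-down mirror of VkQueueFamilyProperties. */
typedef struct {
    uint32_t queueFlags;
    uint32_t queueCount;
} QueueFamily;

#define NO_FAMILY UINT32_MAX

/* Returns the index of the first family that supports transfer but
   neither graphics nor compute, or NO_FAMILY if there is none. */
uint32_t find_dedicated_transfer_family(const QueueFamily *families,
                                        uint32_t count)
{
    assert(families != NULL || count == 0);
    for (uint32_t i = 0; i < count; ++i) {
        uint32_t flags = families[i].queueFlags;
        int has_transfer = (flags & QUEUE_TRANSFER_BIT) != 0;
        int has_gfx_or_compute =
            (flags & (QUEUE_GRAPHICS_BIT | QUEUE_COMPUTE_BIT)) != 0;
        if (families[i].queueCount > 0 && has_transfer && !has_gfx_or_compute)
            return i;
    }
    return NO_FAMILY;
}
```

On the machine described in the question (one family that supports everything, one that only supports transfer), this would pick the second family.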

That is generally speaking. The answers by SW and NB already present a more realistic, empirical, sceptical, and practical view. In reality, one has to be a bit more cautious: those queues target the same resources and share the same limits and other common restrictions, which limits the potential benefit. Notably, if the driver does the wrong thing with multiple queues, it may be very bad for the cache.

AMD's Leveraging asynchronous queues for concurrent execution (2016) discusses a bit how this maps to their HW/driver. It shows the potential benefits of using separate queue families. It says that although they offer two queues of the compute family, they did not observe benefits in apps at that time. It also says that they have only one graphics queue, and explains why.

NVIDIA seems to have a similar idea of "async compute", as shown in Moving to Vulkan: Asynchronous compute.

To be safe, though, it seems we should still stick with only one graphics queue and one async compute queue on current HW. 16 queues seem like a trap and a way to hurt yourself.
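The "one graphics queue, one async compute queue" advice could be sketched roughly as follows. This is only one possible selection policy, not something the answer prescribes: it takes queue 0 of the first graphics-capable family, then prefers a compute-only family for async compute, and otherwise falls back to a second queue of the graphics family when `queueCount` allows it. `QueueFamily`, `QueuePlan`, and `plan_queues` are hypothetical names; the flag values mirror `VK_QUEUE_GRAPHICS_BIT` and `VK_QUEUE_COMPUTE_BIT`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    QUEUE_GRAPHICS_BIT = 0x1, /* VK_QUEUE_GRAPHICS_BIT */
    QUEUE_COMPUTE_BIT  = 0x2  /* VK_QUEUE_COMPUTE_BIT  */
};

/* Hypothetical, trimmed-down mirror of VkQueueFamilyProperties. */
typedef struct {
    uint32_t queueFlags;
    uint32_t queueCount;
} QueueFamily;

#define NO_FAMILY UINT32_MAX

/* Which (family, queue index) pairs to request at device creation. */
typedef struct {
    uint32_t graphicsFamily, graphicsQueue;
    uint32_t computeFamily, computeQueue;
    int computeIsSeparate; /* 0 if compute has to share the graphics queue */
} QueuePlan;

/* Requests at most two queues, no matter how many the device exposes. */
QueuePlan plan_queues(const QueueFamily *families, uint32_t count)
{
    QueuePlan plan = { NO_FAMILY, 0, NO_FAMILY, 0, 0 };
    assert(families != NULL || count == 0);

    /* Graphics: queue 0 of the first graphics-capable family. */
    for (uint32_t i = 0; i < count; ++i) {
        if (families[i].queueCount > 0 &&
            (families[i].queueFlags & QUEUE_GRAPHICS_BIT)) {
            plan.graphicsFamily = i;
            break;
        }
    }

    /* Async compute, first choice: a compute family without graphics. */
    for (uint32_t i = 0; i < count; ++i) {
        uint32_t flags = families[i].queueFlags;
        if (families[i].queueCount > 0 && (flags & QUEUE_COMPUTE_BIT) &&
            !(flags & QUEUE_GRAPHICS_BIT)) {
            plan.computeFamily = i;
            plan.computeIsSeparate = 1;
            return plan;
        }
    }

    /* Fallback: a second queue of the graphics family, if it has one. */
    uint32_t g = plan.graphicsFamily;
    if (g != NO_FAMILY && (families[g].queueFlags & QUEUE_COMPUTE_BIT)) {
        plan.computeFamily = g;
        if (families[g].queueCount >= 2) {
            plan.computeQueue = 1;
            plan.computeIsSeparate = 1;
        }
    }
    return plan;
}
```

With the 16-queue family from the question, this plan would use queues 0 and 1 of that family and leave the other 14 alone. Whether the second queue actually runs concurrently with the first is up to the hardware and driver.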

With transfer queues it is not as simple as it seems either. You should use the dedicated ones for Host->Device transfers, and the non-dedicated ones for Device->Device transfer ops.
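That routing rule can be written down as a small helper. This is an illustrative sketch only: `TransferKind` and `pick_transfer_family` are hypothetical names, and the two family indices are assumed to have been found beforehand (with `NO_FAMILY` meaning the device has no dedicated transfer family).

```c
#include <assert.h>
#include <stdint.h>

#define NO_FAMILY UINT32_MAX

/* Hypothetical classification of a copy by where its data lives. */
typedef enum {
    TRANSFER_HOST_TO_DEVICE,
    TRANSFER_DEVICE_TO_DEVICE
} TransferKind;

/* Host->Device copies go to the dedicated transfer family when one
   exists; Device->Device copies stay on the general-purpose family. */
uint32_t pick_transfer_family(TransferKind kind,
                              uint32_t dedicatedTransferFamily,
                              uint32_t generalFamily)
{
    assert(generalFamily != NO_FAMILY);
    if (kind == TRANSFER_HOST_TO_DEVICE &&
        dedicatedTransferFamily != NO_FAMILY)
        return dedicatedTransferFamily;
    return generalFamily;
}
```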
