Concurrent kernel execution


Question

Is it possible to launch kernels from different threads of a (host) application and have them run concurrently on the same GPGPU device? If not, do you know of any plans (of Nvidia) to provide this capability in the future?

Recommended answer

From the programming guide:

3.2.7.3 Concurrent Kernel Execution
Some devices of compute capability 2.0 can execute multiple kernels concurrently. Applications may query this capability by calling cudaGetDeviceProperties() and checking the concurrentKernels property. The maximum number of kernel launches that a device can execute concurrently is sixteen.
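The query the guide describes could look roughly like the following. This is a minimal sketch, not code from the guide: the helper name `supportsConcurrentKernels` is made up for illustration, and using device 0 is an assumption.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the CUDA API).
// Returns 1 if the device reports concurrent kernel support, 0 if it does not,
// and -1 if the query itself fails (for example, no CUDA device is present).
int supportsConcurrentKernels(int device) {
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return -1;
    }
    std::printf("Device %d (%s), compute capability %d.%d\n",
                device, prop.name, prop.major, prop.minor);
    return prop.concurrentKernels ? 1 : 0;
}

int main() {
    int result = supportsConcurrentKernels(0);  // device 0 is an assumption
    if (result < 0) return 1;
    std::printf("concurrentKernels: %s\n", result ? "yes" : "no");
    return 0;
}
```

Build with `nvcc`. A result of "no" means kernel launches on that device will be serialized regardless of how the host issues them.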

So the answer is: it depends, and it depends only on the device. Which host thread issues a launch makes no difference. If the device does not support concurrent kernel execution, kernel launches are serialized even when they are issued concurrently. If it does, kernels launched one after another on different streams can execute concurrently.
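To see the effect of streams in practice, something like the sketch below may help. It launches the same busy-wait kernel on one stream and then on four streams from a single host thread and compares wall-clock time. The names `spin`, `runOnStreams`, and `kMaxStreams`, the cycle count, and the stream counts are all illustrative assumptions, not anything from the answer or the guide.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

const int kMaxStreams = 16;  // illustrative cap for this sketch

// Hypothetical busy-wait kernel: spins until about `cycles` GPU clock ticks pass.
__global__ void spin(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

// Launches one spin kernel per stream, then waits for all of them.
// Returns false on invalid input or on any CUDA error.
bool runOnStreams(int nStreams, long long cycles) {
    if (nStreams < 1 || nStreams > kMaxStreams) return false;

    cudaStream_t streams[kMaxStreams];
    int created = 0;
    bool ok = true;
    for (; created < nStreams; ++created) {
        if (cudaStreamCreate(&streams[created]) != cudaSuccess) { ok = false; break; }
    }
    if (ok) {
        // All launches come from this one host thread, each on its own stream.
        for (int i = 0; i < nStreams; ++i) {
            spin<<<1, 1, 0, streams[i]>>>(cycles);
        }
        ok = (cudaGetLastError() == cudaSuccess) &&
             (cudaDeviceSynchronize() == cudaSuccess);
    }
    for (int i = 0; i < created; ++i) cudaStreamDestroy(streams[i]);
    return ok;
}

// Wall-clock time in milliseconds for one call to runOnStreams.
double timeRun(int nStreams, long long cycles, bool* ok) {
    auto t0 = std::chrono::steady_clock::now();
    *ok = runOnStreams(nStreams, cycles);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main() {
    const long long cycles = 100000000LL;  // arbitrary spin length
    cudaFree(0);                           // warm up: force context creation

    bool ok1 = false, ok4 = false;
    double one  = timeRun(1, cycles, &ok1);
    double four = timeRun(4, cycles, &ok4);
    if (!ok1 || !ok4) {
        std::fprintf(stderr, "CUDA error or no usable device\n");
        return 1;
    }
    std::printf("1 stream:  %.1f ms\n4 streams: %.1f ms\n", one, four);
    return 0;
}
```

On a device that runs kernels concurrently, the four-stream time would be expected to stay close to the one-stream time; on a device that serializes them, it should be closer to four times as long. The measurement is rough, since it includes stream creation and launch overhead.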
