Concurrent, unique kernels on the same multiprocessor?

Question

Is it possible, using streams, to have multiple unique kernels on the same streaming multiprocessor in Kepler 3.5 GPUs? I.e. run 30 kernels of size <<<1,1024>>> at the same time on a Kepler GPU with 15 SMs?

Recommended answer

On a compute capability 3.5 device, it might be possible.

Those devices support up to 32 concurrent kernels per GPU and 2048 resident threads per multiprocessor. With 64K registers per multiprocessor, two blocks of 1024 threads can be resident on the same multiprocessor at once if each thread uses at most 32 registers (65,536 / 2,048) and each block uses at most 24 KB of shared memory (half of the 48 KB maximum).
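The resource arithmetic behind that estimate can be sketched in a few lines of host-side C++ (which also compiles as CUDA host code). This is only a rough illustration: the `SmLimits` struct, the `kCc35` constant, and the `blocks_per_sm` helper are hypothetical names, the limits are hard-coded from the figures quoted in this answer, and the calculation ignores register and shared-memory allocation granularity as well as the cap on resident blocks per multiprocessor.

```cpp
#include <algorithm>
#include <cassert>

// Per-multiprocessor limits. The values below are assumptions hard-coded
// from the figures quoted in the answer for compute capability 3.5.
struct SmLimits {
    int max_threads;       // resident threads per multiprocessor
    int registers;         // 32-bit registers per multiprocessor
    int shared_mem_bytes;  // shared memory per multiprocessor
};

const SmLimits kCc35{2048, 64 * 1024, 48 * 1024};

// Hypothetical helper: estimate how many blocks of the given shape can be
// resident on one multiprocessor. Each resource gives its own upper bound,
// and the tightest bound wins.
int blocks_per_sm(const SmLimits& sm, int threads_per_block,
                  int regs_per_thread, int smem_per_block) {
    assert(threads_per_block > 0 && regs_per_thread > 0);
    int by_threads = sm.max_threads / threads_per_block;
    int by_regs = sm.registers / (regs_per_thread * threads_per_block);
    // A block that uses no shared memory is not limited by it.
    int by_smem = smem_per_block > 0 ? sm.shared_mem_bytes / smem_per_block
                                     : by_threads;
    return std::min({by_threads, by_regs, by_smem});
}
```

Under these assumptions, `blocks_per_sm(kCc35, 1024, 32, 24 * 1024)` gives 2, while raising the footprint to 33 registers per thread or 25 KB of shared memory per block drops the result to 1. Fitting on the multiprocessor is only a necessary condition: whether two kernels actually share one still depends on how the hardware scheduler places their blocks.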

You can find all of this in the hardware description in the appendices of the CUDA programming guide.
