OpenCL - How do I query for a device's SIMD width?
Question
In CUDA, there is the concept of a warp, which is defined as the maximum number of threads that can execute the same instruction simultaneously within a single processing element. For NVIDIA, the warp size is 32 for all of their cards currently on the market.
ATI cards have a similar concept, but there the term is wavefront. After some hunting around, I found out that the ATI card I have has a wavefront size of 64.
My question is: how can I query for this SIMD width at runtime in OpenCL?
Recommended answer
I found the answer I was looking for. It turns out that in OpenCL you don't query the device for this information; you query the kernel object. My source is:
http://www.hpc.lsu.edu/training/tutorials/sc10/tutorials/SC10Tutorials/docs/M13/M13.pdf
(page 108)
which says:
The most efficient work group sizes are likely to be multiples of the native hardware execution width
- wavefront size in AMD speak / warp size in Nvidia speak
- query device for CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE
So, in short, the answer appears to be to call clGetKernelWorkGroupInfo() with a param_name of CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE. See this link for more information on this function:
http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetKernelWorkGroupInfo.html