SIMD-8, SIMD-16 or SIMD-32 in OpenCL on GPGPU
Question
I read a couple of questions on SO about this topic (SIMD mode), but I still need some clarification/confirmation of how things work:
Why use SIMD if we have GPGPU?
SIMD intrinsics - are they usable on gpus?
Are the following points correct if I compile the code in SIMD-8 mode? 1) It means 8 instructions from different work items are being executed in parallel.
2) Does it mean all work items are executing the same instruction only?
3) If each work item's code contains only a vload16 load, then float16 operations, then a vstore16 store, will SIMD-8 mode still work? I mean to say: is it true that the GPU is still executing the same instruction (whether vload16, float16 arithmetic, or vstore16) for all 8 work items?
How should I understand this concept?
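For reference, the kind of kernel described in point 3 might look like the following sketch (the kernel name, buffer arguments, and the multiply-by-two operation are assumptions for illustration, not taken from the question):

```c
// OpenCL C kernel sketch of the pattern in point 3: each work item performs
// one wide vload16, some float16 arithmetic, and one wide vstore16.
// Hypothetical example; names and the actual arithmetic are assumptions.
__kernel void scale16(__global const float *in, __global float *out)
{
    size_t gid = get_global_id(0);

    float16 v = vload16(gid, in);  // wide load, one per work item
    v = v * 2.0f;                  // same instruction stream for every work item
    vstore16(v, gid, out);         // wide store, one per work item
}
```

In SIMD-8 mode the compiler would issue each of these (already 16-wide) operations for 8 work items in lockstep, which is exactly what the question is asking about.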
In the past, many OpenCL vendors required the use of vector types to be able to use SIMD. Nowadays OpenCL vendors pack work items into SIMD lanes, so there is no need to use vector types. Whether vector types are preferred can be checked by querying CL_DEVICE_PREFERRED_VECTOR_WIDTH_<CHAR, SHORT, INT, LONG, FLOAT, DOUBLE>.
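A minimal host-side query might look like this (a sketch only: it assumes a single platform and device, and omits error checking):

```c
// Query the preferred float vector width of the first OpenCL device.
// Sketch under assumptions: one platform, one device, no error handling.
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platform;
    cl_device_id device;
    cl_uint width;

    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 1, &device, NULL);
    clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT,
                    sizeof(width), &width, NULL);

    /* A width of 1 suggests the compiler packs scalar work items into
       SIMD lanes itself; a larger width suggests vector types may help. */
    printf("CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT = %u\n", width);
    return 0;
}
```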
On Intel, if vector types are used, the vectorizer first scalarizes them and then re-vectorizes to make use of the wide instruction set. This is probably similar on other platforms.