OpenCL: basic questions about SIMT execution model

Question

Some of the concepts and designs of the "SIMT" architecture are still unclear to me.

From what I've seen and read, diverging code paths, and if() in general, are a rather bad idea because many threads may execute in lockstep. What exactly does that mean? What about something like:

kernel void foo(..., int flag)
{
    if (flag)
        DO_STUFF
    else
        DO_SOMETHING_ELSE
}

The parameter "flag" is the same for all work-items, so every work-item takes the same branch. Is the GPU going to execute all of the code anyway, serializing everything and effectively still running the branch that is not taken? Or is it a bit more clever, executing only the taken branch as long as all threads agree on it? That would always be the case here.

I.e. does serialization ALWAYS happen or only if needed? Sorry for the stupid question. ;)

Recommended answer

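No, it doesn't always happen. Both branches are executed only if the condition is not coherent between the threads that run in lockstep, that is, if it evaluates to different values for work-items in the same hardware execution group (a warp on NVIDIA or a wavefront on AMD, which is a subset of the local work-group). In that case current-generation GPUs step through both branches, but each work-item only writes values and has side effects in the branch it actually took. In your example the flag is the same for every work-item, so only the taken branch is executed.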

因此,保持一致性对于GPU分支的性能至关重要.

So, maintaining coherency is vital to performance in GPU branches.
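
To make the behaviour above more concrete, here is a toy model in plain C (host code, not OpenCL, and not a description of how any particular GPU is built). It assumes a hypothetical lockstep group of 8 lanes and counts how many branch bodies the group has to step through. The names `GROUP_SIZE`, `GroupResult` and `run_group`, and the two stand-in branch bodies, are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lockstep width; real hardware commonly uses 32 or 64. */
#define GROUP_SIZE 8

typedef struct {
    int passes_issued;   /* branch bodies the whole group stepped through */
    int out[GROUP_SIZE]; /* per-lane result */
} GroupResult;

/* Toy model of one lockstep group evaluating
 *     if (flag) out = in * 2; else out = in + 100;
 * A branch body is issued only if at least one lane needs it, and a lane
 * commits results only in the body it actually selected. */
GroupResult run_group(const int flags[GROUP_SIZE], const int in[GROUP_SIZE])
{
    GroupResult r = {0};
    const uint32_t all_lanes = (1u << GROUP_SIZE) - 1u;
    uint32_t then_mask = 0;

    /* Build the execution mask: one bit per lane that takes the "then" side. */
    for (int lane = 0; lane < GROUP_SIZE; ++lane)
        if (flags[lane])
            then_mask |= 1u << lane;

    const uint32_t else_mask = all_lanes & ~then_mask;
    assert((then_mask | else_mask) == all_lanes); /* every lane picks one side */

    if (then_mask != 0) {            /* skipped entirely if no lane wants it */
        r.passes_issued++;
        for (int lane = 0; lane < GROUP_SIZE; ++lane)
            if (then_mask & (1u << lane))
                r.out[lane] = in[lane] * 2;   /* stand-in for DO_STUFF */
    }
    if (else_mask != 0) {            /* skipped entirely if no lane wants it */
        r.passes_issued++;
        for (int lane = 0; lane < GROUP_SIZE; ++lane)
            if (else_mask & (1u << lane))
                r.out[lane] = in[lane] + 100; /* stand-in for DO_SOMETHING_ELSE */
    }
    return r;
}
```

In this model, calling `run_group` with the same flag in every lane yields `passes_issued == 1`, which corresponds to the kernel in the question. Mixing flags within the group yields `passes_issued == 2`, which is the serialized case the answer warns about.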
