Are OpenCL work items executed in parallel?


Question

I know that work items are grouped into work groups, and that you cannot synchronize outside of a work group.

Does this mean that work items are executed in parallel?

If so, is it possible, and is it efficient, to make one work group with 128 work items?

Recommended answer

The work items within a group are scheduled together and may run together. It is up to the hardware and/or the driver to decide how parallel the execution actually is. There are several reasons for this, but one very good one is to hide memory latency.

On my AMD card, each 'compute unit' is divided into 16 4-wide SIMD units. This means that 16 work items from a group can technically run at the same time. The recommendation is to use a multiple of 64 work items per group in order to hide memory latency. Clearly they cannot all run at exactly the same time. This is not a problem, because most kernels are in fact memory bound: the hardware scheduler swaps out the work items that are waiting on the memory controller, while the 'ready' items get their compute time. The actual number of work items in a group is set by the host program and is limited by CL_DEVICE_MAX_WORK_GROUP_SIZE. You will need to experiment to find the optimal work group size for your kernel.
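To make the "experiment with the work group size" advice concrete, here is a minimal sketch in plain C of the host-side arithmetic involved. It assumes you have already read the device limit with clGetDeviceInfo and CL_DEVICE_MAX_WORK_GROUP_SIZE; the helper names candidate_local_sizes and pad_global_size are hypothetical, and the multiple of 64 is simply the value recommended above for that AMD card, so other devices may prefer something else.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: list the work group sizes worth benchmarking,
   i.e. every multiple of `multiple` that does not exceed `device_max`.
   `device_max` is assumed to come from clGetDeviceInfo with
   CL_DEVICE_MAX_WORK_GROUP_SIZE. Returns how many sizes were written. */
size_t candidate_local_sizes(size_t device_max, size_t multiple,
                             size_t *out, size_t capacity)
{
    assert(multiple > 0 && out != NULL);
    size_t count = 0;
    for (size_t size = multiple;
         size <= device_max && count < capacity;
         size += multiple) {
        out[count++] = size;
    }
    return count;
}

/* Hypothetical helper: round the number of items up to a whole number
   of work groups, so the global size is divisible by the local size. */
size_t pad_global_size(size_t n_items, size_t local_size)
{
    assert(local_size > 0);
    return ((n_items + local_size - 1) / local_size) * local_size;
}
```

With a device limit of 256, the candidates would be 64, 128, 192 and 256, so the 128 work items asked about in the question would be one of the sizes to time. If the global size is padded this way, the kernel needs a bounds check so that the extra work items do nothing.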

The CPU implementation is 'worse' when it comes to simultaneous work items. There are only ever as many work items running as there are cores available to run them on, so work items behave more sequentially on a CPU.

So do work items run at exactly the same time? Almost never. This is why we need to use barriers when we want to be sure that they all pause at a given point.
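The need for barriers can be illustrated with a small simulation. The sketch below is plain C, not OpenCL: it assumes the most sequential case, where the work items of one group run one after another, and it uses a hypothetical group size of 8. Each work item stores a value in a shared buffer and then reads the value stored by a different work item. In a real kernel the barrier would be a call such as barrier(CLK_LOCAL_MEM_FENCE) between the two phases; here it is imitated by finishing the first loop before starting the second.

```c
#include <stddef.h>

#define GROUP_SIZE 8  /* hypothetical work group size for this sketch */

/* Phase 1: each work item stores its own element in shared memory. */
static void phase_store(size_t lid, const int *in, int *shared)
{
    shared[lid] = in[lid];
}

/* Phase 2: each work item reads an element stored by another work item. */
static void phase_load(size_t lid, const int *shared, int *out)
{
    out[lid] = shared[GROUP_SIZE - 1 - lid];
}

/* No barrier: every work item runs both phases before the next one starts,
   so early items read slots that have not been written yet. */
void run_without_barrier(const int *in, int *out)
{
    int shared[GROUP_SIZE] = {0};
    for (size_t lid = 0; lid < GROUP_SIZE; ++lid) {
        phase_store(lid, in, shared);
        phase_load(lid, shared, out);
    }
}

/* With a barrier: all work items finish phase 1 before any starts phase 2. */
void run_with_barrier(const int *in, int *out)
{
    int shared[GROUP_SIZE] = {0};
    for (size_t lid = 0; lid < GROUP_SIZE; ++lid)
        phase_store(lid, in, shared);
    /* In a kernel, the barrier call would sit here. */
    for (size_t lid = 0; lid < GROUP_SIZE; ++lid)
        phase_load(lid, shared, out);
}
```

Only the version with the barrier reverses the input correctly. Without it, the first work items read slots that are still zero in this simulation; on a real device the result would depend on how the hardware happens to interleave the work items.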
