OpenCL: Determine the best local_item_size

Problem description

My code works like 2D matrix multiplication (http://gpgpu-computing4.blogspot.de/2009/09/matrix-multiplication-2-opencl.html). The dimensions of the matrices are 1000*1000, 10000*10000 and 100000*100000.

My hardware is: NVIDIA Corporation GM204 [GeForce GTX 980] (MAX_WORK_GROUP_SIZES: 1024 1024 64).

The question is:

What is the best local_item_size I can use?

size_t local_item_size[2], global_item_size[2];
global_item_size[0] = number_of_points; 
global_item_size[1] = number_of_points; 
local_item_size[0] = 10; 
local_item_size[1] = 10;

Thanks in advance,

Solution

On NVIDIA cards, the total number of work-items in a work-group should be a multiple of 32, so 8*8 = 64 should be OK (the 10*10 = 100 from the question is not a multiple of 32). The global work size must be a multiple of the local work size in each dimension, so it may have to be adjusted as well.
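As a minimal sketch of the adjustment described above, the helper below rounds a global size up to the next multiple of the local size. The name round_up_to_multiple is hypothetical and not part of the OpenCL API; it is plain C arithmetic that would run on the host before the kernel is enqueued.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API call): round `global` up to the
 * next multiple of `local`, so that it divides evenly into work-groups. */
static size_t round_up_to_multiple(size_t global, size_t local)
{
    assert(local > 0);                      /* a zero local size is invalid */
    size_t remainder = global % local;
    return remainder == 0 ? global : global + (local - remainder);
}
```

With this, global_item_size[0] = round_up_to_multiple(number_of_points, local_item_size[0]), and likewise for dimension 1. For number_of_points = 1000 and a local size of 8 the value stays 1000, because 1000 is already a multiple of 8; with a local size of 16 it becomes 1008.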

This may also require a change in the kernel code to handle out-of-range indices, because once the global size is rounded up there may be more work-items than data elements.
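The guard itself is usually a single bounds check at the top of the kernel. The sketch below emulates that check in plain host-side C so it can be run without an OpenCL device. The function count_active_work_items and its parameters are hypothetical, and the kernel-side form shown in the comment assumes the matrix dimension is passed to the kernel as an argument n.

```c
#include <stddef.h>

/* Hypothetical host-side emulation of a guarded 2D launch.
 * Inside an OpenCL kernel the equivalent guard would read:
 *     if (get_global_id(0) >= n || get_global_id(1) >= n) return;
 * Returns how many work-items would pass the guard and do real work. */
static size_t count_active_work_items(size_t global_x, size_t global_y, size_t n)
{
    size_t active = 0;
    for (size_t y = 0; y < global_y; ++y) {
        for (size_t x = 0; x < global_x; ++x) {
            if (x >= n || y >= n)
                continue;   /* padded work-item: out of range, does nothing */
            ++active;       /* in-range work-item: would compute one element */
        }
    }
    return active;
}
```

For a 1000*1000 matrix launched with a padded global size of 1008*1008, exactly 1000000 work-items pass the guard and the remaining padded ones exit immediately.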

Note that if you don't specify the local work-group size (i.e. pass NULL for it), the driver will choose one automatically. It is not guaranteed to pick the best size, but it's worth trying.
