OpenCL computation freezes the screen


Problem description

As the title says, when I run my OpenCL kernel the entire screen stops redrawing (the image displayed on the monitor remains the same until my program is done with its calculations; this is true even if I unplug the monitor from my notebook and plug it back in - always the same image is displayed), and the computer does not seem to react to mouse moves either - the cursor stays in the same position.

I am not sure why this happens. Could it be a bug in my program, or is this standard behaviour?

While searching on Google I found this thread on AMD's forum, where some people suggested it is normal, as the GPU cannot refresh the screen while it is busy with computations.

If this is true, is there any way to work around it?

My kernel computation can take up to several minutes, and having my computer practically unusable for that whole time is really painful.

Here is my current setup:

  • graphics card is an ATI Mobility Radeon HD 5650 with 512 MB of memory and the latest Catalyst beta driver from AMD's website
  • the graphics are switchable - Intel integrated/ATI dedicated card - but I have disabled switching in the BIOS, because otherwise I could not get the driver working on Ubuntu
  • the operating system is Ubuntu 12.10 (64-bit), but this happens on Windows 7 (64-bit) as well
  • my monitor is plugged in via HDMI (but the notebook screen freezes too, so this should not be the problem)

So after a day of playing with my code, I took the advice from your responses and changed my algorithm to something like this (in pseudo code):

for (cl_ulong offset = 0; offset < num_chunks * chunk_size; offset += chunk_size)
{
  /* set the kernel arguments that differ for each chunk (e.g. the offset) */
  clSetKernelArg(/* ... */);

  /* schedule the kernel for this chunk */
  clEnqueueNDRangeKernel(cmd_queue, kernel, 1, NULL, &global_work_size, NULL, 0, NULL, NULL);

  /* blocking read: copy this chunk's results into the host output array
     at the right offset (note: the size argument is in bytes) */
  clEnqueueReadBuffer(cmd_queue, of_buf, CL_TRUE, 0, chunk_size * sizeof(*output), output + offset, 0, NULL, NULL);
}

So now I split the whole workload on the host and send it to the GPU in chunks. For each chunk of data I enqueue a new kernel, and the results I get from it are appended to the output array at the correct offset.

Is this how you meant the calculation should be divided?

This seems to remedy the freeze problem, and as a bonus I am now able to process data much larger than the available GPU memory. I still have to make some careful performance measurements to find a good chunk size...

Answer

Whenever a GPU is running an OpenCL kernel, it is completely dedicated to OpenCL. Some modern Nvidia GPUs are an exception - I think from the GeForce GTX 500 series onwards - which can run multiple kernels concurrently, provided those kernels do not use all available compute units.

Your options are to divide your calculation into multiple short kernel calls - the best all-round solution, since it works even on single-GPU machines - or to invest in a cheap second GPU to drive your display.

If you are going to run long kernels on your GPUs, then you must either disable timeout detection and recovery for the GPU or make the timeout delay longer than the maximum kernel runtime (the latter is better, as bugs can still be caught); see here for how to do this.
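On Windows, for example, the timeout is governed by the TDR (Timeout Detection and Recovery) settings under the `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers` registry key. A sketch of raising the delay from the default 2 seconds (the value is in seconds, a reboot is required, and registry edits are at your own risk):

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers]
; raise the GPU timeout from the default 2 seconds to 60 seconds (0x3c)
"TdrDelay"=dword:0000003c
```

Keeping a finite (if generous) delay is preferable to disabling TDR entirely, since a genuinely hung kernel will still be recovered.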
