How to run "host" functions on GPU with CUDA?

Question

I want to run a function such as strcmp on the GPU, but I get:

error: calling a host function("strcmp") from a __device__/__global__ function("myKernel") is not allowed

I can understand that printf might not work because the GPU has no stdout, but I would expect functions like strcmp to work! So should I copy the library's implementation of strcmp into my code and add a __device__ prefix, or what?

Recommended answer

CUDA has a standard library, documented in the CUDA programming guide. It includes printf() for devices that support it (Compute Capability 2.0 and higher), as well as assert(). It does not include a complete string or stdio library at this point, however.
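As a minimal sketch of what is supported, the kernel below calls printf() and assert() from device code. The names helloKernel and runHello are hypothetical, made up for this illustration, and the sketch assumes a device of Compute Capability 2.0 or higher.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread checks a trivially true condition
// with the device-side assert() and prints its global index.
__global__ void helloKernel(int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        assert(i >= 0);                       // device-side assert
        printf("hello from thread %d\n", i);  // device-side printf
    }
}

// Hypothetical host helper: launches the kernel and waits for it.
// Device printf output is flushed at synchronization points such as
// cudaDeviceSynchronize().
cudaError_t runHello(int n)
{
    if (n <= 0) return cudaSuccess;
    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    helloKernel<<<blocks, threads>>>(n);
    cudaError_t err = cudaGetLastError();     // launch errors
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();           // execution errors
}
```

Calling runHello(4) from main() should print four lines, in no guaranteed order.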

Implementing your own standard library as Jason R. Mick suggests may be possible, but it is not necessarily advisable. In some cases, it may be unsafe to naively port functions from the sequential standard library to CUDA -- not least because some of these implementations are not meant to be thread safe (rand() on Windows, for example). Even if it is safe, it might not be efficient -- and it might not really be what you need.
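For a function as simple as strcmp, which touches no global state, a hand-written device version is not hard to write. The sketch below is one possible way to do it, not a recommendation: deviceStrcmp, compareKernel and comparePairs are hypothetical names, and it assumes the strings are packed into fixed-width, NUL-terminated slots of `stride` bytes so that each thread can handle one pair.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hand-written equivalent of strcmp, callable from host and device.
__host__ __device__ int deviceStrcmp(const char *a, const char *b)
{
    while (*a && *a == *b) { ++a; ++b; }
    return (int)(unsigned char)*a - (int)(unsigned char)*b;
}

// One thread per string pair. Pair i lives at offset i * stride.
__global__ void compareKernel(const char *lhs, const char *rhs,
                              int stride, int count, int *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count)
        out[i] = deviceStrcmp(lhs + (size_t)i * stride,
                              rhs + (size_t)i * stride);
}

// Hypothetical host helper: copies the packed strings to the device,
// runs the kernel and copies the per-pair results back into out.
cudaError_t comparePairs(const char *lhs, const char *rhs,
                         int stride, int count, int *out)
{
    if (count <= 0) return cudaSuccess;
    char *dLhs = nullptr, *dRhs = nullptr;
    int *dOut = nullptr;
    size_t bytes = (size_t)stride * count;
    cudaError_t err = cudaMalloc(&dLhs, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&dRhs, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&dOut, count * sizeof(int));
    if (err == cudaSuccess)
        err = cudaMemcpy(dLhs, lhs, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(dRhs, rhs, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 128;
        int blocks = (count + threads - 1) / threads;
        compareKernel<<<blocks, threads>>>(dLhs, dRhs, stride, count, dOut);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)  // this blocking copy also waits for the kernel
        err = cudaMemcpy(out, dOut, count * sizeof(int),
                         cudaMemcpyDeviceToHost);
    cudaFree(dLhs);
    cudaFree(dRhs);
    cudaFree(dOut);
    return err;
}
```

Note that threads in the same warp may exit the comparison loop at different lengths, so this layout is simple rather than fast; whether it is efficient enough depends on the data.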

In my opinion, you are better off avoiding standard library functions in CUDA that are not officially supported. If you need the behavior of a standard library function in your parallel code, first consider whether you really need it:
* Are you really going to do thousands of strcmp operations in parallel?
* If not, do you have strings to compare that are many thousands of characters long? If so, consider a parallel string comparison algorithm instead.
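For the second case, one possible shape of a parallel comparison is sketched below: all threads scan two long buffers together, and the smallest mismatching index is found with atomicMin. This is only an illustration under stated assumptions. The names firstMismatchKernel and parallelCompare are hypothetical, both buffers are assumed to have the same known length n (as with memcmp, not NUL-terminated scanning), and n is assumed to be well below the unsigned int limit.

```cuda
#include <cuda_runtime.h>

// Grid-stride scan: each thread records the first mismatch it sees.
// atomicMin keeps the smallest index over all threads.
__global__ void firstMismatchKernel(const char *a, const char *b,
                                    unsigned int n, unsigned int *first)
{
    unsigned int stride = gridDim.x * blockDim.x;
    for (unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
         i < n; i += stride) {
        if (a[i] != b[i]) {
            atomicMin(first, i);
            break;  // this thread's later indices are all larger
        }
    }
}

// Hypothetical host helper: writes 0 to *result if the buffers are
// equal, otherwise the signed difference at the first mismatch.
cudaError_t parallelCompare(const char *a, const char *b,
                            unsigned int n, int *result)
{
    *result = 0;
    if (n == 0) return cudaSuccess;
    char *dA = nullptr, *dB = nullptr;
    unsigned int *dFirst = nullptr;
    unsigned int first = n;  // n means "no mismatch found"
    cudaError_t err = cudaMalloc(&dA, n);
    if (err == cudaSuccess) err = cudaMalloc(&dB, n);
    if (err == cudaSuccess) err = cudaMalloc(&dFirst, sizeof(first));
    if (err == cudaSuccess)
        err = cudaMemcpy(dA, a, n, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(dB, b, n, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(dFirst, &first, sizeof(first),
                         cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;
        int blocks = (int)((n + threads - 1) / threads);
        if (blocks > 1024) blocks = 1024;  // grid-stride loop covers the rest
        firstMismatchKernel<<<blocks, threads>>>(dA, dB, n, dFirst);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)  // blocking copy, also waits for the kernel
        err = cudaMemcpy(&first, dFirst, sizeof(first),
                         cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dFirst);
    if (err == cudaSuccess && first < n)
        *result = (int)(unsigned char)a[first] - (int)(unsigned char)b[first];
    return err;
}
```

For short strings the transfer and launch overhead will dominate, so an approach like this only has a chance of paying off when the data is long and already resident on the device.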

If you determine that you really do need the behavior of the standard library function in your parallel CUDA code, then consider how you might implement it (safely and efficiently) in parallel.
