CUDA: calling library function in kernel


Question

I know that only __device__ functions can be called from a kernel. This prevents me from calling standard functions such as strcmp() in the kernel.
At this point I cannot understand or find the reason for this restriction. Couldn't the compiler simply follow the includes in string.h and so on, and inline the calls to strcmp() into the kernel? I suspect the reason is simple and I am just missing something.
Is reimplementing every function and data type I need in the kernel the only way? Is there a codebase with such reimplementations?

Recommended answer

Yes, the only way to use stdlib functions from a kernel is to reimplement them. But I strongly advise you to reconsider this idea, since it is highly unlikely that you need to run code that uses strcmp() on the GPU. Please add more details about your problem so that a better solution can be proposed (I highly doubt that serial string comparison on the GPU is what you really need).

It is hardly possible to simply recompile the whole stdlib for the GPU, since it depends heavily on system calls (such as memory allocation) that cannot be used on the GPU. (In recent versions of the CUDA toolkit you can allocate device memory from a kernel, but that is not the "CUDA way", is supported only by the newest hardware, and is very bad for performance.) Besides, the CPU versions of most functions are far from "good" for GPUs. So in the vast majority of cases, compiling ordinary CPU functions for the GPU would do no good, and the compiler doesn't even try.
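
To make "reimplement them" concrete, here is a minimal sketch of what a hand-written replacement for strcmp() might look like. The names `device_strcmp` and `HOST_DEVICE` are hypothetical and do not come from the CUDA toolkit or the original answer. The macro is an assumption that lets the same source compile under nvcc (where `__CUDACC__` is defined) and under an ordinary C++ compiler for host-side testing.

```cpp
// Expands to the CUDA execution-space qualifiers under nvcc and to nothing
// under a plain C++ compiler, so the function can also be tested on the host.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper (not part of CUDA or the C library): a byte-wise
// string comparison that follows the return convention of strcmp().
HOST_DEVICE inline int device_strcmp(const char* a, const char* b)
{
    // Advance while both strings agree and the first has not ended.
    while (*a != '\0' && *a == *b) {
        ++a;
        ++b;
    }
    // Compare the first differing bytes as unsigned char, as strcmp() does.
    return static_cast<int>(static_cast<unsigned char>(*a)) -
           static_cast<int>(static_cast<unsigned char>(*b));
}
```

A kernel could call `device_strcmp` on strings that already reside in device memory. As the answer cautions, a serial byte-by-byte loop per thread is rarely a good fit for the GPU, so this is an illustration of the mechanism rather than a recommended design.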
