Why does defining class headers without the CUDA __device__ attribute work? (C++)
Problem description
I have a .h file with the following declarations:
class Foo{
public:
inline int getInt();
};
And my .cu file defines the following:
__device__ int Foo::getInt(){
return 42;
}
This is pretty awesome: although I cannot actually call getInt from the host, I can include the .h file in .cpp files, so the type declaration is visible to the host. But it doesn't seem to me that this should work, so why don't I need to put the __device__ attribute in the .h file?
Recommended answer
If it works, it should not. It is probably a bug in the CUDA compiler and it might get fixed in the future, so do not rely on it.
However, if you want the class to be visible to the host (and to non-CUDA compilers), but it has some __device__ functionality that you don't need on the host, you can always wrap those functions in #ifdef __CUDACC__ ... #endif. The __CUDACC__ macro is predefined when nvcc compiles CUDA source files (such as .cu files) and is undefined otherwise. So you can write something like this in your header:
class Foo{
public:
#ifdef __CUDACC__
inline __device__ int getInt();
#endif
};
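To make the effect of the guard concrete, here is a minimal sketch of what a plain host compiler sees when it includes such a header. The member hostValue() and the helpers useFooOnHost() and checkFooOnHost() are hypothetical names added purely for illustration; they are not part of the original class.

```cpp
// foo.h (hypothetical file name) as seen by a plain host compiler such as g++.
#include <cassert>

class Foo {
public:
    // Hypothetical host-side member, added only so the host has something to call.
    int hostValue() const { return 7; }

#ifdef __CUDACC__
    // Visible only when nvcc compiles a CUDA source file; other compilers skip it.
    __device__ int getInt();
#endif
};

// Host code can create and use Foo without ever seeing a CUDA keyword.
inline int useFooOnHost() {
    Foo f;
    return f.hostValue();
}

// Hypothetical self-check for the host-visible part of the class.
inline void checkFooOnHost() {
    assert(useFooOnHost() == 7);
}
```

Note that the guard here hides only a member function, so the object layout stays the same for both compilers. Guarding data members in the same way would make the layouts differ and should be avoided.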
If you are worried about having too many preprocessor #ifdefs, you can also use the following trick:
#ifdef __CUDACC__
#define HOST __host__
#define DEVICE __device__
#else
#define HOST
#define DEVICE
#endif
...
class Foo{
public:
inline HOST DEVICE int getInt();
};
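As a rough illustration of why the macro trick is convenient, the sketch below compiles the same class with an ordinary C++ compiler, where HOST and DEVICE expand to nothing. It assumes the function body is written inside the class (the snippet above only declares it), and getIntOnHost() is a hypothetical helper name.

```cpp
#include <cassert>

// Expand to CUDA qualifiers under nvcc and to nothing elsewhere.
#ifdef __CUDACC__
#define HOST __host__
#define DEVICE __device__
#else
#define HOST
#define DEVICE
#endif

class Foo {
public:
    // Defined in the class body (an assumption of this sketch) so that every
    // translation unit that includes the header also sees the definition.
    HOST DEVICE int getInt() const { return 42; }
};

// Hypothetical host-side caller: with a plain C++ compiler this is an
// ordinary member call, because both macros are empty.
inline int getIntOnHost() {
    return Foo().getInt();
}

// Hypothetical self-check for the host path.
inline void checkGetIntOnHost() {
    assert(getIntOnHost() == 42);
}
```

Under nvcc the same member would be compiled for both host and device, so the single definition could also be called from a kernel.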