Initializing a CUDA global variable


Question

   __constant__ const unsigned int *ff = (const unsigned int[]){90, 50, 100};


int main()
{
}

Compiling:

nvcc ./test.cu
./test.cu(1): error: identifier "__T20" is undefined in device code

1 error detected in the compilation of "/tmp/tmpxft_0000785f_00000000-10_test.cpp2.i".

Verbose compilation:

 nvcc --verbose ./test.cu
    #$ _SPACE_= 
    #$ _CUDART_=cudart
    #$ _HERE_=/usr/lib/nvidia-cuda-toolkit/bin
    #$ _THERE_=/usr/lib/nvidia-cuda-toolkit/bin
    #$ _TARGET_SIZE_=
    #$ _TARGET_DIR_=
    #$ _TARGET_SIZE_=64
    #$ NVVMIR_LIBRARY_DIR=/usr/lib/nvidia-cuda-toolkit/libdevice
    #$ PATH=/usr/lib/nvidia-cuda-toolkit/bin:/home/kasha/bin:/home/kasha/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
    #$ LIBRARIES=  -L/usr/lib/x86_64-linux-gnu/stubs
    #$ gcc -D__CUDA_ARCH__=200 -E -x c++        -DCUDA_DOUBLE_MATH_FUNCTIONS  -D__CUDACC__ -D__NVCC__  -D"__CUDACC_VER__=70517" -D"__CUDACC_VER_BUILD__=17" -D"__CUDACC_VER_MINOR__=5" -D"__CUDACC_VER_MAJOR__=7" -include "cuda_runtime.h" -m64 "./test.cu" > "/tmp/tmpxft_0000799b_00000000-9_test.cpp1.ii" 
    #$ cudafe --allow_managed --m64 --gnu_version=50400 -tused --no_remove_unneeded_entities --gen_c_file_name "/tmp/tmpxft_0000799b_00000000-4_test.cudafe1.c" --stub_file_name "/tmp/tmpxft_0000799b_00000000-4_test.cudafe1.stub.c" --gen_device_file_name "/tmp/tmpxft_0000799b_00000000-4_test.cudafe1.gpu" --nv_arch "compute_20" --gen_module_id_file --module_id_file_name "/tmp/tmpxft_0000799b_00000000-3_test.module_id" --include_file_name "tmpxft_0000799b_00000000-2_test.fatbin.c" "/tmp/tmpxft_0000799b_00000000-9_test.cpp1.ii" 
    #$ gcc -D__CUDA_ARCH__=200 -E -x c        -DCUDA_DOUBLE_MATH_FUNCTIONS  -D__CUDACC__ -D__NVCC__ -D__CUDANVVM__  -D__CUDA_PREC_DIV -D__CUDA_PREC_SQRT -m64 "/tmp/tmpxft_0000799b_00000000-4_test.cudafe1.gpu" > "/tmp/tmpxft_0000799b_00000000-10_test.cpp2.i" 
    #$ cudafe -w --allow_managed --m64 --gnu_version=50400 --c --gen_c_file_name "/tmp/tmpxft_0000799b_00000000-11_test.cudafe2.c" --stub_file_name "/tmp/tmpxft_0000799b_00000000-11_test.cudafe2.stub.c" --gen_device_file_name "/tmp/tmpxft_0000799b_00000000-11_test.cudafe2.gpu" --nv_arch "compute_20" --module_id_file_name "/tmp/tmpxft_0000799b_00000000-3_test.module_id" --include_file_name "tmpxft_0000799b_00000000-2_test.fatbin.c" "/tmp/tmpxft_0000799b_00000000-10_test.cpp2.i" 
    ./test.cu(1): error: identifier "__T20" is undefined in device code

    1 error detected in the compilation of "/tmp/tmpxft_0000799b_00000000-10_test.cpp2.i".
    # --error 0x2 --

During compilation, the CUDA compiler turns the array (const unsigned int[]){90, 50, 100} into a variable named __T20 and declares it static, so it is inaccessible from the main file, where the declaration becomes: __constant__ const unsigned *ff = __T20; How can I initialize a global pointer with an array in CUDA?

Recommended answer

The compiler is telling you exactly what the error is. When you do this:

__constant__ const unsigned int *ff = (const unsigned int[]){90, 50, 100};

you are trying to statically assign the address of an anonymous host array to a device symbol. That makes no sense: even if it compiled, the address assigned to ff would be invalid because it would point into host memory.

To the best of my knowledge, there is no way to declare and use anonymous objects in device memory when initialising statically declared global device symbols.
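If the array contents are only available at run time, one alternative is to set the pointer from host code instead of with a static initialiser. The following is a minimal sketch, not part of the original answer: the kernel name read_ff and the buffers h_values, d_values and d_out are hypothetical, and error handling is reduced to a single helper.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Pointer stored in constant memory; left uninitialised here and set at run time.
__constant__ const unsigned int *ff;

// Hypothetical kernel: each thread reads one element through ff.
__global__ void read_ff(unsigned int *out)
{
    int i = threadIdx.x;
    if (i < 3) out[i] = ff[i];
}

// Stop at the first CUDA runtime error.
static int check(cudaError_t err)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}

int main()
{
    const unsigned int h_values[3] = {90, 50, 100};
    unsigned int h_out[3] = {0, 0, 0};
    unsigned int *d_values = nullptr;
    unsigned int *d_out = nullptr;

    // Put the array in ordinary device (global) memory.
    if (check(cudaMalloc(&d_values, sizeof(h_values)))) return 1;
    if (check(cudaMalloc(&d_out, sizeof(h_out)))) return 1;
    if (check(cudaMemcpy(d_values, h_values, sizeof(h_values),
                         cudaMemcpyHostToDevice))) return 1;

    // Copy the device address itself into the constant-memory pointer.
    if (check(cudaMemcpyToSymbol(ff, &d_values, sizeof(d_values)))) return 1;

    read_ff<<<1, 3>>>(d_out);
    if (check(cudaGetLastError())) return 1;
    if (check(cudaMemcpy(h_out, d_out, sizeof(h_out),
                         cudaMemcpyDeviceToHost))) return 1;

    printf("%u %u %u\n", h_out[0], h_out[1], h_out[2]);

    cudaFree(d_values);
    cudaFree(d_out);
    return 0;
}
```

For values known at compile time, a static initialiser is simpler.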

You could do something like this:

__device__ const unsigned int ee[3] = {90, 50, 100};
__constant__ const unsigned int *ff = &ee[0];


int main()
{
}

so that the static assignment is made with an address which the compiler can explicitly identify as being in device memory. Note that the expected caching properties of constant memory apply only to the pointer value and not to the memory it points to, so I would have thought the use case for what you are trying to do is pretty limited.
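Following on from the caching note, if the goal is for the array data itself to be served from constant memory, it may be simpler to drop the pointer and declare the array as the __constant__ symbol. This is a sketch under that assumption, not part of the original answer; the names ff_values and copy_out are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// The array itself lives in constant memory, so element reads use the constant cache.
__constant__ unsigned int ff_values[3] = {90, 50, 100};

// Hypothetical kernel: copies the constant array to an output buffer.
__global__ void copy_out(unsigned int *out)
{
    int i = threadIdx.x;
    if (i < 3) out[i] = ff_values[i];
}

int main()
{
    unsigned int h_out[3] = {0, 0, 0};
    unsigned int *d_out = nullptr;

    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) return 1;

    copy_out<<<1, 3>>>(d_out);

    // cudaMemcpy waits for the kernel to finish before copying the results back.
    if (cudaMemcpy(h_out, d_out, sizeof(h_out),
                   cudaMemcpyDeviceToHost) != cudaSuccess) return 1;

    printf("%u %u %u\n", h_out[0], h_out[1], h_out[2]);

    cudaFree(d_out);
    return 0;
}
```

The trade-off is that a fixed-size __constant__ array cannot be re-pointed at different data later, whereas the pointer version can.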
