Use CMake to configure a CUDA project for VS2013 and get an "invalid device function" error
Problem description
I use the CMake GUI tool to configure my CUDA project in VS2013.
CMakeLists.txt is as below:
    project(CUDA_PART)
    # required cmake version
    cmake_minimum_required(VERSION 3.0)
    include_directories(${CUDA_PART_SOURCE_DIR}/common)
    # packages
    find_package(CUDA REQUIRED)
    # nvcc flags
    set(CUDA_NVCC_FLAGS -gencode arch=compute_20,code=sm_20;-G;-g)
    set(CUDA_VERBOSE_BUILD ON)
    #FILE(GLOB SOURCES "*.cu" "*.cpp" "*.c" "*.h")
    CUDA_ADD_EXECUTABLE(CUDA_PART hist_gpu_shmem_atomics.cu)
The .cu file is hist_gpu_shmem_atomics.cu from the CUDA by Example source code.
There are two problems:
1. After the line
    histo_kernel<<<blocks * 2, 256>>>(dev_buffer, SIZE, dev_histo);
an "invalid device function" error occurs.
2. When I use the CUDA debugging tool, it cannot trigger breakpoints in the device code.
But when I create a project with the same code from the CUDA project template in Visual Studio 2013, it works correctly. So, is there something wrong in the CMakeLists.txt?
OS: Win7 64-bit; GPU: GTX960; CUDA: CUDA 7.5; VS: 2013 (and also 2010)
When I set "Code Generation" in VS2013, CUDA_NVCC_FLAGS turns out to be
    -gencode=arch=compute_20,code=\"sm_20,compute_20\"
which is equivalent to:
    -gencode=arch=compute_20,code=sm_20 \
    -gencode=arch=compute_20,code=compute_20
So, I guess it generates two versions of the device code: the first (SASS) for a virtual and a real architecture, and the second (PTX) for the virtual architecture only. Since my GTX960 is a cc 5.2 device, it chooses the second one (PTX) and converts it to suitable SASS.
Answer
This is a problem:
    set(CUDA_NVCC_FLAGS -gencode arch=compute_20,code=sm_20;-G;-g)
Those flags cause nvcc to generate SASS code only, and only for a cc 2.0 device. Such cc 2.0 SASS code will not run on your cc 5.2 device (GTX960). "Invalid device function" is exactly the error you would get when trying to launch a kernel in such a scenario. Since the kernel never launches, trying to hit breakpoints in device code won't work.
I'm not a CMake expert, so there might be other, more sensible approaches, but one possible way to try to fix this might be:
    set(CUDA_NVCC_FLAGS -gencode arch=compute_52,code=sm_52;-G;-g)
which should generate code for your cc 5.2 device. There are undoubtedly other possible settings here; you may want to read this or the nvcc manual for more background on compile options that target specific devices.
Also note that
    -G
generates device debug code, which is fine if that is what you want. However, it will generally run slower than code compiled without that switch. If you want to debug, that switch is necessary.
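As one of the "other possible settings" mentioned above, here is a minimal sketch of a CMakeLists.txt that mirrors what the Visual Studio project template does: it embeds both SASS for the actual GPU and PTX as a fallback. It is an assumption-laden illustration, not the asker's or answerer's file. It assumes the legacy FindCUDA module used in the question, a cc 5.2 target, and that restricting -G to the Debug configuration via CUDA_NVCC_FLAGS_DEBUG is acceptable for this project.

```cmake
# Sketch only: assumes the legacy FindCUDA module, as in the question.
cmake_minimum_required(VERSION 3.0)
project(CUDA_PART)

find_package(CUDA REQUIRED)
include_directories(${CUDA_PART_SOURCE_DIR}/common)

# SASS for cc 5.2 (runs directly on the GTX960) plus PTX for compute_52,
# which the driver can JIT-compile for GPUs newer than cc 5.2.
set(CUDA_NVCC_FLAGS
    -gencode=arch=compute_52,code=sm_52
    -gencode=arch=compute_52,code=compute_52)

# Device debug code (-G) only in the Debug configuration, since it runs slower.
set(CUDA_NVCC_FLAGS_DEBUG -G -g)

set(CUDA_VERBOSE_BUILD ON)
CUDA_ADD_EXECUTABLE(CUDA_PART hist_gpu_shmem_atomics.cu)
```

Two choices here differ from the question's file: cmake_minimum_required is placed before project(), which is the usual ordering, and the debug switches are moved out of CUDA_NVCC_FLAGS so that Release builds are not slowed down by -G. With CUDA_VERBOSE_BUILD ON, the build output should show the nvcc command line, which makes it possible to confirm which -gencode options were actually passed.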