Linking CUDA + plain C++ code: undefined reference to `__fatbinwrap_66_tmpxft_ etc
Question
Somehow my CUDA binary build process has been messed up. All of the .cu files compile nicely to .o files, but when I try to link, I get:
CMakeFiles/tester.dir/tester_intermediate_link.o: In function `__cudaRegisterLinkedBinary_66_tmpxft_00007a5f_00000000_16_cuda_device_runtime_compute_52_cpp1_ii_8b1a5d37':
/tmp/tmpxft_00006b54_00000000-2_tester_intermediate_link.reg.c:7: undefined reference to `__fatbinwrap_66_tmpxft_00007a5f_00000000_16_cuda_device_runtime_compute_52_cpp1_ii_8b1a5d37'
Now, I have not used compute_52 anywhere. My nvcc command-line is:
/usr/local/cuda/bin/nvcc -M -D__CUDACC__ /home/joeuser/src/my_project/src/kernel_specific/elementwise/Add.cu -o /home/joeuser/src/my_project/CMakeFiles/tester.dir/src/kernel_specific/elementwise/tester_generated_Add.cu.o.NVCC-depend -ccbin /usr/bin/gcc-4.9.3 -m64 --std c++11 -D__STRICT_ANSI__ -Xcompiler ,\"-Wall\",\"-g\",\"-g\",\"-O0\" -gencode arch=compute_35,code=compute_35 -g -G --generate-line-info -DNVCC -I/usr/local/cuda/include -I/opt/cub -I/usr/local/cuda/include
My link line is:
/usr/bin/g++-4.9.3 -Wall -std=c++11 -g some.o files.o here.o blah.o blahblah.o bar.cu.o baz.cu.o -o bin/myapp -rdynamic -Wl,-Bstatic -lcudart_static -Wl,-Bdynamic -lpthread -lrt -ldl /usr/lib/libboost_system.so /usr/lib/libboost_program_options.so -Wl,-Bstatic -lcudart_static -Wl,-Bdynamic -lpthread -lrt -ldl /usr/local/cuda/extras/CUPTI/lib64/libcupti.so -lnvToolsExt -lOpenCL /usr/lib/libboost_system.so /usr/lib/libboost_program_options.so /usr/local/cuda/extras/CUPTI/lib64/libcupti.so -lnvToolsExt -lOpenCL -Wl,-rpath,/usr/lib:/usr/local/cuda/extras/CUPTI/lib64
I'll note I have separate compilation enabled, and do not seem to have skipped my intermediate link phase.
Why is this happening?
Recommended answer
CUDA has two compilation modes: relocatable (separate compilation) and static (whole-program, the nvcc default).
The relocatable mode is required for some configurations, which we will not get into here.
If you want to compile in relocatable mode (-rdc=true), you'll need the CUDA device runtime library, which is located in the file cudadevrt.lib (libcudadevrt.a on Linux).
In some setups, supplying -lcudadevrt as a command-line switch to the CUDA linker does the job, but with e.g. MSVC you'll also need to specify cudadevrt.lib as a link dependency.
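As a rough illustration of where -lcudadevrt fits in a separate-compilation build on Linux, here is a minimal sketch. The file names (a.cu, b.cu, main.o, device_link.o, myapp), the step helper, and the -arch=sm_35 setting are assumptions made for the example, not taken from the question's actual build; /usr/local/cuda matches the paths shown in the question. By default the script only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch only: all file names below are hypothetical placeholders.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"
ARCH="-arch=sm_35"   # assumption: adjust to the GPU you actually target

# Print each command; run it only when RUN=1 is set in the environment.
step() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# 1. Compile each .cu file to an object containing relocatable device code.
step "$CUDA_HOME/bin/nvcc" $ARCH -rdc=true -c a.cu -o a.cu.o
step "$CUDA_HOME/bin/nvcc" $ARCH -rdc=true -c b.cu -o b.cu.o

# 2. Device-link (the "intermediate link"); the device runtime is supplied here.
step "$CUDA_HOME/bin/nvcc" $ARCH -dlink a.cu.o b.cu.o -o device_link.o -lcudadevrt

# 3. Host link with g++: the objects, the device-link object, then the CUDA libraries.
step g++ main.o a.cu.o b.cu.o device_link.o -o myapp \
    -L"$CUDA_HOME/lib64" -lcudadevrt -lcudart
```

If the intermediate link object is produced by a build system such as CMake rather than by hand, the equivalent fix is to make sure cudadevrt ends up among the libraries passed to both the device-link and the host-link steps.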