CUDA 5.0 separate compilation of library with cmake

Views: 432 | Published: 2016/12/2 22:44:06 | Tags: compilation, cuda, cmake

Problem description

The build time of my CUDA library keeps increasing, so I thought that separate compilation, introduced in CUDA 5.0, might help me. I couldn't figure out how to achieve separate compilation with cmake. I looked into the NVCC documentation and found how to compile device objects (using the -dc option) and how to link them (using -dlink). My attempts to get it running with cmake failed. I'm using cmake 2.8.10.2 and the head of the trunk of FindCUDA.cmake. However, I couldn't find out how to specify which files should be compiled and how to link them into a library.
In particular, the syntax of function(CUDA_LINK_SEPARABLE_COMPILATION_OBJECTS output_file_var cuda_target options object_files source_files) is unclear to me, because I don't know what output_file_var and cuda_target are.
Here are the non-working results of my attempts:
cuda_compile(DEVICEMANAGER_O devicemanager.cu OPTIONS -dc)
cuda_compile(BLUB_O blub.cu OPTIONS -dc)
CUDA_LINK_SEPARABLE_COMPILATION_OBJECTS(TEST_O gpuacceleration
                                          ""  DEVICEMANGER_O BLUB_O)
set(LIB_TYPE SHARED)
#cuda_add_library(gpuacceleration ${LIB_TYPE} 
  #${gpuacc_SRCS} 
  #devicemanager.cu
  # blub.cu
  #DEVICEMANAGER_O
#  TEST_O
#)
Does anyone know how to compile and link a cuda library using cmake?
Thanks in advance.

EDIT:
After a friend consulted the developer of FindCUDA.cmake, a bug was fixed in the example provided with FindCUDA.cmake (https://gforge.sci.utah.edu/gf/project/findcuda/scmsvn/?action=browse&path=%2Fcheckout%2Ftrunk%2FFindCuda.html).
I'm now able to build the example.
In my project I can build the library as needed using the following (cmake 2.8.10 required):
set(LIB_TYPE SHARED)
set(CUDA_SEPARABLE_COMPILATION ON)
cuda_add_library(gpuacceleration ${LIB_TYPE} 
 blub.cu
 blab.cu
)
BUT:
I cannot link against this library. When I built the lib without separate compilation, I was able to link against it.
Now I get the following error:
 undefined reference to `__cudaRegisterLinkedBinary_53_tmpxft_00005ab4_00000000_6_blub_cpp1_ii_d07d5695'
for every file with a function used in the interface. This seems strange, since the library builds without any warnings.
Any ideas how to get this working?

EDIT:
I finally figured out how to do this. See @PHD's and my answer for details. 
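
As background for the error above: the `__cudaRegisterLinkedBinary_...` symbols are normally defined in the object that the device-link step (`nvcc -dlink`) produces, so an undefined reference to them suggests that this object never reached the final link of the library. The following is only a sketch of how the same library could be declared with CMake's native CUDA language support rather than FindCUDA. It assumes CMake 3.9 or later, which is much newer than the 2.8.10 used in the question, and the project name is hypothetical.

```cmake
# Sketch only: uses CMake's native CUDA language support (assumed CMake >= 3.9),
# not the FindCUDA module from the question.
cmake_minimum_required(VERSION 3.9)
project(gpuacceleration_sketch LANGUAGES CXX CUDA)  # hypothetical project name

# Source file names are taken from the question.
add_library(gpuacceleration SHARED blub.cu blab.cu)

set_target_properties(gpuacceleration PROPERTIES
  CUDA_SEPARABLE_COMPILATION ON     # compile with relocatable device code
  CUDA_RESOLVE_DEVICE_SYMBOLS ON)   # run the device link when the library itself is linked
```

With both properties set, the device link happens as part of building the shared library, so consumers should be able to link against it with an ordinary host linker.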
Solution
EDIT (2016-03-15): Yes, it is confirmed as a bug in FindCUDA: https://cmake.org/Bug/view.php?id=15157



TL;DR: This seems to be a bug in FindCUDA, which makes objects lose info on external definitions before the final linking.

The problem is that, even if separable compilation is enabled, a linking step is still performed for all the targets individually before the final linking.

For instance, I have module.cu with:
#include "module.h"
#include <cstdio>

double arr[10] = {1,2,3,4,5,6,7,8,9,10};
__constant__ double carr[10];

void init_carr() {
  cudaMemcpyToSymbol(carr,arr,10*sizeof(double));
}

__global__ void pkernel() {
  printf("(pkernel) carr[%d]=%g\n",threadIdx.x,carr[threadIdx.x]);
}

void print_carr() {
  printf("in print_carr\n");
  pkernel<<<1,10>>>();
}
and module.h with:
extern __constant__ double carr[10];
extern double arr[10];

void print_carr();
void init_carr();
and finally main.cu with:
#include "module.h"

#include <cstdio>

__global__ void kernel() {
  printf("(kernel) carr[%d]=%g\n",threadIdx.x,carr[threadIdx.x]);
}


int main(int argc, char *argv[]) {
  printf("arr: %g %g %g ..\n",arr[0],arr[1],arr[2]);

  kernel<<<1,10>>>();
  cudaDeviceSynchronize();
  print_carr();
  cudaDeviceSynchronize();
  init_carr();
  cudaDeviceSynchronize();
  kernel<<<1,10>>>();
  cudaDeviceSynchronize();
  print_carr();
  cudaDeviceSynchronize();

  return 0;
}
This then works fine with the following Makefile:
NVCC=nvcc
NVCCFLAGS=-arch=sm_20
LIB=libmodule.a
OBJS=module.o main.o
PROG=extern

# Note: recipe lines must begin with a tab character, not spaces.
$(PROG): main.o libmodule.a
	$(NVCC) $(NVCCFLAGS) -o $@ $^

%.o: %.cu
	$(NVCC) $(NVCCFLAGS) -dc -c -o $@ $^

$(LIB): module.o
	ar cr $@ $^

clean:
	$(RM) $(PROG) $(OBJS) $(LIB)
But when I try to use the following CMakeLists.txt:
CMAKE_MINIMUM_REQUIRED(VERSION 2.8.8)

PROJECT(extern)

FIND_PACKAGE(CUDA REQUIRED)
SET(CUDA_SEPARABLE_COMPILATION ON)

SITE_NAME(HOSTNAME)

SET(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch=sm_20)

cuda_add_library(module module.cu)

CUDA_ADD_EXECUTABLE(extern main.cu)
TARGET_LINK_LIBRARIES(extern module)
Compiling then produces the following:
$ cmake ..
-- The C compiler identification is GNU 4.9.2
...
$ make VERBOSE=1
...
[ 25%] Building NVCC (Device) object CMakeFiles/module.dir//./module_generated_module.cu.o
...
-- Generating <...>/build/CMakeFiles/module.dir//./module_generated_module.cu.o
/usr/local/cuda/bin/nvcc <...>/module.cu -dc -o <...>/build/CMakeFiles/module.dir//./module_generated_module.cu.o -ccbin /usr/bin/cc -m64 -Xcompiler ,\"-g\" -arch=sm_20 -DNVCC -I/usr/local/cuda/include
[ 50%] Building NVCC intermediate link file CMakeFiles/module.dir/./module_intermediate_link.o
/usr/local/cuda/bin/nvcc -arch=sm_20 -m64 -ccbin "/usr/bin/cc" -dlink <...>/build/CMakeFiles/module.dir//./module_generated_module.cu.o -o <...>/build/CMakeFiles/module.dir/./module_intermediate_link.o
...
/usr/bin/ar cr libmodule.a  CMakeFiles/module.dir/./module_generated_module.cu.o CMakeFiles/module.dir/./module_intermediate_link.o
/usr/bin/ranlib libmodule.a
...
[ 50%] Built target module
[ 75%] Building NVCC (Device) object CMakeFiles/extern.dir//./extern_generated_main.cu.o
...
-- Generating <...>/build/CMakeFiles/extern.dir//./extern_generated_main.cu.o
/usr/local/cuda/bin/nvcc <...>/main.cu -dc -o <...>/build/CMakeFiles/extern.dir//./extern_generated_main.cu.o -ccbin /usr/bin/cc -m64 -Xcompiler ,\"-g\" -arch=sm_20 -DNVCC -I/usr/local/cuda/include -I/usr/local/cuda/include
...
[100%] Building NVCC intermediate link file CMakeFiles/extern.dir/./extern_intermediate_link.o
/usr/local/cuda/bin/nvcc -arch=sm_20 -m64 -ccbin "/usr/bin/cc" -dlink <...>/build/CMakeFiles/extern.dir//./extern_generated_main.cu.o -o <...>/build/CMakeFiles/extern.dir/./extern_intermediate_link.o
nvlink error   : Undefined reference to 'carr' in '<...>/build/CMakeFiles/extern.dir//./extern_generated_main.cu.o'
Clearly, the problem is the nvcc -dlink obj.o -o obj_intermediate_link.o lines. At that point, I guess, the info on external definitions is lost. So the question is: is it possible to make CMake/FindCUDA not do this extra linking step?
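
Short of disabling that step, one possible workaround is to hand FindCUDA a single target that contains both source files, so that the one intermediate link it performs sees every device object at once. The following is a minimal sketch, assuming the static library boundary can be given up; it reuses only commands and variables already shown in the CMakeLists.txt above.

```cmake
# Sketch of a workaround, assuming module.cu does not have to live in its own library.
cmake_minimum_required(VERSION 2.8.8)
project(extern)

find_package(CUDA REQUIRED)
set(CUDA_SEPARABLE_COMPILATION ON)
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch=sm_20)

# Both translation units belong to the same target, so the intermediate
# device link for `extern` receives the objects of main.cu and module.cu
# together and can resolve `carr`.
cuda_add_executable(extern main.cu module.cu)
```

This keeps separate compilation per source file, but it does not answer the question of linking device code across a library boundary.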

Otherwise, I would argue that this is a bug. Do you agree? I can file a bug report with CMake.
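
For readers on newer toolchains, the same layout can also be described without FindCUDA. The sketch below assumes CMake 3.9 or later with native CUDA language support; it is not what the original poster used. The architecture flag is left out on purpose, because recent CUDA toolkits no longer accept sm_20, so pick one that your toolkit supports.

```cmake
# Sketch only: native CUDA language support (assumed CMake >= 3.9), no FindCUDA.
cmake_minimum_required(VERSION 3.9)
project(extern LANGUAGES CXX CUDA)

# Static library holding relocatable device code. Device symbols are left
# unresolved here, which is the default for static libraries; the property
# is spelled out only to make the intent visible.
add_library(module STATIC module.cu)
set_target_properties(module PROPERTIES
  CUDA_SEPARABLE_COMPILATION ON
  CUDA_RESOLVE_DEVICE_SYMBOLS OFF)

# The executable also uses relocatable device code. Its device link runs at
# the final link and sees main.cu's object together with libmodule.a.
add_executable(extern main.cu)
set_property(TARGET extern PROPERTY CUDA_SEPARABLE_COMPILATION ON)
target_link_libraries(extern PRIVATE module)
```

The design point is the same one the Makefile relies on: the device link is deferred until every object that references or defines `carr` is available.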
