Creating a static CUDA library to be linked with a C++ program


Problem description





I am attempting to link a CUDA kernel into a C++ autotools project, but I cannot seem to get past the linking stage.

I have a file GPUFloydWarshall.cu that contains the kernel and a wrapper C function that I would like to place into a library, libgpu.a. This will be consistent with the remainder of the project. Is this at all possible?

Secondly, the library would then need to be linked to around ten other libraries for the main executable, which at the moment uses mpicxx.

Currently I am using/generating the below commands to compile and create the libgpu.a library

nvcc   -rdc=true -c -o temp.o GPUFloydWarshall.cu
nvcc -dlink -o GPUFloydWarshall.o temp.o -L/usr/local/cuda/lib64 -lcuda -lcudart
rm -f libgpu.a
ar cru libgpu.a GPUFloydWarshall.o
ranlib libgpu.a

When this is all linked into the main executable I get the following error

problem/libproblem.a(libproblem_a-UTRP.o): In function `UTRP::evaluate(Solution&)':
UTRP.cpp:(.text+0x1220): undefined reference to `gpu_fw(double*, int)'

The gpu_fw function is my wrapper function.

Solution

Is this at all possible?

Yes, it's possible. And creating a (non-CUDA) wrapper function around it makes it even easier. You can make your life easier still if you rely on C++ linking throughout (you mention a wrapper C function). mpicxx is a C++ compiler/linker alias, and cuda files (.cu) follow C++ compiler/linker behavior by default. Here's a very simple question that discusses building cuda code (encapsulated in a wrapper function) into a static library.
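For instance, one way to keep the C++ linkage consistent is to declare the wrapper in a single header that both GPUFloydWarshall.cu and the C++ callers include, so the mangled symbols match on both sides of the link. A minimal sketch (the header name and parameter names are assumptions; the signature is taken from the undefined-reference message above):

// gpu_fw.h -- hypothetical shared header (sketch, not from the original project)
#ifndef GPU_FW_H
#define GPU_FW_H

// Wrapper around the Floyd-Warshall kernel; defined with C++ linkage in
// GPUFloydWarshall.cu and called from the C++ side of the project.
void gpu_fw(double *graph, int n);

#endif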

Secondly, the library would then need to be linked to around ten other libraries for the main executable, which at the moment uses mpicxx.

Once you have a C/C++ (non-CUDA) wrapper exposed in your library, linking should be no different than ordinary linking of ordinary libraries. You may still need to pass the cuda runtime libraries and any other cuda libraries you may be using in the link step, but this is the same conceptually as any other libraries your project may depend on.
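As an illustration, the final link step driven by mpicxx might look something like the following sketch. The object file, executable name, and everything other than -lgpu, libproblem.a (from the error message above), and -lcudart are placeholders for the project's actual objects and dependencies:

$ mpicxx -o solver main.o \
      -L. -lgpu problem/libproblem.a \
      -L/usr/local/cuda/lib64 -lcudart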

EDIT:

It's not clear you need to use device linking for what you want to do. (But it's acceptable, it just complicates things a bit.) Anyway, your construction of the library is not quite correct, now that you have shown the command sequence. The device link command produces a device-linkable object, that does not include all necessary host pieces. To get everything in one place, we want to add both GPUFloydWarshall.o (which has the device-linked pieces) AND temp.o (which has the host code pieces) to the library.

Here's a fully worked example:

$ cat GPUFloydWarshall.cu
#include <stdio.h>

__global__ void mykernel(){
  printf("hello\n");
}

void gpu_fw(){
  mykernel<<<1,1>>>();
  cudaDeviceSynchronize();
}


$ cat main.cpp
#include <stdio.h>

void gpu_fw();

int main(){

  gpu_fw();
}

$ nvcc   -rdc=true -c -o temp.o GPUFloydWarshall.cu
$ nvcc -dlink -o GPUFloydWarshall.o temp.o -lcudart
$ rm -f libgpu.a
$ ar cru libgpu.a GPUFloydWarshall.o temp.o
$ ranlib libgpu.a
$ g++ main.cpp -L. -lgpu -o main -L/usr/local/cuda/lib64 -lcudart
$ ./main
hello
$
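If relocatable device code is not actually required (i.e. no device calls cross file boundaries), the device-link step can be skipped entirely and the library construction becomes simpler. A sketch of that alternative, assuming the kernel and the code that launches it live in the same .cu file:

$ nvcc -c -o GPUFloydWarshall.o GPUFloydWarshall.cu
$ rm -f libgpu.a
$ ar cru libgpu.a GPUFloydWarshall.o
$ ranlib libgpu.a
$ g++ main.cpp -L. -lgpu -o main -L/usr/local/cuda/lib64 -lcudart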
