How to separate CUDA code into multiple files
Question
I am trying to separate a CUDA program into two separate .cu files in an effort to edge closer to writing a real app in C++. I have a simple little program that:
Allocates memory on the host and the device.
Initializes the host array to a series of numbers.
Copies the host array to a device array.
Finds the square of all the elements in the array using a device kernel.
Copies the device array back to the host array.
Prints the results.
This works great if I put it all in one .cu file and run it. When I split it into two separate files I start getting linking errors. Like all my recent questions, I know this is something small, but what is it?
KernelSupport.cu
#ifndef _KERNEL_SUPPORT_
#define _KERNEL_SUPPORT_

#include <iostream>
#include <MyKernel.cu>

int main( int argc, char** argv)
{
    int* hostArray;
    int* deviceArray;
    const int arrayLength = 16;
    const unsigned int memSize = sizeof(int) * arrayLength;

    hostArray = (int*)malloc(memSize);
    cudaMalloc((void**) &deviceArray, memSize);

    std::cout << "Before device\n";
    for(int i=0;i<arrayLength;i++)
    {
        hostArray[i] = i+1;
        std::cout << hostArray[i] << "\n";
    }
    std::cout << "\n";

    cudaMemcpy(deviceArray, hostArray, memSize, cudaMemcpyHostToDevice);
    TestDevice <<< 4, 4 >>> (deviceArray);
    cudaMemcpy(hostArray, deviceArray, memSize, cudaMemcpyDeviceToHost);

    std::cout << "After device\n";
    for(int i=0;i<arrayLength;i++)
    {
        std::cout << hostArray[i] << "\n";
    }

    cudaFree(deviceArray);
    free(hostArray);

    std::cout << "Done\n";
}

#endif
MyKernel.cu
#ifndef _MY_KERNEL_
#define _MY_KERNEL_

__global__ void TestDevice(int *deviceArray)
{
    int idx = blockIdx.x*blockDim.x + threadIdx.x;
    deviceArray[idx] = deviceArray[idx]*deviceArray[idx];
}

#endif
Build log:
1>------ Build started: Project: CUDASandbox, Configuration: Debug x64 ------
1>Compiling with CUDA Build Rule...
1>"C:\CUDA\bin64\nvcc.exe" -arch sm_10 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin" -Xcompiler "/EHsc /W3 /nologo /O2 /Zi /MT " -maxrregcount=32 --compile -o "x64\Debug\KernelSupport.cu.obj" "d:\Stuff\Programming\Visual Studio 2008\Projects\CUDASandbox\CUDASandbox\KernelSupport.cu"
1>KernelSupport.cu
1>tmpxft_000016f4_00000000-3_KernelSupport.cudafe1.gpu
1>tmpxft_000016f4_00000000-8_KernelSupport.cudafe2.gpu
1>tmpxft_000016f4_00000000-3_KernelSupport.cudafe1.cpp
1>tmpxft_000016f4_00000000-12_KernelSupport.ii
1>Linking...
1>KernelSupport.cu.obj : error LNK2005: __device_stub__Z10TestDevicePi already defined in MyKernel.cu.obj
1>KernelSupport.cu.obj : error LNK2005: "void __cdecl TestDevice__entry(int *)" (?TestDevice__entry@@YAXPEAH@Z) already defined in MyKernel.cu.obj
1>D:\Stuff\Programming\Visual Studio 2008\Projects\CUDASandbox\x64\Debug\CUDASandbox.exe : fatal error LNK1169: one or more multiply defined symbols found
1>Build log was saved at "file://d:\Stuff\Programming\Visual Studio 2008\Projects\CUDASandbox\CUDASandbox\x64\Debug\BuildLog.htm"
1>CUDASandbox - 3 error(s), 0 warning(s)
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========
I am running Visual Studio 2008 on Windows 7 64-bit.
Edit:
I think I need to elaborate on this a little bit. The end result I am looking for here is to have a normal C++ application with something like Main.cpp with the int main() event and have things run from there. At certain points in my .cpp code I want to be able to reference CUDA bits. So my thinking (and correct me if there is a more standard convention here) is that I will put the CUDA kernel code into their own .cu files, and then have a supporting .cu file that will take care of talking to the device and calling kernel functions and what not.
Answer
You are including mykernel.cu in kernelsupport.cu, so when you try to link, the compiler sees mykernel.cu twice. You'll have to create a header declaring TestDevice and include that instead.
In response to the comment:
Something like this should work:
// MyKernel.h
#ifndef mykernel_h
#define mykernel_h

__global__ void TestDevice(int* devicearray);

#endif
and then change the including file to:
//KernelSupport.cu
#ifndef _KERNEL_SUPPORT_
#define _KERNEL_SUPPORT_
#include <iostream>
#include <MyKernel.h>
// ...
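For the split to link cleanly, MyKernel.cu keeps the kernel definition and can include the new header so the declaration and definition are checked against each other. A minimal sketch based on the code above (the include guard is no longer needed once the file is compiled on its own rather than #included):

```cuda
// MyKernel.cu
#include "MyKernel.h"

// Kernel definition stays here; each thread squares one element.
__global__ void TestDevice(int* deviceArray)
{
    int idx = blockIdx.x*blockDim.x + threadIdx.x;
    deviceArray[idx] = deviceArray[idx]*deviceArray[idx];
}
```

With this layout, nvcc compiles MyKernel.cu and KernelSupport.cu to separate object files and the linker sees TestDevice defined exactly once.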
Re: your edit
As long as the header you use in C++ code doesn't have any CUDA-specific stuff (__kernel__, __global__, etc.) you should be fine linking C++ and CUDA code.