How to separate CUDA code into multiple files

Question

I am trying to separate a CUDA program into two separate .cu files in an effort to edge closer to writing a real app in C++. I have a simple little program that:

Allocates memory on the host and the device.
Initializes the host array to a series of numbers.
Copies the host array to a device array.
Finds the square of all the elements in the array using a device kernel.
Copies the device array back to the host array.
Prints the results.

This works great if I put it all in one .cu file and run it. When I split it into two separate files I start getting linking errors. Like all my recent questions, I know this is something small, but what is it?

KernelSupport.cu

#ifndef _KERNEL_SUPPORT_
#define _KERNEL_SUPPORT_

#include <iostream>
#include <MyKernel.cu>

int main( int argc, char** argv) 
{
    int* hostArray;
    int* deviceArray;
    const int arrayLength = 16;
    const unsigned int memSize = sizeof(int) * arrayLength;

    hostArray = (int*)malloc(memSize);
    cudaMalloc((void**) &deviceArray, memSize);

    std::cout << "Before device\n";
    for(int i=0;i<arrayLength;i++)
    {
        hostArray[i] = i+1;
        std::cout << hostArray[i] << "\n";
    }
    std::cout << "\n";

    cudaMemcpy(deviceArray, hostArray, memSize, cudaMemcpyHostToDevice);
    TestDevice <<< 4, 4 >>> (deviceArray);
    cudaMemcpy(hostArray, deviceArray, memSize, cudaMemcpyDeviceToHost);

    std::cout << "After device\n";
    for(int i=0;i<arrayLength;i++)
    {
        std::cout << hostArray[i] << "\n";
    }

    cudaFree(deviceArray);
    free(hostArray);

    std::cout << "Done\n";
}

#endif

MyKernel.cu

#ifndef _MY_KERNEL_
#define _MY_KERNEL_

__global__ void TestDevice(int *deviceArray)
{
    int idx = blockIdx.x*blockDim.x + threadIdx.x;
    deviceArray[idx] = deviceArray[idx]*deviceArray[idx];
}


#endif

Build log:

1>------ Build started: Project: CUDASandbox, Configuration: Debug x64 ------
1>Compiling with CUDA Build Rule...
1>"C:\CUDA\bin64\nvcc.exe"    -arch sm_10 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin"    -Xcompiler "/EHsc /W3 /nologo /O2 /Zi   /MT  "  -maxrregcount=32  --compile -o "x64\Debug\KernelSupport.cu.obj" "d:\Stuff\Programming\Visual Studio 2008\Projects\CUDASandbox\CUDASandbox\KernelSupport.cu" 
1>KernelSupport.cu
1>tmpxft_000016f4_00000000-3_KernelSupport.cudafe1.gpu
1>tmpxft_000016f4_00000000-8_KernelSupport.cudafe2.gpu
1>tmpxft_000016f4_00000000-3_KernelSupport.cudafe1.cpp
1>tmpxft_000016f4_00000000-12_KernelSupport.ii
1>Linking...
1>KernelSupport.cu.obj : error LNK2005: __device_stub__Z10TestDevicePi already defined in MyKernel.cu.obj
1>KernelSupport.cu.obj : error LNK2005: "void __cdecl TestDevice__entry(int *)" (?TestDevice__entry@@YAXPEAH@Z) already defined in MyKernel.cu.obj
1>D:\Stuff\Programming\Visual Studio 2008\Projects\CUDASandbox\x64\Debug\CUDASandbox.exe : fatal error LNK1169: one or more multiply defined symbols found
1>Build log was saved at "file://d:\Stuff\Programming\Visual Studio 2008\Projects\CUDASandbox\CUDASandbox\x64\Debug\BuildLog.htm"
1>CUDASandbox - 3 error(s), 0 warning(s)
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

I am running Visual Studio 2008 on Windows 7 64bit.

Edit:

I think I need to elaborate on this a little bit. The end result I am looking for is a normal C++ application with something like Main.cpp containing int main(), and to have things run from there. At certain points in my .cpp code I want to be able to reference CUDA bits. So my thinking (and correct me if there is a more standard convention here) is that I will put the CUDA kernel code into its own .cu files, and then have a supporting .cu file that takes care of talking to the device, calling kernel functions, and so on.

Recommended answer

You are including MyKernel.cu in KernelSupport.cu, and MyKernel.cu is also compiled on its own, so when you link, the linker sees the definition of TestDevice twice (once in each .obj file). That is what the LNK2005 errors in your build log are reporting. You'll have to create a header that declares TestDevice and include that instead.

Re your comment:

Something like this should work:

// MyKernel.h
#ifndef mykernel_h
#define mykernel_h
__global__ void TestDevice(int* devicearray);
#endif

and then change the including file to:

//KernelSupport.cu
#ifndef _KERNEL_SUPPORT_
#define _KERNEL_SUPPORT_

#include <iostream>
#include "MyKernel.h"  // quotes: search the including file's directory first
// ...

Re your edit:

As long as the header you include from C++ code doesn't contain any CUDA-specific constructs (__global__, __device__, the <<< >>> kernel launch syntax, etc.), you should be fine linking C++ and CUDA code.
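One possible way to arrange this, as a rough sketch rather than a definitive layout: keep a plain C++ declaration in a header that Main.cpp can include, and put the kernel plus a host-side wrapper in a .cu file compiled by nvcc. The file names KernelSupport.h / KernelSupport.cu as used here, the wrapper name SquareOnDevice, and the extra length parameter with its bounds check are all assumptions made for illustration; they are not part of the original code. The two parts are shown in one listing for brevity.

```cuda
// --- KernelSupport.h (hypothetical): plain C++, safe to include from Main.cpp ---
// Squares `length` ints in place using the GPU; returns true on success.
bool SquareOnDevice(int* hostArray, int length);

// --- KernelSupport.cu (hypothetical): compiled by nvcc ---
#include <cstddef>
#include <cuda_runtime.h>

// Kernel from the question, with a `length` parameter and bounds check added
// (an assumption) so the grid may be larger than the array.
__global__ void TestDevice(int* deviceArray, int length)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < length)
        deviceArray[idx] = deviceArray[idx] * deviceArray[idx];
}

// Host wrapper: the only symbol the C++ side needs to know about.
bool SquareOnDevice(int* hostArray, int length)
{
    if (length <= 0) return true;  // nothing to do

    const std::size_t memSize = sizeof(int) * length;
    int* deviceArray = nullptr;
    if (cudaMalloc((void**)&deviceArray, memSize) != cudaSuccess)
        return false;

    bool ok = cudaMemcpy(deviceArray, hostArray, memSize,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int threadsPerBlock = 4;
        const int blocks = (length + threadsPerBlock - 1) / threadsPerBlock;
        TestDevice<<<blocks, threadsPerBlock>>>(deviceArray, length);
        // Check the launch, then copy the result back (the copy also waits
        // for the kernel to finish).
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(hostArray, deviceArray, memSize,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(deviceArray);
    return ok;
}
```

With this split, Main.cpp would include only the header part and call SquareOnDevice(hostArray, arrayLength). It never sees __global__ or the <<< >>> syntax, so it can be built by the regular C++ compiler and linked with the object file nvcc produces.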
