Cuda Hello World printf not working even with -arch=sm_20


Problem description

I didn't think I was a complete newbie with CUDA, but apparently I am.

I recently upgraded my CUDA device from compute capability 1.3 to one with compute capability 2.1 (GeForce GT 630). I decided to do a full upgrade to CUDA Toolkit 5.0 as well.

I can compile general CUDA kernels, but printf is not working even with -arch=sm_20 set.

Code:

#include <stdio.h>
#include <assert.h>
#include <cuda.h>
#include <cuda_runtime.h>

__global__ void test(){

    printf("Hi Cuda World");
}

int main( int argc, char** argv )
{

    test<<<1,1>>>();
    return 0;
}

Compiler output:

Error   2   error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\bin\nvcc.exe" -gencode=arch=compute_10,code=\"sm_20,compute_10\" --use-local-env --cl-version 2010 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include"  -G   --keep-dir "Debug" -maxrregcount=0  --machine 32 --compile -arch=sm_20  -g   -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd  " -o "Debug\main.cu.obj" "d:\userstore\documents\visual studio 2010\Projects\testCuda\testCuda\main.cu"" exited with code 2.  C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\BuildCustomizations\CUDA 5.0.targets  592 10  testCuda
Error   1   error : calling a __host__ function("printf") from a __global__ function("test") is not allowed d:\userstore\documents\visual studio 2010\Projects\testCuda\testCuda\main.cu    9   1   testCuda

I'm about done with life because of this problem...done done done. Please talk me down from the rooftops with an answer.

Solution

In-kernel printf is only supported on hardware of compute capability 2.0 or higher. Because your project is set to build for both compute capability 1.0 and compute capability 2.1, nvcc compiles the code multiple times and builds a multi-architecture fatbinary object. The error is generated during the compute capability 1.0 compilation pass (note the -gencode=arch=compute_10 option in the failing command line), because the printf call is unsupported for that architecture.

If you remove the compute capability 1.0 build target from your project, the error will disappear.
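Before dropping the 1.0 target, it can be worth confirming what the installed card actually reports. The following is a minimal sketch, not part of the original answer: the helper names capability_supports_printf and device_supports_printf are hypothetical, while cudaGetDeviceProperties and the major/minor fields of cudaDeviceProp are standard CUDA runtime API.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper (not from the answer): device-side printf needs
// compute capability 2.0 or higher, so only the major version matters.
bool capability_supports_printf(int major, int minor)
{
    (void)minor;  // unused: any 2.x or later device qualifies
    return major >= 2;
}

// Hypothetical helper: prints the compute capability of the given device
// and returns true if kernels running on it can call printf.
bool device_supports_printf(int device)
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return false;
    }
    printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);
    return capability_supports_printf(prop.major, prop.minor);
}
```

Calling device_supports_printf(0) from host code checks the first device; a GeForce GT 630 as described in the question would be expected to report a 2.x capability.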

Alternatively, you could write the kernel like this:

__global__ void test()
{
#if __CUDA_ARCH__ >= 200
    printf("Hi Cuda World");
#endif
}

The __CUDA_ARCH__ symbol will only be >= 200 when building for compute capability 2.0 or higher targets, so this lets you compile the code for compute capability 1.x devices without encountering a compile error. On those devices the kernel simply prints nothing.
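The same guard can be extended to cover the host compilation pass, where __CUDA_ARCH__ is not defined at all. This is a sketch under that assumption, not code from the original answer; the function say_hello and its return convention (1 if something was printed, 0 otherwise) are made up for illustration.

```cuda
#include <stdio.h>

// Hypothetical helper: returns 1 if a greeting was printed, 0 otherwise.
__host__ __device__ int say_hello()
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 200
    // Device pass for compute capability 2.0+: device-side printf exists.
    printf("Hi Cuda World (device)\n");
    return 1;
#elif defined(__CUDA_ARCH__)
    // Device pass for compute capability 1.x: no printf, so stay silent.
    return 0;
#else
    // Host pass: __CUDA_ARCH__ is undefined, this is the ordinary C printf.
    printf("Hi Cuda World (host)\n");
    return 1;
#endif
}

// The kernel from the question, now delegating to the guarded helper.
__global__ void test()
{
    say_hello();
}
```

Because each compilation pass sees only one branch, the 1.x pass never contains a printf call and the "calling a __host__ function" error does not arise.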

If you compile for the correct architecture and still get no output, you also need to ensure that the kernel finishes and the driver flushes the output buffer. To do this, add a synchronizing call after the kernel launch in the host code, for example:

int main( int argc, char** argv )
{

    test<<<1,1>>>();
    cudaDeviceSynchronize();
    return 0;
}

[disclaimer: all code written in browser, never compiled, use at own risk]

If you do both things, you should be able to compile, run and see output.
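Putting both suggestions together, a slightly more defensive version might look like the sketch below. It is not the answerer's code: the wrapper run_hello_world is a hypothetical name, and the error checks are an addition, using only standard runtime calls (cudaGetLastError, cudaDeviceSynchronize, cudaGetErrorString). Like the code above, it has not been run on the asker's setup.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void test()
{
#if __CUDA_ARCH__ >= 200
    // Only compiled in device passes targeting compute capability 2.0+.
    printf("Hi Cuda World\n");
#endif
}

// Hypothetical wrapper: launches the kernel, waits for it, and returns the
// first CUDA error encountered (cudaSuccess if everything worked).
cudaError_t run_hello_world()
{
    test<<<1, 1>>>();

    // Launch problems (bad configuration, no suitable kernel image for the
    // device, no device at all) are reported here.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch failed: %s\n", cudaGetErrorString(err));
        return err;
    }

    // Blocks until the kernel has finished; the device-side printf buffer
    // is flushed to stdout at this synchronization point.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel execution failed: %s\n", cudaGetErrorString(err));
    }
    return err;
}
```

From main, this can be used as `return run_hello_world() == cudaSuccess ? 0 : 1;`. Checking the launch result is useful here because a build that lacks code for the installed device fails at launch time rather than at compile time.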
