CUDA: How to check for the right compute capability?
Question
CUDA code compiled with a higher compute capability will execute perfectly for a long time on a device with lower compute capability, before silently failing one day in some kernel. I spent half a day chasing an elusive bug only to realize that the Build Rule had sm_21 while the device (Tesla C2050) was a 2.0.
Is there any CUDA API code I can add which can self-check if it is running on a device with compatible compute capability? I need to compile and work with devices of many compute capabilities. Is there any other action I can take to ensure such errors do not occur?
Answer
In the runtime API, cudaGetDeviceProperties returns two fields, major and minor, which give the compute capability of any given enumerated CUDA device. You can use that to check the compute capability of any GPU before establishing a context on it, to make sure it is the right architecture for what your code does. nvcc can generate an object file containing multiple architectures from a single invocation using the -gencode option, for example:
nvcc -c -gencode arch=compute_20,code=sm_20 \
        -gencode arch=compute_13,code=sm_13 \
        source.cu
would produce an output object file with an embedded fatbinary object containing cubin files for GT200 and GF100 cards. The runtime API will automagically handle architecture detection and try loading suitable device code from the fatbinary object without any extra host code.
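The runtime self-check described above can be sketched as follows. This is a minimal example, not code from the original answer; the helper name checkComputeCapability is illustrative, and the required version (2.0 here) is just an assumption for the Tesla C2050 case:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Verify that the current device meets a minimum compute capability
// before any kernels are launched. Returns true if it does.
bool checkComputeCapability(int requiredMajor, int requiredMinor)
{
    int device = 0;
    if (cudaGetDevice(&device) != cudaSuccess)
        return false;

    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess)
        return false;

    // Accept any capability greater than or equal to the requirement.
    if (prop.major > requiredMajor ||
        (prop.major == requiredMajor && prop.minor >= requiredMinor))
        return true;

    fprintf(stderr,
            "Device %d (%s) is compute %d.%d, but %d.%d is required\n",
            device, prop.name, prop.major, prop.minor,
            requiredMajor, requiredMinor);
    return false;
}

int main()
{
    // For a Tesla C2050 build, require at least compute 2.0.
    if (!checkComputeCapability(2, 0))
        return 1;
    // ... safe to launch kernels here ...
    return 0;
}
```

Calling such a check once at startup turns the silent mid-run failure described in the question into an immediate, explicit error.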