Changing the arch argument in CUDA makes me use more registers

Views: 979 Published: 2017/3/5 19:33:49 Tags: cuda nvidia cpu-registers


Problem description

I have been writing a kernel for my Tesla K20m. When I compile the software with -Xptxas=-v, I obtain the following results:

ptxas info    : 0 bytes gmem
ptxas info    : Compiling entry function '_Z9searchKMPPciPhiPiS1_' for 'sm_10'
ptxas info    : Used 8 registers, 80 bytes smem, 8 bytes cmem[1]

As you can see, only 8 registers are used. However, if I pass the argument -arch=sm_35, both the execution time of my kernel and the number of registers used rise dramatically, and I am wondering why.

nvcc mysoftware.cu -Xptxas=-v -arch=sm_35 
ptxas info    : 0 bytes gmem
ptxas info    : Compiling entry function '_Z9searchKMPPciPhiPiS1_' for 'sm_35'
ptxas info    : Function properties for _Z9searchKMPPciPhiPiS1_
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 21 registers, 16 bytes smem, 368 bytes cmem[0]

Since multiple books mention that compiling for the card's actual architecture is supposed to improve performance, I wonder why mine decreases so dramatically.

Thanks.

Edit: Similar question and answer: Registers and shared memory depending on compiling compute capability?
