__device__ __constant__ const
Question
Is there any difference between the following, and what is the best way to define device constants in a CUDA program? In a C++ host/device program, if I want to define constants that live in device constant memory, I can write either
__device__ __constant__ float a = 5;
__constant__ float a = 5;
Question 1. On 2.x devices with CUDA 4, is that the same as
__device__ const float a = 5;
Question 2. Why is it that in PyCUDA SourceModule("""..."""), which compiles only device code, even the following works?
const float a = 5;
Recommended answer
In CUDA, __constant__ is a variable type qualifier that indicates the variable being declared is to be stored in device constant memory. Quoting section B.2.2 of the CUDA programming guide (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html):
The __constant__ qualifier, optionally used together with __device__, declares a variable that:
- Resides in constant memory space,
- Has the lifetime of an application,
- Is accessible from all the threads within the grid and from the host through the runtime library (cudaGetSymbolAddress() / cudaGetSymbolSize() / cudaMemcpyToSymbol() / cudaMemcpyFromSymbol() for the runtime API and cuModuleGetGlobal() for the driver API).
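As a rough illustration of the host-side access described in the quoted list, the sketch below queries a __constant__ variable from the host with the runtime API. The variable a matches the question; the helper names read_a_from_host and size_of_a are made up for this example, and error handling is kept minimal.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same declaration as in the question; adding __device__ is optional here.
__constant__ float a = 5.0f;

// Hypothetical helper: copies the current value of `a` back to the host.
float read_a_from_host() {
    float host_copy = 0.0f;
    cudaMemcpyFromSymbol(&host_copy, a, sizeof(float));
    return host_copy;
}

// Hypothetical helper: size in bytes of the symbol `a` as seen by the runtime.
size_t size_of_a() {
    size_t bytes = 0;
    cudaGetSymbolSize(&bytes, a);
    return bytes;
}

int main() {
    // Look up the device address that backs the symbol.
    void* dev_ptr = nullptr;
    cudaError_t err = cudaGetSymbolAddress(&dev_ptr, a);
    if (err != cudaSuccess) {
        std::printf("cudaGetSymbolAddress failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("a: device address %p, %zu bytes, value %f\n",
                dev_ptr, size_of_a(), read_a_from_host());
    return 0;
}
```

The address returned by cudaGetSymbolAddress() is a device pointer, so it is meant to be passed to calls such as cudaMemcpy() rather than dereferenced on the host.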
In CUDA, constant memory is a dedicated, static, global memory area accessed via a cache (there is a dedicated set of PTX load instructions for this purpose). Its contents are uniform and read-only for all threads in a running kernel. However, the contents of constant memory can be modified at runtime through the host-side APIs quoted above. This is different from declaring a variable with const, which adds a read-only characteristic to the variable within the scope of the declaration. The two are not at all the same thing.
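To make the distinction concrete, here is a minimal sketch in which the host changes a __constant__ variable between two kernel launches. The names scale, read_scale, update_scale and launch_and_read are assumptions made up for this example, not part of the original answer, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical constant-memory variable; read-only inside kernels.
__constant__ float scale = 5.0f;

// The kernel may read `scale`, but an assignment such as `scale = 1.0f;`
// in device code would be rejected by the compiler.
__global__ void read_scale(float* out) {
    *out = scale;
}

// Host-side update of constant memory through the runtime API.
void update_scale(float new_value) {
    cudaMemcpyToSymbol(scale, &new_value, sizeof(float));
}

// Launches the kernel once and returns the value it observed.
float launch_and_read() {
    float* d_out = nullptr;
    cudaMalloc(&d_out, sizeof(float));
    read_scale<<<1, 1>>>(d_out);
    float h_out = 0.0f;
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}

int main() {
    float before = launch_and_read();   // sees the initializer
    update_scale(2.5f);                 // host overwrites constant memory
    float after = launch_and_read();    // sees the updated value
    std::printf("before=%f after=%f\n", before, after);
    return (before == 5.0f && after == 2.5f) ? 0 : 1;
}
```

A variable declared with plain const offers no such update path: its value is fixed at the declaration and the compiler is free to fold it into the generated code.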