Using constant memory prints an address instead of the value in CUDA

Problem description

I am trying to use constant memory in my code, assigning its values from inside the kernel rather than with cudaMemcpyToSymbol.

#include <iostream>
using namespace std;
#define N 10
//__constant__ int constBuf_d[N];
__constant__ int *constBuf;

__global__ void foo( int *results )
{
    int tdx = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + tdx;

    if( idx < N )
    {
        constBuf[idx] = 1;
        results[idx] = constBuf[idx];
    }
}

// main routine that executes on the host
int main(int argc, char* argv[])
{
    int *results_h = new int[N];
    int *results_d;

    cudaMalloc((void **)&results_d, N*sizeof(int));

    foo <<< 1, 10 >>> ( results_d );

    cudaMemcpy(results_h, results_d, N*sizeof(int), cudaMemcpyDeviceToHost);

    for( int i=0; i < N; ++i )
        printf("%i ", results_h[i] );
    delete(results_h);
}

The output shows:

6231808 6226116 0 0 0 0 0 0 0 0 

I want the program to print the values assigned to constant memory by the kernel.

Solution

Constant memory is, as the name implies, constant/read-only with respect to device code. What you are trying to do is illegal and can't be made to work.

To set values in constant memory, you currently have two choices:

  1. set the value from host code via the cudaMemcpyToSymbol API call (or its equivalents)
  2. use static initialisation at compile time
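The first option could be sketched roughly as follows. This is a minimal example, not taken from the original answer: it assumes the same N and kernel shape as the question, and the helper names check and fill_from_constant, as well as the host array values_h, are hypothetical and chosen only for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define N 10

// Constant memory declared as a fixed-size array rather than a pointer.
__constant__ int constBuf[N];

__global__ void foo(int *results)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N)
    {
        results[idx] = constBuf[idx];  // device code only reads constBuf
    }
}

// Hypothetical helper: report a failed CUDA call and return false.
static bool check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess)
    {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Hypothetical helper: copy N host values into constant memory, run the
// kernel, and copy what the kernel read back into results_h.
bool fill_from_constant(const int *values_h, int *results_h)
{
    int *results_d = nullptr;

    // The host writes constant memory; the kernel never does.
    if (!check(cudaMemcpyToSymbol(constBuf, values_h, N * sizeof(int)),
               "cudaMemcpyToSymbol")) return false;
    if (!check(cudaMalloc((void **)&results_d, N * sizeof(int)),
               "cudaMalloc")) return false;

    foo<<<1, N>>>(results_d);

    // cudaMemcpy waits for the kernel, so execution errors show up here too.
    bool ok = check(cudaGetLastError(), "kernel launch") &&
              check(cudaMemcpy(results_h, results_d, N * sizeof(int),
                               cudaMemcpyDeviceToHost), "cudaMemcpy");
    cudaFree(results_d);
    return ok;
}

int main()
{
    int values_h[N];
    int results_h[N] = {0};
    for (int i = 0; i < N; ++i)
        values_h[i] = 1;  // the value the question wanted to see

    if (!fill_from_constant(values_h, results_h))
        return EXIT_FAILURE;

    for (int i = 0; i < N; ++i)
        printf("%i ", results_h[i]);
    printf("\n");
    return 0;
}
```

Checking the return code of each call is a design choice in this sketch: when CUDA calls go unchecked, a failed kernel or copy can pass unnoticed and the host buffer is printed without ever having been written.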

In the latter case something like this would work:

__constant__ int constBuf[N] = { 16, 2, 77, 40, 12, 3, 5, 3, 6, 6 };

__global__ void foo( int *results )
{
    int tdx = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + tdx;


    if( tdx < N )
    {
        results[idx] = constBuf[tdx]; // Note changes here!
    }
}
