Use dynamic shared memory allocation for two different vectors

Question

In a kernel function, I want two vectors in shared memory, each holding size floats (that is, sizeof(float)*size bytes each).

Since a shared array cannot be declared directly in the kernel function when its size comes from a variable, I had to allocate it dynamically, like:

    myKernel<<<numBlocks, numThreads, 2*sizeof(float)*size>>> (...);  

and in the kernel:

extern __shared__ float row[];
extern __shared__ float results[];    

But this does not work.

Instead, I declared a single vector, extern __shared__ float rowresults[], that holds all the data and uses the full 2*size allocation. Accesses to row stay the same, and accesses to results become rowresults[size+previousIndex]. This does work.

This is not a big problem, since I get my expected results anyway, but is there any way to split my dynamically allocated shared memory into two (or more) separate variables? Just for the sake of elegance.

Recommended answer

The CUDA C Programming Guide section on __shared__ includes an example that carves multiple arrays out of a single dynamically allocated shared memory buffer:

extern __shared__ float array[];
__device__ void func()      // __device__ or __global__ function
{
    short* array0 = (short*)array; 
    float* array1 = (float*)&array0[128];
    int*   array2 =   (int*)&array1[64];
}

Since each line just takes a pointer to an element and treats it as the start of a new array, I believe you can adapt this to use dynamic offsets (such as your size) instead of the static offsets in the example. The guide also notes that each pointer must be aligned to the type it points to, which should not be an issue in your case because both vectors hold float.
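To make the dynamic-offset idea concrete, here is a minimal sketch rather than a definitive implementation. It relies on the fact that all extern __shared__ arrays declared in a kernel start at the same address, which is why declaring row[] and results[] separately makes them overlap. The names shared_buf, splitSharedKernel, and runSplitShared are hypothetical, as is the computation itself (each result is the sum of an element and its mirror element). It assumes a single block, 128 threads, and a size small enough that 2*size floats fit in the shared memory available per block.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// One dynamically sized buffer; both vectors are carved out of it.
extern __shared__ float shared_buf[];

// Hypothetical kernel: stages the input in `row`, then fills `results`
// with row[i] + row[size - 1 - i] and copies it back to global memory.
__global__ void splitSharedKernel(const float* in, float* out, int size)
{
    float* row     = shared_buf;         // first  `size` floats
    float* results = shared_buf + size;  // second `size` floats, runtime offset

    for (int i = threadIdx.x; i < size; i += blockDim.x)
        row[i] = in[i];
    __syncthreads();  // every row element must be written before it is read

    for (int i = threadIdx.x; i < size; i += blockDim.x) {
        results[i] = row[i] + row[size - 1 - i];
        out[i] = results[i];
    }
}

// Hypothetical host helper: launches one block and returns the output.
// Error handling is reduced to a single launch check for brevity.
std::vector<float> runSplitShared(const std::vector<float>& input)
{
    const int size = static_cast<int>(input.size());
    const size_t bytes = size * sizeof(float);

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, input.data(), bytes, cudaMemcpyHostToDevice);

    // Third launch parameter: dynamic shared memory in bytes, for both vectors.
    splitSharedKernel<<<1, 128, 2 * bytes>>>(d_in, d_out, size);
    assert(cudaGetLastError() == cudaSuccess);  // catches a bad launch configuration

    std::vector<float> output(size);
    cudaMemcpy(output.data(), d_out, bytes, cudaMemcpyDeviceToHost);  // waits for the kernel
    cudaFree(d_in);
    cudaFree(d_out);
    return output;
}
```

On a machine with a CUDA-capable GPU, calling runSplitShared({1.0f, 2.0f, 3.0f, 4.0f}) should return four values of 5.0f. If the sub-arrays had different element types, placing the type with the strictest alignment first would be one way to keep every pointer aligned.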
