Async_work_group_copy float4 to float
Problem description
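Is it possible to copy a float array into a float4 array?
I do not know whether the elements of a float4 array line up with those of a float array so that one can be copied into the other.
What I have tried (it fails to compile):
#define WD2 (WIDTH / 4)
__global float A[WIDTH * HEIGHT];
...
__local float4 B[WD2];
barrier(CLK_LOCAL_MEM_FENCE);
// Rejected by the compiler: dst is float4* but src is float*, and
// async_work_group_copy requires both to have the same element type.
async_work_group_copy(B, A + j * WIDTH, WD2, 0);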
Solution
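In the end, vloadn works, but unfortunately each call loads only a single vector, not an array of them:
__global float* imagen0
...
long pix = get_global_id(0);
if (pix == 0) {
    // Load 16 consecutive floats, starting at imagen0[0], into one vector.
    float16 vv = vload16(0, imagen0);
    printf("===GPU vv: %6v16f \n", vv);
}
if (pix < 16)
    // pix is a long, so cast it to match the %i conversion.
    printf("imagen0[%2i]=%6f\n", (int)pix, imagen0[pix]);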
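Two ways to fill a whole __local float4 array from a __global float buffer can be sketched as follows. This is only a sketch under stated assumptions, not the original poster's code: the kernel name copy_row, the out buffer, and the value of WIDTH are hypothetical, and WIDTH is assumed to be a multiple of 4. The first option casts the float4 destination to a float pointer so that both pointer arguments of async_work_group_copy have the same element type. The second calls vload4 once per vector in a loop. Either option alone should be enough; both appear here only for comparison.

```c
// OpenCL C kernel sketch. WIDTH is a hypothetical value and is assumed
// to be a multiple of 4.
#define WIDTH 1024
#define WD2 (WIDTH / 4)

__kernel void copy_row(__global const float *A,  // WIDTH * HEIGHT floats
                       __global float4 *out,     // hypothetical output buffer
                       const int j)              // row to copy
{
    __local float4 B[WD2];

    // Option 1: async copy. dst and src must point to the same element
    // type, so view the float4 array as WIDTH plain floats.
    event_t ev = async_work_group_copy((__local float *)B,
                                       A + j * WIDTH,
                                       (size_t)WIDTH, 0);
    wait_group_events(1, &ev);

    // Option 2: each work-item loads whole vectors with vload4.
    // vload4(i, p) reads p[4*i] .. p[4*i + 3].
    for (size_t i = get_local_id(0); i < WD2; i += get_local_size(0))
        B[i] = vload4(i, A + j * WIDTH);
    barrier(CLK_LOCAL_MEM_FENCE);

    // Use B: here each work-item simply writes its vectors back out.
    for (size_t i = get_local_id(0); i < WD2; i += get_local_size(0))
        out[j * WD2 + i] = B[i];
}
```

Casting the destination rather than the source avoids any alignment concern: a float4 array is always suitably aligned for float access, whereas a float pointer such as A + j * WIDTH is only guaranteed to be float-aligned.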