Sum of registers exceeds the limit of 4096
Problem description
I received the following error when trying to compile. It looks like a memory problem to me, but exactly which part of my code is causing it?
Are the input arrays too big for parallel_for_each (p_f_e) to handle?
Do I have to split the data in half and either call run_test() twice inside the p_f_e or run the p_f_e twice? I am not sure which one is right, and I would appreciate your help. Thank you.
error C3568: sum of registers exceeds the limit of 4096 when compiling the call graph for the
concurrency::parallel_for_each. Please simplify your program
The run_test function takes vX, vY and vZ as constant input arrays, computes certain values from them, and stores the output in the infoPack structure.
typedef __declspec(align(4)) struct
{
float X[200];
float Y[200];
float Z[200];
float W[200];
float FRONT[200];
float BACK[200];
float SIDE[200];
} INFO_TYPE;
INFO_TYPE infoPack;
extent<1> ext(2160);
array_view<const float, 2> vX(2160, 60, valueM);
array_view<const float, 2> vY(2160, 60, valueL);
array_view<const float, 2> vZ(2160, 60, valueU);
array_view<INFO_TYPE, 1> pack_view(ext, infoPack);
vX.discard_data();
vY.discard_data();
vZ.discard_data();
parallel_for_each(ext, [=](index<1> idx) restrict(amp)
{
run_test(pack_view, vX[idx[0]], vY[idx[0]], vZ[idx[0]], 60);
});
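For reference, the data sizes implied by the snippet above can be worked out with plain C++, independent of C++ AMP. This is only a rough sketch: InfoType is a portable stand-in for INFO_TYPE (alignas(4) replaces the MSVC-specific __declspec(align(4))), the helper names are hypothetical, and float is assumed to be 4 bytes. It does not diagnose C3568 itself, since the message refers to registers in the compiled call graph and run_test is not shown.

```cpp
#include <cstddef>

// Assumption: float is 4 bytes, as on mainstream desktop platforms.
static_assert(sizeof(float) == 4, "sketch assumes 4-byte float");

// Portable stand-in for the INFO_TYPE struct in the question.
struct alignas(4) InfoType {
    float X[200];
    float Y[200];
    float Z[200];
    float W[200];
    float FRONT[200];
    float BACK[200];
    float SIDE[200];
};

// Dimensions taken from the question: extent<1> ext(2160) and the
// 2160 x 60 input views vX, vY, vZ.
constexpr std::size_t kRows = 2160;
constexpr std::size_t kCols = 60;

// Bytes in a single INFO_TYPE value (hypothetical helper).
constexpr std::size_t bytes_per_element() { return sizeof(InfoType); }

// Bytes a view of kRows INFO_TYPE elements would need to cover.
constexpr std::size_t bytes_for_view() { return kRows * sizeof(InfoType); }

// Bytes in one of the 2160 x 60 float input arrays.
constexpr std::size_t bytes_per_input() { return kRows * kCols * sizeof(float); }
```

Under these assumptions one INFO_TYPE is 5600 bytes, a view of 2160 of them spans 12,096,000 bytes, and each input array is 518,400 bytes. Note that the snippet declares pack_view over 2160 elements while showing only a single infoPack object.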
Recommended answer
Can you post run_test as well?