Small array stored like variables in a kernel?


Question

In my OpenCL kernel I need what would normally be a small array of 4 entries. Because I'm concerned about how that array would be stored (probably in a much slower kind of memory than regular variables), I'm instead using 4 separate variables and a switch-case statement to access the correct one based on an index.

Is there a way to make a small array of 4 x float4 work as fast and seamlessly as 4 separate float4 variables?

Here's what I'm trying to do: my kernel generates a single float4 variable v by going through a list of operations to apply to v. It runs sequentially, applying operation after operation in the list to v. However, the list can contain brackets/parentheses which, just like in arithmetic, isolate a group of operations so that they are evaluated on their own before the bracket's result is brought back in with the rest.

So when a bracket opens, I temporarily store the value of v into, say, v0 (the current value at bracket depth 0), then reset v to 0 and perform the operations inside the bracket. If there's yet another bracket inside that bracket, I put v into v1, and so on with v2 and v3 as we go deeper into nested brackets. This way I can, for instance, apply a multiplication inside a bracket that only affects the other things created inside that bracket and not the rest.

Once a bracket closes, I retrieve e.g. v3 and add v to it. In the end all brackets are closed, and v holds the final value of the series of operations and is written to a global buffer. This is doable using switch-case statements to select the correct variable according to the current bracket depth, but that is quite absurd, as this is exactly what arrays are for. So I'm not sure what the best thing to do is.
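For concreteness, the switch-case workaround described above might look roughly like the following. This is a sketch in plain C rather than OpenCL C so that it compiles on a host: the `float4` struct stands in for OpenCL's built-in type, and `saved_values`, `save_at_depth`, and `load_at_depth` are hypothetical names, not taken from the actual kernel.

```c
#include <assert.h>

/* Stand-in for OpenCL's built-in float4 so the sketch compiles as plain C. */
typedef struct { float x, y, z, w; } float4;

/* Hypothetical holder for the four per-depth variables v0..v3. */
typedef struct { float4 v0, v1, v2, v3; } saved_values;

/* Store v in the variable that corresponds to the given bracket depth. */
static void save_at_depth(saved_values *s, int depth, float4 v)
{
    assert(depth >= 0 && depth < 4);
    switch (depth) {
    case 0:  s->v0 = v; break;
    case 1:  s->v1 = v; break;
    case 2:  s->v2 = v; break;
    default: s->v3 = v; break;
    }
}

/* Read back the variable that corresponds to the given bracket depth. */
static float4 load_at_depth(const saved_values *s, int depth)
{
    assert(depth >= 0 && depth < 4);
    switch (depth) {
    case 0:  return s->v0;
    case 1:  return s->v1;
    case 2:  return s->v2;
    default: return s->v3;
    }
}
```

Every access goes through a branch on the depth, which is the bookkeeping an indexed array would express in a single line.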

Recommended answer

From what I've seen, compilers will usually put small arrays declared in the private address space directly in registers. Of course, this is not guaranteed, and several factors probably influence whether that optimization kicks in, such as:

  • the size of the array;
  • register pressure;
  • the cost of spilling;
  • and others.

As is usual with optimizations, the only way to be sure is to verify what the compiler is doing by checking the generated assembly.

"So if a bracket is being opened then I should temporarily store the value of v into let's say v0 (to represent the current value at the bracket depth of 0), then v can be reset to 0 and perform the operations inside the bracket, and if there's yet another bracket inside that bracket I'd put v into v1 and so on with v2 and v3 as we go deeper into nested brackets. This is so that I can for instance apply a multiplication inside a bracket that would only affect the other things created inside that bracket and not the rest."

I don't think that would help. The compiler optimizes across scopes anyway. Just do the straightforward thing and let the optimizer do its job. Then, if you notice suboptimal codegen, you can start thinking about an alternative solution, but not before.
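As an illustration of "the straightforward thing", here is a minimal sketch of the bracket logic with a small array used as a stack. It is plain C so it can be compiled and checked on a host. The `float4` struct stands in for OpenCL's built-in type, and the `op` encoding, `MAX_DEPTH`, and `evaluate` are assumptions made for the example, not the asker's actual code. In a kernel, `saved` would simply be declared inside the kernel body, which places it in the private address space. Whether it then ends up in registers is still up to the compiler.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 4  /* assumed maximum bracket nesting, as in the question */

/* Stand-in for OpenCL's built-in float4 so the sketch compiles as plain C. */
typedef struct { float x, y, z, w; } float4;

/* Hypothetical operation encoding: add or multiply by a scalar, or a bracket. */
typedef enum { OP_ADD, OP_MUL, OP_OPEN, OP_CLOSE } op_kind;
typedef struct { op_kind kind; float value; } op;

static float4 evaluate(const op *ops, size_t count)
{
    float4 saved[MAX_DEPTH];              /* small private array used as a stack */
    int depth = 0;
    float4 v = {0.0f, 0.0f, 0.0f, 0.0f};

    for (size_t i = 0; i < count; ++i) {
        const float s = ops[i].value;
        switch (ops[i].kind) {
        case OP_ADD:
            v.x += s; v.y += s; v.z += s; v.w += s;
            break;
        case OP_MUL:
            v.x *= s; v.y *= s; v.z *= s; v.w *= s;
            break;
        case OP_OPEN:                     /* save v at this depth, restart from 0 */
            assert(depth < MAX_DEPTH);
            saved[depth++] = v;
            v.x = v.y = v.z = v.w = 0.0f;
            break;
        case OP_CLOSE:                    /* add the bracket's result to the saved value */
            assert(depth > 0);
            --depth;
            v.x += saved[depth].x; v.y += saved[depth].y;
            v.z += saved[depth].z; v.w += saved[depth].w;
            break;
        }
    }
    return v;
}
```

For the list "add 1, open, add 2, multiply by 3, close", the bracket evaluates to 6 on its own and is then added to the saved 1, so every component of the result is 7.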
