Passing a list of values to fragment shader

Problem description

I want to send a list of values into a fragment shader. It is a potentially large list (a couple of thousand items) of single-precision floats. The fragment shader needs random access to this list, and I want to refresh the values from the CPU on each frame.

I'm considering my options on how this could be done:

  1. As a uniform variable of array type ("uniform float x[10];"). But there seem to be limits here: on my GPU, sending more than a few hundred values is very slow, and I'd also have to hard-code the upper limit in the shader when I would rather change it at runtime.

  2. As a texture with height 1 and the width of my list, then refresh the data using glCopyTexSubImage2D.

  3. Other methods? I haven't kept up with all the changes in the GL specification lately; perhaps there is some other method that is specifically designed for this purpose?

Recommended answer

There are currently four ways to do this: standard 1D textures, buffer textures, uniform buffers, and shader storage buffers.

Standard 1D textures

With this method, you use glTex(Sub)Image1D to fill a 1D texture with your data. Since your data is just an array of floats, your image format should be GL_R32F. You then access it in the shader with a simple texelFetch call. texelFetch takes texel coordinates (hence the name), and it shuts off all filtering, so you get exactly one texel.

Note: texelFetch is 3.0+. If you want to use prior GL versions, you will need to pass the size to the shader and normalize the texture coordinate manually.
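The manual normalization mentioned in the note comes down to one line of arithmetic. The sketch below shows it in C purely for illustration; the function name is hypothetical, and it assumes the texture uses nearest filtering so that sampling at a texel center returns that texel unblended. The same expression would be written in the shader, with the size passed in as a uniform.

```c
#include <assert.h>

/* Normalized coordinate of the center of texel `index` in a 1D texture
   that is `size` texels wide. Sampling at the center (index + 0.5) avoids
   landing on the boundary between two neighboring texels. */
float texel_center_coord(int index, int size)
{
    assert(size > 0 && index >= 0 && index < size);
    return ((float)index + 0.5f) / (float)size;
}
```

For a 4-texel texture, texel 0 maps to 0.125 and texel 3 maps to 0.875.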

The main advantages here are compatibility and compactness. This will work on GL 2.1 hardware (using the normalized-coordinate lookup from the note above instead of texelFetch). And you don't have to use the GL_R32F format; you could use GL_R16F half-floats, or GL_R8 if your data fits reasonably in a normalized byte. Size can mean a lot for overall performance.
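To make the size trade-off concrete, here is a small C sketch with hypothetical helper names. It computes the storage needed for a list at a given texel size and shows one way to quantize a value to a normalized byte, assuming the data has already been scaled into the range [0, 1].

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to store `count` single-channel texels.
   bytes_per_texel: 4 for GL_R32F, 2 for GL_R16F, 1 for GL_R8. */
size_t texture_bytes(size_t count, size_t bytes_per_texel)
{
    assert(bytes_per_texel > 0);
    return count * bytes_per_texel;
}

/* Quantize a value to a normalized byte, as stored in a GL_R8 texel.
   Assumes the value is meant to lie in [0, 1]; out-of-range input is clamped. */
unsigned char to_unorm8(float v)
{
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return (unsigned char)(v * 255.0f + 0.5f); /* round to nearest */
}
```

A 4,096-element list takes 16,384 bytes as GL_R32F, 8,192 as GL_R16F, and 4,096 as GL_R8, at the cost of precision.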

The main disadvantage is the size limitation. You are limited to having a 1D texture of the max texture size (query GL_MAX_TEXTURE_SIZE for the actual value). On GL 3.x-class hardware, this will be around 8,192, but is guaranteed to be no less than 4,096.

Uniform buffers

The way this works is that you declare a uniform block in your shader:

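layout(std140) uniform MyBlock
{
  // size is a placeholder; it must be a compile-time constant.
  // Under std140, each element of a float array occupies a 16-byte stride.
  float myDataArray[size];
};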

You then access that data in the shader just like an array.

Back in C/C++/etc code, you create a buffer object and fill it with floating-point data. Then, you can associate that buffer object with the MyBlock uniform block. More details can be found here.
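One detail to watch when filling the buffer: with the std140 layout, every element of a float array is padded out to a 16-byte stride, so the CPU-side data cannot simply be a tightly packed float array. The sketch below is a minimal CPU-only illustration; the function name and constant are hypothetical. The resulting bytes would then be uploaded with a call such as glBufferData and attached to the block's binding point, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STD140_ARRAY_STRIDE 16 /* bytes per element of a float[] under std140 */

/* Copies `count` floats from `src` into `dst`, placing each one at the start
   of its own 16-byte slot and zeroing the padding.
   Returns the number of bytes written (count * 16). */
size_t pack_std140_float_array(const float *src, size_t count,
                               unsigned char *dst, size_t dst_size)
{
    size_t needed = count * STD140_ARRAY_STRIDE;
    assert(src != NULL && dst != NULL && dst_size >= needed);
    memset(dst, 0, needed);
    for (size_t i = 0; i < count; ++i)
        memcpy(dst + i * STD140_ARRAY_STRIDE, &src[i], sizeof(float));
    return needed;
}
```

Declaring the block member as an array of vec4 and indexing it as data[i / 4][i % 4] in the shader avoids the padding, so a tightly packed float array can be uploaded as-is.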

The principal advantages of this technique are speed and semantics. Speed is due to how implementations treat uniform buffers compared to textures. Texture fetches are global memory accesses. Uniform buffer accesses are generally not; the uniform buffer data is usually loaded into the shader when the shader is initialized upon its use in rendering. From there, it is a local access, which is much faster.

Semantically, this is better because it isn't just a flat array. For your specific needs, if all you need is a float[], that doesn't matter. But if you have a more complex data structure, the semantics can be important. For example, consider an array of lights. Lights have a position and a color. If you use a texture, your code to get the position and color for a particular light looks like this:

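// Assumes myDataArray is a sampler1D; texelFetch on a 1D texture also takes a mip level (0 here).
vec4 position = texelFetch(myDataArray, 2*index, 0);
vec4 color = texelFetch(myDataArray, 2*index + 1, 0);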

With uniform buffers, it looks just like any other uniform access. You have named members that can be called position and color. So all the semantic information is there; it's easier to understand what's going on.
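The difference also shows up on the CPU side. As a rough sketch, assuming a hypothetical GLSL struct with two vec4 members (position and color) inside a std140 block, the buffer contents can be described by a matching C struct, whereas the texture approach needs explicit index arithmetic. All names below are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* CPU-side mirror of a hypothetical GLSL struct
       struct Light { vec4 position; vec4 color; };
   Two vec4 members give a 32-byte element with no extra padding under
   std140 (assuming a 4-byte float). */
typedef struct Light {
    float position[4];
    float color[4];
} Light;

/* With the texture approach, the same data is addressed by texel index. */
int light_position_texel(int index)
{
    assert(index >= 0);
    return 2 * index;
}

int light_color_texel(int index)
{
    assert(index >= 0);
    return 2 * index + 1;
}
```

With the struct, an array of Light can be filled by member name and copied into the buffer object in one piece.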

There are size limitations for this as well. OpenGL requires that implementations provide at least 16,384 bytes for the maximum size of uniform blocks. This means that, for float data, you get only 4,096 values, and under std140, where each element of a float array is padded to 16 bytes, you reach that number only by packing four floats into each vec4. Note again that this is the minimum required from implementations; some hardware can offer much larger buffers. AMD provides 65,536 on their DX10-class hardware, for example.
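The arithmetic behind these element counts can be sketched as follows. The helper names are hypothetical; the block size passed in would come from querying GL_MAX_UNIFORM_BLOCK_SIZE at runtime. The sketch assumes the std140 layout, where both a float array element and a vec4 occupy 16 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define STD140_SLOT_BYTES 16 /* stride of a float[] element and size of a vec4 */

/* Values that fit when the block member is declared as float data[N]. */
size_t max_floats_as_float_array(size_t block_bytes)
{
    assert(block_bytes >= STD140_SLOT_BYTES);
    return block_bytes / STD140_SLOT_BYTES;
}

/* Values that fit when four floats are packed into each vec4 of vec4 data[N]. */
size_t max_floats_as_vec4_array(size_t block_bytes)
{
    assert(block_bytes >= STD140_SLOT_BYTES);
    return (block_bytes / STD140_SLOT_BYTES) * 4;
}

/* Where logical element i lives in the packed form: data[i / 4][i % 4]. */
size_t packed_vec4_index(size_t i) { return i / 4; }
size_t packed_component(size_t i)  { return i % 4; }
```

For the guaranteed 16,384-byte minimum this gives 1,024 values as a plain float array and 4,096 when packed into vec4s.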

Buffer textures

These are kind of a "super 1D texture". They effectively allow you to access a buffer object from a texture unit. Though they are one-dimensional, they are not 1D textures.

You can only use them from GL 3.1 or above, where they became core (earlier versions need the texture buffer object extension). And you can only access them via the texelFetch function.

The main advantage here is size. Buffer textures can generally be pretty gigantic. While the spec is generally conservative, mandating a maximum of at least 65,536 texels for buffer textures, most GL implementations allow them to range in the megabytes in size. Indeed, usually the maximum size is limited by the GPU memory available, not hardware limits.

Also, buffer textures are stored in buffer objects, not the more opaque texture objects like 1D textures. This means you can use some buffer object streaming techniques to update them.
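One streaming approach, offered here only as an assumption about what could fit this use case, is to allocate the buffer several times larger than one frame's data and write each frame into a different slice, so the CPU is not overwriting data the GPU may still be reading. The offset bookkeeping is sketched below with a hypothetical function name; the actual write (for example with glBufferSubData or glMapBufferRange at that offset) and any fence synchronization are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of the slice to write during `frame`, when one buffer object
   is split into `regions` equal slices of `region_bytes` each and the slices
   are reused in round-robin order. */
size_t stream_region_offset(unsigned long frame, size_t region_bytes,
                            unsigned regions)
{
    assert(regions > 0);
    return (size_t)(frame % regions) * region_bytes;
}
```

With three 16,384-byte slices, frames 0, 1, 2 write at offsets 0, 16,384, and 32,768, and frame 3 wraps back to 0.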

The main disadvantage here is performance, just like with 1D textures. Buffer textures probably won't be any slower than 1D textures, but they won't be as fast as UBOs either. If you're just pulling one float from them, it shouldn't be a concern. But if you're pulling lots of data from them, consider using a UBO instead.

Shader storage buffers

OpenGL 4.3 provides another way to handle this: shader storage buffers. They're a lot like uniform buffers; you specify them using syntax almost identical to that of uniform blocks. The principal difference is that you can write to them. Obviously that's not useful for your needs, but there are other differences.

Shader storage buffers are, conceptually speaking, an alternate form of buffer texture. Thus, the size limits for shader storage buffers are a lot larger than for uniform buffers. The OpenGL minimum for the max UBO size is 16KB. The OpenGL minimum for the max SSBO size is 16MB. So if you have the hardware, they're an interesting alternative to UBOs.

Just be sure to declare them as readonly, since you're not writing to them.

The potential disadvantage here is performance again, relative to UBOs. SSBOs work like an image load/store operation through buffer textures. Basically, it's (very nice) syntactic sugar around an imageBuffer image type. As such, reads from these will likely perform at the speed of reads from a readonly imageBuffer.

Whether reading via image load/store through buffer images is faster or slower than buffer textures is unclear at this point.

Another potential issue is that you must abide by the rules for non-synchronous memory access. These are complex and can very easily trip you up.
