Should I ever use a `vec3` inside of a uniform buffer or shader storage buffer object?

Problem description

The vec3 type is a very nice type. It only takes up 3 floats, and I have data that only needs 3 floats. And I want to use one in a structure in a UBO and/or SSBO:

layout(std140) uniform UBO
{
  vec4 data1;
  vec3 data2;
  float data3;
};

layout(std430) buffer SSBO
{
  vec4 data1;
  vec3 data2;
  float data3;
};

Then, in my C or C++ code, I can do this to create matching data structures:

struct UBO
{
  vector4 data1;
  vector3 data2;
  float data3;
};

struct SSBO
{
  vector4 data1;
  vector3 data2;
  float data3;
};

Is this a good idea?

Solution

NO! Never do this!

When declaring UBOs/SSBOs, pretend that all 3-element vector types don't exist. This includes column-major matrices with 3 rows and row-major matrices with 3 columns. Pretend that the only types are scalars and 2- and 4-element vectors (and matrices). You will save yourself a great deal of grief if you do so.

If you want the effect of a vec3 + a float, then you should pack it manually:

layout(std140) uniform UBO
{
  vec4 data1;
  vec4 data2and3;
};

Yes, you'll have to use data2and3.w to get the other value. Deal with it.
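On the C++ side, a struct matching this packed block could look like the following sketch. The names `Vec4`, `PackedUBO`, and `pack_vec3_float` are made up for illustration and do not come from any library; the checks assume a 4-byte `float`.

```cpp
#include <cstddef>

// Hypothetical 16-byte vector type mirroring a GLSL vec4 under std140.
struct alignas(16) Vec4 { float x, y, z, w; };

// Mirrors: layout(std140) uniform UBO { vec4 data1; vec4 data2and3; };
struct PackedUBO
{
    Vec4 data1;
    Vec4 data2and3;  // xyz = the former vec3 data2, w = the former float data3
};

// Compile-time layout checks (assumes a 4-byte float).
static_assert(sizeof(Vec4) == 16, "Vec4 must be 16 bytes");
static_assert(offsetof(PackedUBO, data2and3) == 16, "data2and3 must start at byte 16");
static_assert(sizeof(PackedUBO) == 32, "block must be 32 bytes");

// Packs a 3-component value and a scalar into one Vec4.
constexpr Vec4 pack_vec3_float(float x, float y, float z, float scalar)
{
    return Vec4{x, y, z, scalar};
}
```

Because every member is a 16-byte type, the C++ offsets and the std140 offsets agree without any per-member reasoning.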

If you want arrays of vec3s, then make them arrays of vec4s. Same goes for matrices that use 3-element vectors. Just banish the entire concept of 3-element vectors from your SSBOs/UBOs; you'll be much better off in the long run.
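As a rough illustration of the array case, the sketch below widens 3-float positions into 4-float elements on the CPU side. `Vec4`, `LightsUBO`, and `widen_vec3` are hypothetical names, the GLSL block in the comment is an assumed example, and the size check assumes a 4-byte `float`.

```cpp
#include <cstddef>

// Hypothetical 16-byte vector type mirroring a GLSL vec4.
struct alignas(16) Vec4 { float x, y, z, w; };

// Mirrors an assumed block: layout(std140) uniform Lights { vec4 positions[4]; };
struct LightsUBO
{
    Vec4 positions[4];  // array stride is 16 bytes on both sides
};

static_assert(sizeof(LightsUBO) == 64, "four 16-byte elements");

// Widens one tightly packed xyz triple into a Vec4; w receives a filler value.
constexpr Vec4 widen_vec3(const float (&xyz)[3], float w = 0.0f)
{
    return Vec4{xyz[0], xyz[1], xyz[2], w};
}
```

The fourth component is free storage, so it can carry an unrelated scalar instead of a filler if the shader needs one.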

There are two reasons why you should avoid vec3:

It won't do what C/C++ does

If you use std140 layout, then you will probably want to define data structures in C or C++ that match the definition in GLSL. That makes it easy to mix and match between the two. And std140 layout makes it at least possible to do this in most cases. But when it comes to vec3s, its layout rules don't match the usual layout rules of C and C++ compilers.

Consider the following C++ definitions for a vec3 type:

struct vec3a { float a[3]; };
struct vec3f { float x, y, z; };

Both of these are perfectly legitimate types. The sizeof and layout of these types will match the size and layout that std140 requires. But they do not match the alignment behavior that std140 imposes.

Consider this:

//GLSL
layout(std140) uniform Block
{
    vec3 a;
    vec3 b;
} block;

//C++
struct Block_a
{
    vec3a a;
    vec3a b;
};

struct Block_f
{
    vec3f a;
    vec3f b;
};

On most C++ compilers, sizeof for both Block_a and Block_f will be 24, which means that the offsetof b in each of them will be 12.

In std140 layout, however, a vec3 is always aligned to 4 words (16 bytes). Therefore, Block.b will have an offset of 16.
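The mismatch can be made visible with `offsetof`. This is a minimal sketch assuming a typical ABI with a 4-byte `float`; `kStd140OffsetOfB`, `kCppOffsetOfB`, and `layouts_match` are names invented for the example.

```cpp
#include <cstddef>

struct vec3f { float x, y, z; };

struct Block_f
{
    vec3f a;
    vec3f b;
};

// std140 places the second vec3 at byte 16 (vec3 has a 16-byte base alignment).
constexpr std::size_t kStd140OffsetOfB = 16;

// What the C++ compiler does; 12 on typical ABIs, since vec3f is only 4-byte aligned.
constexpr std::size_t kCppOffsetOfB = offsetof(Block_f, b);

// True only if a plain memcpy of Block_f would put b where the shader reads it.
constexpr bool layouts_match()
{
    return kCppOffsetOfB == kStd140OffsetOfB;
}
```

On such an ABI `layouts_match()` is false, so copying a `Block_f` straight into the buffer would make the shader read `b` from the wrong bytes.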

Now, you could try to fix that by using C++11's alignas functionality (or C11's similar _Alignas feature):

struct alignas(16) vec3a_16 { float a[3]; };
struct alignas(16) vec3f_16 { float x, y, z; };

struct Block_a
{
    vec3a_16 a;
    vec3a_16 b;
};

struct Block_f
{
    vec3f_16 a;
    vec3f_16 b;
};

If the compiler supports 16-byte alignment, this will work. Or at least, it will work in the case of Block_a and Block_f.
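A quick way to confirm that the aligned types line up is to assert on the offsets. The sketch assumes a 4-byte `float` and a compiler that honors `alignas(16)`.

```cpp
#include <cstddef>

struct alignas(16) vec3f_16 { float x, y, z; };

struct Block_f
{
    vec3f_16 a;
    vec3f_16 b;
};

// alignas(16) rounds the 12-byte payload up to 16 bytes.
static_assert(sizeof(vec3f_16) == 16, "padded to its alignment");
static_assert(alignof(vec3f_16) == 16, "16-byte aligned");

// b now sits at byte 16, the same offset std140 assigns to Block.b.
static_assert(offsetof(Block_f, b) == 16, "matches the std140 offset of b");
```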

But it won't work in this case:

//GLSL
layout(std140) uniform Block2
{
    vec3 a;
    float b;
} block2;

//C++
struct Block2_a
{
    vec3a_16 a;
    float b;
};

struct Block2_f
{
    vec3f_16 a;
    float b;
};

By the rules of std140, each vec3 must start on a 16-byte boundary. But vec3 does not consume 16 bytes of storage; it only consumes 12. And since float can start on a 4-byte boundary, a vec3 followed by a float will take up 16 bytes.

But the rules of C++ alignment don't allow such a thing. If a type is aligned to an X byte boundary, then its size is a multiple of X bytes. A 16-byte aligned vec3 type therefore occupies 16 bytes, and the float after it lands at offset 16 rather than 12.
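The failure is easy to reproduce. In this sketch (4-byte `float` assumed; `kStd140OffsetOfB` and `kCppOffsetOfB` are illustrative names) the padded type pushes the float four bytes too far.

```cpp
#include <cstddef>

struct alignas(16) vec3f_16 { float x, y, z; };

struct Block2_f
{
    vec3f_16 a;
    float b;
};

// std140 puts b directly after the 12 bytes of the vec3.
constexpr std::size_t kStd140OffsetOfB = 12;

// C++ must skip the tail padding of vec3f_16, so b starts at byte 16.
constexpr std::size_t kCppOffsetOfB = offsetof(Block2_f, b);

static_assert(kCppOffsetOfB != kStd140OffsetOfB, "the two layouts disagree on b");
```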

So matching std140's layout requires that you pick a type based on exactly where it is used. If the vec3 is followed by a float, you have to use vec3a; if it's followed by some type with more than 4-byte alignment, you have to use vec3a_16.
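To make the context dependence concrete, the sketch below uses the unpadded type where a float follows and the padded type where another vec3 follows. `Block2_tight` and `Block_padded` are hypothetical names, and putting `alignas(16)` on the enclosing struct is an assumption made here so that its first member still starts on a 16-byte boundary.

```cpp
#include <cstddef>

struct vec3f { float x, y, z; };                 // 12 bytes, 4-byte aligned
struct alignas(16) vec3f_16 { float x, y, z; };  // 16 bytes, 16-byte aligned

// vec3 followed by float: the unpadded type lets b share the vec3's 16-byte slot.
struct alignas(16) Block2_tight
{
    vec3f a;
    float b;
};

// vec3 followed by vec3: the padded type is needed to push b to byte 16.
struct Block_padded
{
    vec3f_16 a;
    vec3f_16 b;
};

static_assert(offsetof(Block2_tight, b) == 12, "matches std140 for vec3 + float");
static_assert(offsetof(Block_padded, b) == 16, "matches std140 for vec3 + vec3");
```

Two different C++ types now stand in for the same GLSL type, which is exactly the bookkeeping burden described above.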

Or you can just not use vec3s in your shaders and avoid all this added complexity.

Note that an alignas(8)-based vec2 will not have this problem. Nor will C/C++ structs and arrays using the proper alignment specifier (though arrays of smaller types have their own issues). This problem only occurs when using a naked vec3.
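The vec2 case works because its size equals its alignment, so there is no tail padding for a following member to fall into. A small sketch, with `vec2f` and the three block names invented for illustration and a 4-byte `float` assumed:

```cpp
#include <cstddef>

// 8 bytes of data, 8-byte aligned: size and alignment coincide.
struct alignas(8) vec2f { float x, y; };

struct Vec2ThenFloat { vec2f a; float b; };  // std140: a at 0, b at 8
struct Vec2ThenVec2  { vec2f a; vec2f b; };  // std140: a at 0, b at 8
struct FloatThenVec2 { float a; vec2f b; };  // std140: a at 0, b at 8

static_assert(sizeof(vec2f) == 8, "no tail padding");
static_assert(offsetof(Vec2ThenFloat, b) == 8, "float directly after the vec2");
static_assert(offsetof(Vec2ThenVec2, b) == 8, "second vec2 directly after the first");
static_assert(offsetof(FloatThenVec2, b) == 8, "vec2 rounded up to an 8-byte boundary");
```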

Implementation support is fuzzy

Even if you do everything right, implementations have been known to incorrectly implement vec3's oddball layout rules. Some implementations effectively impose C++ alignment rules on GLSL. So if you use a vec3, such an implementation treats it the way C++ would treat a 16-byte aligned type. On these implementations, a vec3 followed by a float will work like a vec4 followed by a float.

Yes, it's the implementers' fault. But since you can't fix the implementation, you have to work around it. And the most reasonable way to do that is to just avoid vec3 altogether.

Note that, for Vulkan (and OpenGL using SPIR-V), the SDK's GLSL compiler gets this right, so you don't need to worry about implementation bugs in that case.
