Memory allocation with std430 qualifier

Question

I'm using a compute shader with an SSBO bound to it, and I use the following structure in the compute shader:

struct Particle                 
{                       
    vec4 pAnds;                     
    vec3 velocity;                  
    float lifespan;             
    float age;                  
};

layout (std430, binding = 0) buffer members_in
{
    Particle particle[];
} input_data;

Yet the memory block allocated for each element of this structure does not seem to be (4 + 3 + 1 + 1) * 4 = 36 bytes. I also tried another one:

struct Particle                 
{                       
    vec4 pAnds;                     
    vec3 velocity;                  
    float lifespan;             
};

This time it worked fine. I am wondering how memory is laid out with the std430 qualifier. How can I make my first data structure work just like the second one?

Update: I changed it to the following form:

struct Particle
{                           
    float px, py, pz, s;
    float vx, vy, vz;
    float lifespan;
    float age;
};

This time it worked fine, but I still have no idea why using vec4 / vec3 causes a problem.

Recommended answer

From the std430 Layout Rules:

Structure alignment is the same as the alignment for the biggest structure member, where three-component vectors are not rounded up to the size of four-component vectors. Each structure will start on this alignment, and its size will be the space needed by its members, according to the previous rules, rounded up to a multiple of the structure alignment.

The alignment of vec4 is four times the size of float.

Source: OpenGL Programming Guide, 8th Edition
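
To see how these rules play out, here is a small Python sketch that walks a struct's members and computes their std430 offsets. It is an illustration only: it handles just float, vec2, vec3 and vec4 members, the function name `std430_layout` is made up for this example, and the alignment table reflects my reading of the OpenGL layout rules (a vec3 is aligned to 16 bytes but occupies only 12).

```python
# Base alignment and size in bytes for a few GLSL types under std430.
# Assumption for this sketch: only these scalar/vector types are supported.
STD430_TYPES = {
    "float": (4, 4),
    "vec2": (8, 8),
    "vec3": (16, 12),   # aligned like a vec4, but only 12 bytes long
    "vec4": (16, 16),
}

def round_up(value, multiple):
    """Round value up to the next multiple of `multiple`."""
    return (value + multiple - 1) // multiple * multiple

def std430_layout(members):
    """members: list of (name, glsl_type).

    Returns (offsets, struct_alignment, struct_size) in bytes.
    """
    offsets = {}
    cursor = 0
    struct_align = 1
    for name, glsl_type in members:
        align, size = STD430_TYPES[glsl_type]
        cursor = round_up(cursor, align)     # each member starts on its own alignment
        offsets[name] = cursor
        cursor += size
        struct_align = max(struct_align, align)
    # The struct size is rounded up to a multiple of the struct alignment.
    return offsets, struct_align, round_up(cursor, struct_align)

particle = [("pAnds", "vec4"), ("velocity", "vec3"),
            ("lifespan", "float"), ("age", "float")]
offsets, align, size = std430_layout(particle)
print(offsets, align, size)
# {'pAnds': 0, 'velocity': 16, 'lifespan': 28, 'age': 32} 16 48
```

The members end at byte 36, but each array element occupies 48 bytes, which is the number the host-side code has to match.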

First example:

struct Particle {
    vec4 pAnds;      // 4 * 4 bytes
    vec3 velocity;   // 3 * 4 bytes
    float lifespan;  // 1 * 4 bytes
    float age;       // 1 * 4 bytes
};

The biggest member of the structure is vec4 pAnds, which has 16-byte alignment. The structure's alignment is therefore also 16 bytes, meaning that inside an array each structure has to start at an offset that is a multiple of 16. The members occupy 36 bytes, so 12 bytes of padding are appended to the end of each structure, giving an array stride of 48 bytes.
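
On the host side, the data written into the buffer has to include that padding. A minimal sketch using Python's `struct` module, assuming little-endian 32-bit floats; the helper name `pack_particles` and the sample values are invented for illustration:

```python
import struct

# One Particle as std430 lays it out: vec4, vec3, float, float, then 12 pad bytes.
# "<" turns off Python's native alignment, so every byte is spelled out explicitly.
PARTICLE_STD430 = struct.Struct("<4f 3f f f 12x")

def pack_particles(particles):
    """particles: iterable of (pAnds, velocity, lifespan, age) tuples."""
    return b"".join(
        PARTICLE_STD430.pack(*p_and_s, *velocity, lifespan, age)
        for p_and_s, velocity, lifespan, age in particles
    )

data = pack_particles([
    ((0.0, 1.0, 2.0, 1.0), (0.1, 0.2, 0.3), 5.0, 0.0),
    ((3.0, 4.0, 5.0, 1.0), (0.0, 0.0, 0.0), 2.5, 1.0),
])
print(PARTICLE_STD430.size, len(data))  # 48 96
```

The resulting bytes have a 48-byte stride per particle, so the second particle starts at byte 48 rather than byte 36.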

Second example:

struct Particle {
    vec4 pAnds;      // 4 * 4 bytes
    vec3 velocity;   // 3 * 4 bytes
    float lifespan;  // 1 * 4 bytes
};

The structure has an alignment of 16 bytes, and its size (32 bytes) is exactly 2 times that alignment, so no padding is needed.

Third example:

struct Particle {
    float px, py, pz, s;
    float vx, vy, vz;
    float lifespan;
    float age;
};

The structure has no member larger than a single float, so its alignment is only 4 bytes and its 36 bytes need no padding.
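
The difference shows up as soon as the buffer holds more than one particle. The sketch below is a simulation in Python, not shader code: it writes three tightly packed particles (36 bytes each, as a host program without padding might) and then reads the `age` field back with two different strides. The helper `read_age` and the sample ages are invented for illustration.

```python
import struct

TIGHT = struct.Struct("<9f")   # 9 floats, 36 bytes, no padding
ages = [10.0, 20.0, 30.0]

# The host writes particles back to back; only the last float (age) is non-zero.
buffer = b"".join(TIGHT.pack(*([0.0] * 8), age) for age in ages)

def read_age(buf, index, stride, age_offset=32):
    """Read the age of particle `index`, assuming the given array stride."""
    return struct.unpack_from("<f", buf, index * stride + age_offset)[0]

# All-float struct: the 36-byte stride matches the host data.
print([read_age(buffer, i, TIGHT.size) for i in range(3)])  # [10.0, 20.0, 30.0]

# vec4/vec3 struct: a 48-byte stride misreads every element after the first.
print([read_age(buffer, i, 48) for i in range(2)])          # [10.0, 0.0]
```

This is why the all-float version appears to work with data that breaks the vec4 / vec3 version: the first element lines up in both cases, and only the stride differs.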

A workaround could be to insert an array of floats as explicit padding, or to pack your data tightly into a size that is a multiple of the structure's alignment.
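
As a rough sketch of the explicit-padding idea, a host-side mirror of the struct can carry the padding as a real field, so that nothing depends on hidden bytes. The example uses Python's `ctypes`; the `padding` field corresponds to a hypothetical `float padding[3];` member appended to the GLSL struct, which is not part of the original code.

```python
import ctypes

class Particle(ctypes.Structure):
    # Host-side mirror of the first struct with explicit padding.
    # The `padding` field is hypothetical: it assumes a matching
    # `float padding[3];` member was added to the GLSL struct.
    _fields_ = [
        ("pAnds", ctypes.c_float * 4),
        ("velocity", ctypes.c_float * 3),
        ("lifespan", ctypes.c_float),
        ("age", ctypes.c_float),
        ("padding", ctypes.c_float * 3),
    ]

STRUCT_ALIGN = 16  # alignment imposed by the vec4 / vec3 members

# Every field is a float, so ctypes adds no hidden padding of its own.
assert ctypes.sizeof(Particle) % STRUCT_ALIGN == 0

particles = (Particle * 4)()   # contiguous array of four zeroed particles
particles[1].age = 3.0
print(ctypes.sizeof(Particle), ctypes.sizeof(particles))  # 48 192
```

With the padding spelled out, the size is 48 bytes on both sides, and a size check like the assertion above catches a mismatch before any data is uploaded.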
