Memory Alignment in C/C++

Question

I was reading Game Coding Complete, 4th edition. There was a topic regarding memory alignment. In the code below, the author says that the first struct is really slow because it is neither bit-aligned nor byte-aligned. The second one is not bit-aligned but byte-aligned. The last one is fast because it is both. He says that without the pragma, the compiler will align the memory itself, which wastes memory. I couldn't really follow the calculations.

This is a part of the text:

If the compiler were left to optimize SlowStruct by adding unused bytes, each structure would be 24 bytes instead of just 14. Seven extra bytes are padded after the first char variable, and the remaining bytes are added at the end. This ensures that the entire structure always starts on an 8-byte boundary. That’s about 40 percent of wasted space, all due to a careless ordering of member variables.

This is the concluding line, in bold:

Don't let the compiler waste precious memory space. Put some of your brain cells to work and align your own member variables.

Please show me calculations and explain the padding concept more clearly.

Code:

#pragma pack(push, 1)
struct ReallySlowStruct
{
    char c : 6;
    __int64 d : 64;
    int b : 32;
    char a : 8;
};

struct SlowStruct
{
    char c;
    __int64 d;
    int b;
    char a;
};

struct FastStruct
{
   __int64 d;
   int b;
   char a;
   char c;
   char unused[2];
};
#pragma pack(pop)

Recommended answer

The examples given in the book depend heavily on the compiler and the computer architecture used. If you test them in your own program, you may get totally different results from the author's. I will assume a 64-bit architecture, because the author does as well, judging from the description. Let's look at the examples one by one:

ReallySlowStruct: If the compiler supports struct members that are not byte-aligned, "d" will start at the seventh bit of the first byte of the struct. That sounds very good for saving memory. The problem is that C does not allow bit addressing. So to save newValue to the "d" member, the compiler must do a whole lot of bit-shifting operations: store the first two bits of "newValue" in the two bits of byte 0 left over after "c" (a shift by 6 bits), then shift "newValue" by two bits and store the rest starting at byte 1. Byte 1 is a non-aligned memory location, which means the bulk memory transfer instructions won't work, and the compiler must store one byte at a time.
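The bit-shifting described above can be sketched by hand. This is only an illustration of the kind of work involved, not what any particular compiler emits: `PackedBits`, `store_d`, and `load_d` are hypothetical names, and the layout (low bits first, "d" placed directly after the 6 bits of "c") is an assumption, since bit-field layout is implementation-defined.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical raw storage: 6 bits of "c" + 64 bits of "d" = 70 bits,
// which needs 9 bytes. Assumed layout: "c" in the low 6 bits of byte 0.
struct PackedBits {
    std::uint8_t bytes[9];
};

// Store a 64-bit value starting at bit 6 of byte 0 (requires C++14).
constexpr PackedBits store_d(PackedBits p, std::uint64_t value) {
    // Byte 0: keep the low 6 bits ("c"), put the low 2 bits of value on top.
    p.bytes[0] = static_cast<std::uint8_t>((p.bytes[0] & 0x3F) | ((value & 0x03) << 6));
    value >>= 2;  // 62 bits left to store
    // Bytes 1..7 take the next 56 bits, one byte at a time.
    for (std::size_t i = 1; i <= 7; ++i) {
        p.bytes[i] = static_cast<std::uint8_t>(value & 0xFF);
        value >>= 8;
    }
    // Byte 8: the last 6 bits go into the low 6 bits; the top 2 bits are kept.
    p.bytes[8] = static_cast<std::uint8_t>((p.bytes[8] & 0xC0) | (value & 0x3F));
    return p;
}

// Read the 64-bit value back by reversing the steps above.
constexpr std::uint64_t load_d(const PackedBits& p) {
    std::uint64_t value = p.bytes[8] & 0x3F;
    for (std::size_t i = 7; i >= 1; --i) {
        value = (value << 8) | p.bytes[i];
    }
    return (value << 2) | (p.bytes[0] >> 6);
}
```

Every store touches nine bytes with masks and shifts, where an aligned 64-bit member needs a single store instruction.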

SlowStruct: It gets better. The compiler can get rid of all the bit-fiddling. But writing "d" may still require writing one byte at a time, because it is not aligned to the native word size. The native size on a 64-bit system is 8 bytes, so on hardware that does not support unaligned access, an 8-byte value at an address not divisible by 8 can only be accessed piecewise; where unaligned access is supported, it may still be slower. And worse, if I switch off packing, I will waste a lot of memory space: each member is padded so that the next one starts at a suitably aligned address. In this case, 7 bytes are inserted after char c so that "d" starts at offset 8, and 3 more bytes are added after char a so that the struct size is a multiple of 8: 24 bytes instead of 14.
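The 14-versus-24-byte figures quoted from the book can be reproduced with a small compile-time check. This is a sketch under assumptions: it uses `std::int64_t` in place of the MSVC-specific `__int64`, the names `SlowStructDefault`, `SlowStructPacked`, and `align_up` are made up for illustration, and the asserted numbers hold on a typical 64-bit ABI where a 64-bit integer is 8-byte aligned and `int` is 4 bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Round an offset up to the next multiple of an alignment.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// Same members as SlowStruct, with the compiler free to insert padding.
struct SlowStructDefault {
    char c;
    std::int64_t d;
    int b;
    char a;
};

// Same members again, packed to 1-byte alignment as in the book's listing.
#pragma pack(push, 1)
struct SlowStructPacked {
    char c;
    std::int64_t d;
    int b;
    char a;
};
#pragma pack(pop)

// Packed: members sit back to back, 1 + 8 + 4 + 1 = 14 bytes, "d" at offset 1.
static_assert(sizeof(SlowStructPacked) == 14, "packed size");
static_assert(offsetof(SlowStructPacked, d) == 1, "d is misaligned when packed");

// Default: "c" ends at offset 1, so "d" is pushed to the next multiple of 8.
static_assert(align_up(1, 8) == 8, "7 padding bytes after c");
static_assert(offsetof(SlowStructDefault, d) == 8, "d is 8-byte aligned");
static_assert(offsetof(SlowStructDefault, b) == 16, "b follows d directly");
static_assert(offsetof(SlowStructDefault, a) == 20, "a follows b directly");

// The members end at offset 21; the size is rounded up to a multiple of 8.
static_assert(align_up(21, 8) == 24, "3 padding bytes at the end");
static_assert(sizeof(SlowStructDefault) == 24, "default size");
```

If one of these assertions fails on your toolchain, that is the compiler and architecture dependence mentioned at the top of this answer, not an error in the arithmetic.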

FastStruct: This is aligned to the native word size of the target machine. "d" takes up 8 bytes, as it should, and "b" takes 4. Because the chars are all bundled in one place, the compiler does not pad them and does not waste space. The chars are only 1 byte each, so they need no padding. The complete structure adds up to an overall size of 16 bytes, which is divisible by 8, so no padding is needed.
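As a quick check of that layout, here is a sketch that declares the same members without any packing pragma and asserts the offsets. It assumes a typical 64-bit ABI and uses `std::int64_t` instead of `__int64`; the name `FastStructCheck` is made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Members ordered from largest to smallest, no #pragma pack needed.
struct FastStructCheck {
    std::int64_t d;   // offset 0, 8 bytes
    int b;            // offset 8, 4 bytes
    char a;           // offset 12
    char c;           // offset 13
    char unused[2];   // offsets 14..15, explicit filler up to 16 bytes
};

static_assert(offsetof(FastStructCheck, d) == 0, "d starts the struct");
static_assert(offsetof(FastStructCheck, b) == 8, "b is 4-byte aligned");
static_assert(offsetof(FastStructCheck, a) == 12, "a follows b directly");
static_assert(offsetof(FastStructCheck, c) == 13, "c follows a directly");

// 8 + 4 + 1 + 1 + 2 = 16: every member is naturally aligned and the
// compiler has no reason to insert hidden padding.
static_assert(sizeof(FastStructCheck) == 16, "no hidden padding");
static_assert(alignof(FastStructCheck) == 8, "struct is 8-byte aligned");
```

Because the size is the same with and without packing, ordering members this way gets the compact size without giving up aligned access.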
