如果32位整数溢出,我们可以使用40位结构而不是64位长的结构吗? [英] If a 32-bit integer overflows, can we use a 40-bit structure instead of a 64-bit long one?

查看:109
本文介绍了如果32位整数溢出,我们可以使用40位结构而不是64位长的结构吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

例如,如果32位整数溢出,而不是将int升级为long,如果我们只需要2 40以内的范围,我们可以使用40位类型吗?

sup>,以便我们为每个整数保存24(64-40)位?

如果是,怎么办?

我必须应付数十亿美元,而空间是一个更大的约束.

解决方案

是,但是...

这肯定是可能的,但这通常是荒谬的(对于任何不使用这些数字中的十亿的程序):

#include <stdint.h> // don't want to rely on something like long long
struct bad_idea
{
    uint64_t var : 40;
};

在这里,var确实具有40位的宽度,但会以 much 效率较低的代码为代价(事实证明,"much"是非常错误的-测量的开销是仅为1-2%,请参见下面的计时),通常无济于事.除非您需要将另一个24位值(或8位和16位值)打包成相同的结构,否则对齐将丧失您可能获得的任何东西.

无论如何,除非您有数十亿个内存消耗,否则有效的内存消耗差异将不会很明显(但是管理位字段所需的额外代码将非常明显!).

注意:
同时,该问题已更新,以反映确实需要数十亿个数字,因此这可能是可行的,前提是您要采取措施不因结构对齐而损失收益和填充,即通过在剩余的24位中存储其他内容或通过将40位的值存储在每个8或其倍数的结构中).
节省三个字节<十亿次是值得的,因为它将需要显着更少的内存页面,从而导致更少的缓存和TLB丢失,尤其是所有页面错误(单个页面错误权重数千万条指令). /p>

虽然以上代码段未使用剩余的24位(仅演示了使用40位"部分),但要使该方法在保留内存的意义上真正有用,则需要类似于以下内容的方法- -假设您确实还有其他有用"的数据需要解决:

struct using_gaps
{
    uint64_t var           : 40;
    uint64_t useful_uint16 : 16;
    uint64_t char_or_bool  : 8;  
};

结构大小和对齐方式将等于64位整数,因此如果您进行例如十亿个这样的结构的数组(即使不使用编译器专有的扩展).如果不使用8位值,则也可以使用48位和16位值(产生更大的溢出余量).
或者,您可以牺牲可用性来将8个40位值放入一个结构中(40和64的最小公倍数为320 = 8 * 40).当然,访问结构数组元素的代码会变得更复杂(尽管可能可以实现operator[]来恢复线性数组功能并隐藏结构复杂性). >

更新:
编写了一个快速测试套件,目的只是为了了解位域的开销(以及运算符对位域引用的重载).在 解决方案

Yes, but...

It is certainly possible, but it is usually nonsensical (for any program that doesn't use billions of these numbers):

#include <stdint.h> // don't want to rely on something like long long
struct bad_idea
{
    uint64_t var : 40;
};

Here, var will indeed have a width of 40 bits at the expense of much less efficient code generated (it turns out that "much" is very much wrong -- the measured overhead is a mere 1-2%, see timings below), and usually to no avail. Unless you have need for another 24-bit value (or an 8 and 16 bit value) which you wish to pack into the same structure, alignment will forfeit anything that you may gain.

In any case, unless you have billions of these, the effective difference in memory consumption will not be noticeable (but the extra code needed to manage the bit field will be noticeable!).

Note:
The question has in the mean time been updated to reflect that indeed billions of numbers are needed, so this may be a viable thing to do, presumed that you take measures not to lose the gains due to structure alignment and padding, i.e. either by storing something else in the remaining 24 bits or by storing your 40-bit values in structures of 8 each or multiples thereof).
Saving three bytes a billion times is worthwhile as it will require noticeably fewer memory pages and thus cause fewer cache and TLB misses, and above all page faults (a single page fault weighting tens of millions instructions).

While the above snippet does not make use of the remaining 24 bits (it merely demonstrates the "use 40 bits" part), something akin to the following will be necessary to really make the approach useful in a sense of preserving memory -- presumed that you indeed have other "useful" data to put in the holes:

struct using_gaps
{
    uint64_t var           : 40;
    uint64_t useful_uint16 : 16;
    uint64_t char_or_bool  : 8;  
};

Structure size and alignment will be equal to a 64 bit integer, so nothing is wasted if you make e.g. an array of a billion such structures (even without using compiler-specific extensions). If you don't have use for an 8-bit value, you could also use an 48-bit and a 16-bit value (giving a bigger overflow margin).
Alternatively you could, at the expense of usability, put 8 40-bit values into a structure (least common multiple of 40 and 64 being 320 = 8*40). Of course then your code which accesses elements in the array of structures will become much more complicated (though one could probably implement an operator[] that restores the linear array functionality and hides the structure complexity).

Update:
Wrote a quick test suite, just to see what overhead the bitfields (and operator overloading with bitfield refs) would have. Posted code (due to length) at
gcc.godbolt.org, test output from my Win7-64 machine is:

Running test for array size = 1048576
what       alloc   seq(w)  seq(r)  rand(w)  rand(r)  free
-----------------------------------------------------------
uint32_t    0      2       1       35       35       1
uint64_t    0      3       3       35       35       1
bad40_t     0      5       3       35       35       1
packed40_t  0      7       4       48       49       1


Running test for array size = 16777216
what        alloc  seq(w)  seq(r)  rand(w)  rand(r)  free
-----------------------------------------------------------
uint32_t    0      38      14      560      555      8
uint64_t    0      81      22      565      554      17
bad40_t     0      85      25      565      561      16
packed40_t  0      151     75      765      774      16


Running test for array size = 134217728
what        alloc  seq(w)  seq(r)  rand(w)  rand(r)  free
-----------------------------------------------------------
uint32_t    0      312     100     4480     4441     65
uint64_t    0      648     172     4482     4490     130
bad40_t     0      682     193     4573     4492     130
packed40_t  0      1164    552     6181     6176     130

What one can see is that the extra overhead of bitfields is neglegible, but the operator overloading with bitfield reference as a convenience thing is rather drastic (about 3x increase) when accessing data linearly in a cache-friendly manner. On the other hand, on random access it barely even matters.

These timings suggest that simply using 64-bit integers would be better since they are still faster overall than bitfields (despite touching more memory), but of course they do not take into account the cost of page faults with much bigger datasets. It might look very different once you run out of physical RAM (I didn't test that).

这篇关于如果32位整数溢出,我们可以使用40位结构而不是64位长的结构吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆