C++ int vs long long on a 64-bit machine


Question

My computer has a 64-bit processor, and when I check sizeof(int), sizeof(long), and sizeof(long long), it turns out that int and long are 32 bits and long long is 64 bits. I researched the reason, and it appears that the popular assumption that int in C++ matches the machine's word size is wrong. As I understand it, the compiler decides what the sizes are, and mine is Mingw-w64.

The reason for my research was to understand whether using types smaller than the word size (for instance, short instead of int) helps or hurts speed. On a 32-bit system, one popular opinion is that because the word size equals the size of int, a short will be converted to int, which costs additional bit shifts and so on, leading to worse performance. The opposing opinion is that there is a benefit at the cache level (I didn't go deep into it) and that using short is useful for saving virtual memory.

On top of this dilemma, I face another problem. My system is 64-bit, so whether I use int or short, the type is still smaller than the word size, and I start to wonder whether it would be more efficient to use the 64-bit long long, because that is the width the system is designed for. I also read that there is another constraint: the OS libraries (ILP64, LP64) that define the type sizes. In ILP64 the default int is 64 bits, in contrast to LP64; would it speed up my program if I used an OS with ILP64 support? Once I started asking which type I should use to speed up my C++ program, I ran into deeper topics in which I have no expertise, and some explanations seem to contradict each other. Can you please explain:

1) Is it best practice to use long long on x64 to achieve maximum performance, even for 1-4 byte data?

2) What is the trade-off in using a type smaller than the word size (memory win vs. additional operations)?

3) Can an x64 computer whose word and int size is 64 bits process a short using a 16-bit word size through so-called backward compatibility? Or must it put the 16-bit value into a 64-bit one, and is the fact that this can be done what defines the system as backward compatible?

4) Can we force the compiler to make int 64 bits?

5) How can ILP64 be incorporated into a PC that uses LP64?

6) What are the possible problems of using code adapted to the above issues with other compilers, OSes, and architectures (e.g. a 32-bit processor)?

Recommended answer

1) Is it best practice to use long long on x64 to achieve maximum performance, even for 1-4 byte data?

No, and it will probably make your performance worse. For example, if you use 64-bit integers where you could have gotten away with 32-bit integers, then you have just doubled the amount of data that must move between the processor and memory, and memory is orders of magnitude slower than the processor. All of your caches and memory buses will fill up twice as fast.

2) What is the trade-off in using a type smaller than the word size (memory win vs. additional operations)?

Generally, the dominant driver of performance on a modern machine is how much data needs to be stored in order to run a program. You will see significant performance cliffs once the working set size of your program exceeds the capacity of your registers, L1 cache, L2 cache, L3 cache, and RAM, in that order.

In addition, using a smaller data type can be a win if your compiler is smart enough to figure out how to use your processor's vector instructions (e.g. SSE instructions). A 128-bit vector register holds eight 16-bit short integers in the same space as two 64-bit long long integers, so you can do four times as many operations at once.

3) Can an x64 computer whose word and int size is 64 bits process a short using a 16-bit word size through so-called backward compatibility? Or must it put the 16-bit value into a 64-bit one, and is the fact that this can be done what defines the system as backward compatible?

I'm not sure what you're asking here. In general, 64-bit machines are capable of executing 32-bit and 16-bit executable files, because those earlier executables use a subset of the 64-bit machine's capabilities.

Hardware instruction sets are generally backward compatible, meaning that processor designers tend to add capabilities but rarely, if ever, remove them.

4) Can we force the compiler to make int 64 bits?

The header stdint.h (cstdint in C++), standard since C99 and C++11, declares fixed-width types such as int64_t and uint64_t. Rather than changing the size of int itself, use these types wherever you need a specific width.

5) How can ILP64 be incorporated into a PC that uses LP64?

https://software.intel.com/en-us/node/528682

6) What are the possible problems of using code adapted to the above issues with other compilers, OSes, and architectures (e.g. a 32-bit processor)?

Generally, compilers and systems are smart enough to figure out how to execute your code on any given system. However, 32-bit processors have to do extra work to operate on 64-bit data. In other words, as long as the code does not hard-code assumptions about type sizes, correctness should not be an issue, but performance will be.

But it's generally the case that if performance is really critical to you, you need to program for a specific architecture and platform anyway.

Clarification request: Thanks a lot! I would like to clarify question 1. You say it is bad for memory. Take a 32-bit int as an example. When it is sent to memory on a 64-bit system, does the desired integer 0xee ee ee ee become 0xee ee ee ee plus 32 other bits? How can the processor send 32 bits when the word size is 64 bits? The 32 bits are the desired value, but won't they be combined with 32 unused bits and sent that way? If my assumption is correct, then there is no difference for memory.

There are two things to discuss here.

First, the situation you describe does not occur. A processor does not need to "promote" a 32-bit value to a 64-bit value in order to use it. Modern processors have different access modes that handle data of different sizes appropriately.

For example, a 64-bit Intel processor has a 64-bit register named RAX. The same register can be used as a 32-bit register by referring to it as EAX, and likewise as a 16-bit (AX) or 8-bit (AH, AL) register. I stole a diagram that shows this:

x86_64 registers rax/eax/ax/al overlaying the full register contents

1122334455667788
================ rax (64 bits)
        ======== eax (32 bits)
            ====  ax (16 bits)
            ==    ah (8 bits)
              ==  al (8 bits)

Between the compiler and the assembler, the correct code is generated so that a 32-bit value is handled appropriately.

Second, when we're talking about memory overhead and performance, we should be more specific. Modern memory systems are composed of a disk, then main memory (RAM), and typically two or three caches (e.g. L3, L2, and L1). Data moves between the disk and main memory in units called pages, which are usually 4096 bytes (though they don't have to be). Data moves between main memory and the caches in units called cache lines, which are usually much larger than 32 or 64 bits. On my computer the cache line size is 64 bytes. The processor is the only place where data is actually transferred and addressed at the word level and below.

So if you want to change one 64-bit word in a file that resides on disk, then, on my computer, this actually requires loading 4096 bytes from the disk into memory, then 64 bytes from memory into the L3, L2, and L1 caches, and then the processor takes a single 64-bit word from the L1 cache.

The result is that the word size means nothing for memory bandwidth. However, a 64-byte cache line holds 16 32-bit integers but only 8 64-bit integers, and it could hold 32 16-bit values or 64 8-bit values. If your program uses a lot of different data values, you can significantly improve performance by using the smallest data type that can represent them.
