C++ int vs long long in 64 bit machine


Question

My computer has a 64-bit processor, and when I check sizeof(int), sizeof(long), and sizeof(long long), it turns out that int and long are 32 bits and long long is 64 bits. I researched the reason, and it appears that the popular assumption that int in C++ matches the machine's word size is wrong. As I understand it, the compiler defines the sizes, and mine is Mingw-w64. The reason for my research was to understand whether using types smaller than the word size is beneficial for speed (for instance, short vs int) or whether it has a negative effect. On a 32-bit system, one popular opinion is that, because the word size is the size of int, a short will be converted to int, which causes additional bit shifts and so on, and thus leads to worse performance. The opposing opinion is that there is a benefit at the cache level (I didn't go deep into it), and that using short is useful for economizing virtual memory. So, in addition to being confused by this dilemma, I face another problem. My system is 64-bit, so whether I use int or short, the type is still smaller than the word size, and I started wondering whether it would be more efficient to use the 64-bit long long, because that is the size the system is designed for. I also read that there is another constraint: the data model (ILP64, LP64) of the OS libraries, which defines the type sizes. In ILP64 the default int is 64 bits, in contrast to LP64. Would it speed up my program if I used an OS with ILP64 support? Once I started asking which type I should use to speed up my C++ program, I ran into deeper topics in which I have no expertise, and some explanations seem to contradict each other. Can you please explain:

1) Is it best practice to use long long on x64 to achieve maximum performance, even for 1-4 byte data?

2) What is the trade-off in using a type smaller than the word size (memory savings vs additional operations)?

3) Can an x64 computer, where the word and int sizes are 64 bits, process a short using a 16-bit word size through so-called backward compatibility? Or must it put the 16-bit file into a 64-bit file, and is it the fact that this can be done that defines the system as backward compatible?

4) Can we force the compiler to make int 64 bits?

5) How can ILP64 be incorporated into a PC that uses LP64?

6) What problems might arise when code adapted to the issues above is used with other compilers, OSes, and architectures (e.g. a 32-bit processor)?

Recommended answer

1) Is it best practice to use long long on x64 to achieve maximum performance, even for 1-4 byte data?

No, and in fact it will probably make your performance worse. For example, if you use 64-bit integers where you could have gotten away with 32-bit integers, you have just doubled the amount of data that must move between the processor and memory, and memory is orders of magnitude slower than the processor. All of your caches and memory buses will fill up twice as fast.
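The doubling is easy to check with a little arithmetic. The following is only a sketch: the helper name footprint_bytes and the element count kCount are made up for illustration, and the figures assume 8-bit bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes occupied by `count` contiguous elements of type T.
// (Hypothetical helper, introduced only for this illustration.)
template <typename T>
constexpr std::size_t footprint_bytes(std::size_t count) {
    return count * sizeof(T);
}

// Hypothetical element count, chosen only to make the numbers concrete.
constexpr std::size_t kCount = 1000000;

// The same number of values needs twice the memory (and twice the cache
// space and bus traffic) when stored as 64-bit instead of 32-bit integers.
static_assert(footprint_bytes<std::int64_t>(kCount) ==
                  2 * footprint_bytes<std::int32_t>(kCount),
              "64-bit elements take twice the space of 32-bit elements");
```

With one million elements this is roughly 4 MB versus 8 MB, which can be the difference between fitting in a cache level and spilling out of it, depending on the machine.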

2) What is the trade-off in using a type smaller than the word size (memory savings vs additional operations)?

Generally, the dominant driver of performance on a modern machine is how much data needs to be stored in order to run a program. You will see significant performance cliffs once the working set size of your program exceeds the capacity of your registers, L1 cache, L2 cache, L3 cache, and RAM, in that order.

In addition, using a smaller data type can be a win if your compiler is smart enough to figure out how to use your processor's vector instructions (such as SSE instructions). A 128-bit vector register can hold eight 16-bit short integers in the same space as two 64-bit long long integers, so you can do four times as many operations at once.
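As a rough sketch of that arithmetic, assuming a 128-bit (16-byte) SSE register: the names kVectorBytes, lanes, and add_arrays below are invented for this example, and whether the loop is actually vectorized depends on the compiler and its optimization flags.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: one SSE register is 128 bits, i.e. 16 bytes wide.
constexpr std::size_t kVectorBytes = 16;

// Number of elements of type T that fit in one vector register.
template <typename T>
constexpr std::size_t lanes() {
    return kVectorBytes / sizeof(T);
}

static_assert(lanes<std::int16_t>() == 4 * lanes<std::int64_t>(),
              "four times as many 16-bit lanes as 64-bit lanes");

// A simple element-wise loop of the kind an optimizing compiler may be
// able to auto-vectorize. With 16-bit elements, each vector instruction
// could then process eight values instead of two.
void add_arrays(const std::int16_t* a, const std::int16_t* b,
                std::int16_t* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<std::int16_t>(a[i] + b[i]);
    }
}
```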

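3) Can an x64 computer, where the word and int sizes are 64 bits, process a short using a 16-bit word size through so-called backward compatibility? Or must it put the 16-bit file into a 64-bit file, and is it the fact that this can be done that defines the system as backward compatible?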

I'm not sure what you're asking here. In general, 64-bit machines are capable of executing 32-bit and 16-bit executable files (provided the operating system supports it), because those earlier executables use a subset of the 64-bit machine's capabilities.

Hardware instruction sets are generally backward compatible, meaning that processor designers tend to add capabilities but rarely, if ever, remove them.

4) Can we force the compiler to make int 64 bits?

Fixed-width integer types are widely supported and let you work with data of a specific bit size. For example, the header stdint.h (<cstdint> in C++11 and later) declares types such as int64_t, uint64_t, etc.
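A minimal sketch of how these types might be used instead of trying to change int itself. The alias counter_t and the constant kBig are hypothetical names for this example; the static_assert lines only check what the fixed-width types promise.

```cpp
#include <climits>
#include <cstdint>

// The fixed-width types have the same size on every platform that
// provides them, unlike int and long.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is 32 bits");
static_assert(sizeof(std::int64_t) * CHAR_BIT == 64, "int64_t is 64 bits");

// int only has a guaranteed minimum width of 16 bits, so its width is
// deliberately not assumed here.
static_assert(sizeof(int) * CHAR_BIT >= 16, "int is at least 16 bits");

// Hypothetical alias: code that needs 64-bit values names the width
// explicitly instead of relying on the compiler's choice for int.
using counter_t = std::int64_t;

// A value that does not fit in 32 bits but is always safe in counter_t.
constexpr counter_t kBig = 5000000000LL;
```

If speed matters more than exact width, <cstdint> also offers types such as std::int_fast32_t, whose actual size is chosen by the implementation.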

5) How can ILP64 be incorporated into a PC that uses LP64?

https://software.intel.com/en-us/node/528682

6) What problems might arise when code adapted to the issues above is used with other compilers, OSes, and architectures (e.g. a 32-bit processor)?

Generally, compilers and systems are smart enough to figure out how to execute your code on any given system. However, 32-bit processors have to do extra work to operate on 64-bit data. In other words, correctness should not be an issue, but performance will be.

But it's generally the case that if performance is really critical to you, you need to program for a specific architecture and platform anyway.
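To give a feel for the "extra work" a 32-bit processor does on 64-bit data, here is a sketch that adds two 64-bit values using only 32-bit pieces and an explicit carry. The type U64Parts and the functions split, join, and add_parts are hypothetical names for this illustration; a real compiler would typically emit an add followed by an add-with-carry instruction rather than this exact sequence.

```cpp
#include <cstdint>

// A 64-bit value held as two 32-bit halves, as a 32-bit CPU would see it.
struct U64Parts {
    std::uint32_t lo;
    std::uint32_t hi;
};

constexpr U64Parts split(std::uint64_t v) {
    return U64Parts{static_cast<std::uint32_t>(v),
                    static_cast<std::uint32_t>(v >> 32)};
}

constexpr std::uint64_t join(U64Parts p) {
    return (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
}

// One 64-bit addition becomes two 32-bit additions plus carry handling.
// The low halves wrap modulo 2^32; if the wrapped sum is smaller than an
// operand, a carry occurred and is added into the high half.
constexpr U64Parts add_parts(U64Parts a, U64Parts b) {
    return U64Parts{
        static_cast<std::uint32_t>(a.lo + b.lo),
        static_cast<std::uint32_t>(
            a.hi + b.hi +
            (static_cast<std::uint32_t>(a.lo + b.lo) < a.lo ? 1u : 0u))};
}
```

The result is still correct, which matches the point above: the cost shows up as extra instructions, not as wrong answers.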

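Clarification request: Thanks a lot! I would like to clarify a few points: 1. You said it is bad for memory. Let's take a 32-bit int as an example. When it is sent to memory on a 64-bit system, for a desired integer 0xee ee ee ee, won't it become 0xee ee ee ee plus another 32 bits when we send it? How can the processor send 32 bits when the word size is 64 bits? The 32 bits are the desired value, but won't they be combined with 32 unused bits and sent that way? If my assumption is true, then there is no difference for memory.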

There are two things to discuss here.

First, the situation you describe does not occur. A processor does not need to "promote" a 32-bit value to a 64-bit value in order to use it correctly. This is because modern processors have different access modes that can deal with data of different sizes appropriately.

For example, a 64-bit Intel processor has a 64-bit register named RAX. However, the lower 32 bits of this same register can be used by referring to it as EAX, and it can even be used in 16-bit and 8-bit form (AX, AH, and AL). I stole a diagram from here:

x86_64 registers rax/eax/ax/al overwriting full register contents

1122334455667788
================ rax (64 bits)
        ======== eax (32 bits)
            ====  ax (16 bits)
            ==    ah (8 bits)
              ==  al (8 bits)

Between the compiler and the assembler, the correct code is generated so that a 32-bit value is handled appropriately.
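The layout in the diagram can be mimicked with plain integer arithmetic. This is only a model: the functions eax, ax, ah, and al below are ordinary C++ helpers named after the register views, not actual register access, and kRax is simply the sample value from the diagram.

```cpp
#include <cstdint>

// The sample 64-bit value from the diagram above.
constexpr std::uint64_t kRax = 0x1122334455667788ULL;

// Each "view" is just a narrower slice of the same 64 bits.
constexpr std::uint32_t eax(std::uint64_t rax) {
    return static_cast<std::uint32_t>(rax);        // low 32 bits
}
constexpr std::uint16_t ax(std::uint64_t rax) {
    return static_cast<std::uint16_t>(rax);        // low 16 bits
}
constexpr std::uint8_t ah(std::uint64_t rax) {
    return static_cast<std::uint8_t>(rax >> 8);    // bits 8..15
}
constexpr std::uint8_t al(std::uint64_t rax) {
    return static_cast<std::uint8_t>(rax);         // bits 0..7
}

static_assert(eax(kRax) == 0x55667788u, "eax is the low half of rax");
static_assert(ax(kRax) == 0x7788u, "ax is the low quarter of rax");
static_assert(ah(kRax) == 0x77u, "ah is the high byte of ax");
static_assert(al(kRax) == 0x88u, "al is the low byte of ax");
```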

Second, when we talk about memory overhead and performance, we should be more specific. A modern memory system consists of a disk, then main memory (RAM), and typically two or three caches (e.g. L3, L2, and L1). The smallest unit of data moved between disk and main memory is called a page, and pages are usually 4096 bytes (though they don't have to be). Likewise, the smallest unit of data moved between main memory and the caches is called a cache line, which is usually much larger than 32 or 64 bits. On my computer the cache line size is 64 bytes. The processor is the only place where data is actually transferred and addressed at the word level and below.

So if you want to change one 64-bit word in a file that resides on disk, then, on my computer, this actually requires loading 4096 bytes from the disk into memory, then 64 bytes from memory into the L3, L2, and L1 caches, and only then does the processor take a single 64-bit word from the L1 cache.

The result is that the word size means nothing for memory bandwidth. However, you can fit 16 32-bit integers in the same space that holds 8 64-bit integers. You could even fit 32 16-bit values or 64 8-bit values in that space. If your program uses a lot of different data values, you can significantly improve performance by using the smallest data type necessary.
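The counts above follow directly from the 64-byte cache line mentioned earlier. The sketch below assumes that line size (it differs between machines) and a contiguous, line-aligned array; the names kCacheLineBytes, values_per_line, and lines_needed are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, as on the machine described above.
constexpr std::size_t kCacheLineBytes = 64;

// How many values of type T fit in one cache line.
template <typename T>
constexpr std::size_t values_per_line() {
    return kCacheLineBytes / sizeof(T);
}

// How many cache lines a contiguous, line-aligned array of `count`
// elements touches (rounded up to whole lines).
template <typename T>
constexpr std::size_t lines_needed(std::size_t count) {
    return (count * sizeof(T) + kCacheLineBytes - 1) / kCacheLineBytes;
}

static_assert(values_per_line<std::int64_t>() == 8, "8 x 64-bit per line");
static_assert(values_per_line<std::int32_t>() == 16, "16 x 32-bit per line");
```

For 1000 values, this gives 125 cache lines with 64-bit elements but only 16 with 8-bit elements, so a full pass over the data moves far fewer lines through the memory hierarchy.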
