为什么使用带有 int64_t 操作数的 mod 会使这个函数慢 150%? [英] Why does using mod with int64_t operand makes this function 150% slower?
问题描述
max_rem
函数计算 (a+1)^n + (a-1)^n
除以 a²
留下的最大余数n = 1, 2, 3...
的代码>.main
对从 3
到 999
的每个 a
调用 max_rem
.完整代码:
#include #include int max_rem(int a) {int max_r = 0;int m = a * a;//<-------- 违规行int r1 = a+1, r2 = a-1;for(int n = 1; n <= a*a; n++) {r1 = (r1 * (a + 1)) % m;r2 = (r2 * (a - 1)) % m;int r = (r1 + r2) % m;如果(max_r
如果我将第 6 行更改为:
int m = a * a;
到
int64_t m = a * a;
整个计算速度大约慢了 150%.我尝试了 gcc 5.3
和 clang 3.6
.
使用int
:
$ gcc -std=c99 -O3 -Wall -o 120 120.c$时间(./120)真正的 0m3.823s用户 0m3.816s系统 0m0.000s
使用 int64_t
:
$ time(./120)真正的 0m9.861s用户 0m9.836s系统 0m0.000s
是的,我使用的是 64 位系统.为什么会发生这种情况?
我一直认为使用 int64_t
更安全、更便携,并且是编写 C 的现代方法"®,并且不会损害 64 位系统上数字代码的性能.这个假设是错误的吗?
编辑:要明确一点:即使您将每个变量更改为int64_t
,减速仍然存在.所以这不是混合 int
和 int64_t
的问题.
我一直认为使用 int64_t 更安全、更便携,并且是编写 C 的现代方法"®,并且不会损害 64 位系统上数字代码的性能.这个假设是错误的吗?
在我看来是这样.您可以在 英特尔的软件优化参考手册(附录 C,第 645 页上的表 C-17 通用说明):
<前>IDIV r64 吞吐量 每条指令 85-100 个周期IDIV r32 吞吐量 每条指令 20-26 个周期The max_rem
function computes the maximum remainder that (a+1)^n + (a-1)^n
leaves when divided by a²
for n = 1, 2, 3...
. The main
calls max_rem
on every a
from 3
to 999
. Complete code:
#include <inttypes.h>
#include <stdio.h>
int max_rem(int a) {
int max_r = 0;
int m = a * a; // <-------- offending line
int r1 = a+1, r2 = a-1;
for(int n = 1; n <= a*a; n++) {
r1 = (r1 * (a + 1)) % m;
r2 = (r2 * (a - 1)) % m;
int r = (r1 + r2) % m;
if(max_r < r)
max_r = r;
}
return max_r;
}
int main() {
int64_t sum = 0;
for(int a = 3; a < 1000; a++)
sum += max_rem(a);
printf("%ld\n", sum);
}
If I change line 6 from:
int m = a * a;
to
int64_t m = a * a;
the whole computation becames about 150% slower. I tried both with gcc 5.3
and clang 3.6
.
With int
:
$ gcc -std=c99 -O3 -Wall -o 120 120.c
$ time(./120)
real 0m3.823s
user 0m3.816s
sys 0m0.000s
with int64_t
:
$ time(./120)
real 0m9.861s
user 0m9.836s
sys 0m0.000s
and yes, I'm on a 64-bit system. Why does this happen?
I've always assumed that using int64_t
is safer and more portable and "the modern way to write C"® and wouldn't harm performances on 64bits systems for numeric code. Is this assumption erroneous?
EDIT: just to be clear: the slowdown persists even if you change every variable to int64_t
. So this is not a problem with mixing int
and int64_t
.
I've always assumed that using int64_t is safer and more portable and "the modern way to write C"® and wouldn't harm performances on 64bits systems for numeric code. Is this assumption erroneous?
It seems so to me. You can find the instruction timings in Intel's Software Optimization Reference manual (appendix C, table C-17 General Purpose Instructions on page 645):
IDIV r64 Throughput 85-100 cycles per instruction IDIV r32 Throughput 20-26 cycles per instruction
这篇关于为什么使用带有 int64_t 操作数的 mod 会使这个函数慢 150%?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!