Is < faster than <=?
Question
I'm reading a book where the author says that if( a < 901 ) is faster than if( a <= 900 ).
The claim is not exactly about this simple example, but about slight performance changes in complex loop code. I suppose this has something to do with the generated machine code, if it's even true.
Recommended answer
No, it will not be faster on most architectures. You didn't specify, but on x86, all of the integral comparisons will typically be implemented in two machine instructions:
- A test or cmp instruction, which sets EFLAGS
- And a Jcc (jump) instruction, depending on the comparison type (and code layout):
  - jne - Jump if not equal --> ZF = 0
  - jz - Jump if zero (equal) --> ZF = 1
  - jg - Jump if greater --> ZF = 0 and SF = OF
  - (etc...)
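The flag conditions in the list above can be modelled in a few lines of C. This is only a sketch under stated assumptions: the Flags struct, cmp32, and the cond_* helpers are hypothetical names invented for illustration, and the model covers just the 32-bit signed comparison case. It is not how any compiler or CPU is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the EFLAGS bits that the Jcc conditions read. */
typedef struct {
    bool zf; /* zero flag: the result was zero */
    bool sf; /* sign flag: the result's top bit was set */
    bool of; /* overflow flag: signed overflow occurred */
    bool cf; /* carry flag: unsigned borrow occurred */
} Flags;

/* Model of `cmp a, b`: compute a - b and keep only the flags. */
Flags cmp32(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub; /* wraps modulo 2^32, like the hardware */
    Flags f;
    f.zf = (diff == 0);
    f.sf = (diff >> 31) != 0;
    f.cf = ua < ub;
    /* Signed overflow on subtraction: the operands differ in sign and
       the result's sign differs from the sign of a. */
    f.of = (((ua ^ ub) & (ua ^ diff)) >> 31) != 0;
    return f;
}

/* Each condition is a small boolean expression over the flag bits. */
bool cond_jg(Flags f)  { return !f.zf && (f.sf == f.of); }
bool cond_jge(Flags f) { return f.sf == f.of; }
bool cond_jl(Flags f)  { return f.sf != f.of; }
bool cond_jle(Flags f) { return f.zf || (f.sf != f.of); }

/* Small self-check; call it from main() to exercise the model. */
void check_flag_model(void)
{
    assert(cond_jl(cmp32(900, 901)) && cond_jle(cmp32(900, 901)));
    assert(!cond_jl(cmp32(900, 900)) && cond_jle(cmp32(900, 900)));
    assert(cond_jg(cmp32(901, 900)) && cond_jge(cmp32(901, 900)));
}
```

In this model, the strict and non-strict conditions differ only by one extra term on ZF.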
Example (edited for brevity), compiled with $ gcc -m32 -S -masm=intel test.c

    if (a < b) {
        // Do something 1
    }

compiles to:

    mov     eax, DWORD PTR [esp+24]      ; a
    cmp     eax, DWORD PTR [esp+28]      ; b
    jge     .L2                          ; jump if a is >= b
    ; Do something 1
    .L2:

and

    if (a <= b) {
        // Do something 2
    }

compiles to:

    mov     eax, DWORD PTR [esp+24]      ; a
    cmp     eax, DWORD PTR [esp+28]      ; b
    jg      .L5                          ; jump if a is > b
    ; Do something 2
    .L5:
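As a side note on the constants in the question: for an integer a, a < 901 and a <= 900 describe exactly the same set of values, so a compiler is free to emit either form regardless of which one was written. The sketch below checks that equivalence over a range; the function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>

/* Hypothetical helpers: the two spellings of the condition from the question. */
int less_than_901(int a) { return a < 901; }
int at_most_900(int a)   { return a <= 900; }

/* Counts the values in [lo, hi] on which the two predicates disagree.
   long long is used for the loop variable so that hi == INT_MAX cannot
   overflow the counter. */
long long count_mismatches(int lo, int hi)
{
    long long mismatches = 0;
    for (long long v = lo; v <= hi; ++v) {
        if (less_than_901((int)v) != at_most_900((int)v))
            ++mismatches;
    }
    return mismatches;
}

/* Boundary check around the only interesting values; call it from main(). */
void check_boundary(void)
{
    assert(less_than_901(900) && at_most_900(900));
    assert(!less_than_901(901) && !at_most_900(901));
}
```

Since the predicates agree on every integer, any speed difference between the two spellings would have to come from something other than the comparison operator.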
So the only difference between the two compiled listings is a jg versus a jge instruction. The two will take the same amount of time.

I'd like to address the comment that nothing indicates that the different jump instructions take the same amount of time. This one is a little tricky to answer, but here's what I can give: In the Intel Instruction Set Reference (http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html), they are all grouped together under one common instruction, Jcc (Jump if condition is met). The same grouping is made in the Optimization Reference Manual (http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html), in Appendix C, Latency and Throughput.

Latency: The number of clock cycles that are required for the execution core to complete the execution of all of the μops that form an instruction.
Throughput: The number of clock cycles required to wait before the issue ports are free to accept the same instruction again. For many instructions, the throughput of an instruction can be significantly less than its latency.
The values for Jcc are:

           Latency   Throughput
    Jcc    N/A       0.5
The footnote on Jcc reads:

7) Selection of conditional jump instructions should be based on the recommendation of Section 3.4.1, "Branch Prediction Optimization," to improve the predictability of branches. When branches are predicted successfully, the latency of jcc is effectively zero.

So, nothing in the Intel docs ever treats one Jcc instruction any differently from the others.

If one thinks about the actual circuitry used to implement the instructions, one can assume that there would be simple AND/OR gates on the different bits in EFLAGS to determine whether the conditions are met. There is then no reason that an instruction testing two bits should take any more or less time than one testing only one (ignoring gate propagation delay, which is much less than the clock period).

Edit: Floating point
This holds true for x87 floating point as well (pretty much the same code as above, but with double instead of int):

    fld     QWORD PTR [esp+32]
    fld     QWORD PTR [esp+40]
    fucomip st, st(1)              ; Compare ST(0) and ST(1), and set CF, PF, ZF in EFLAGS
    fstp    st(0)
    seta    al                     ; Set al if above (CF=0 and ZF=0).
    test    al, al
    je      .L2
    ; Do something 1
    .L2:

    fld     QWORD PTR [esp+32]
    fld     QWORD PTR [esp+40]
    fucomip st, st(1)              ; (same thing as above)
    fstp    st(0)
    setae   al                     ; Set al if above or equal (CF=0).
    test    al, al
    je      .L5
    ; Do something 2
    .L5:
    leave
    ret
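The flag logic in the floating-point listing can be sketched in C as well. The FpFlags struct and the function names below are hypothetical, and the sketch assumes that [esp+32] holds a and [esp+40] holds b (the listing does not label them). Under that assumption the second fld leaves b in ST(0), so the compare is of b against a.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical model of the three EFLAGS bits written by fucomi/fucomip. */
typedef struct {
    bool zf;
    bool pf;
    bool cf;
} FpFlags;

/* Model of `fucomi st0, sti`: compare st0 with sti and set ZF, PF, CF. */
FpFlags fucomi_model(double st0, double sti)
{
    FpFlags f;
    if (isnan(st0) || isnan(sti)) {
        f.zf = true;  f.pf = true;  f.cf = true;   /* unordered */
    } else if (st0 > sti) {
        f.zf = false; f.pf = false; f.cf = false;
    } else if (st0 < sti) {
        f.zf = false; f.pf = false; f.cf = true;
    } else {
        f.zf = true;  f.pf = false; f.cf = false;  /* equal */
    }
    return f;
}

/* seta: above (CF=0 and ZF=0); setae: above or equal (CF=0). */
bool cond_seta(FpFlags f)  { return !f.cf && !f.zf; }
bool cond_setae(FpFlags f) { return !f.cf; }

/* Assumption: b ends up in ST(0) and a in ST(1), so a < b is "b above a". */
bool less_than(double a, double b)     { return cond_seta(fucomi_model(b, a)); }
bool less_or_equal(double a, double b) { return cond_setae(fucomi_model(b, a)); }

/* Small self-check; call it from main() to exercise the model. */
void check_fp_model(void)
{
    assert(less_than(1.0, 2.0) && less_or_equal(1.0, 2.0));
    assert(!less_than(2.0, 2.0) && less_or_equal(2.0, 2.0));
}
```

As with the integer case, the two conditions differ only in whether ZF is also consulted, and an unordered compare (a NaN operand) sets CF so that both come out false.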