Bitwise operators, not vs xor use in branching


Problem description

After asking this SO question, I received a very interesting comment from @AndonM.Coleman that I had to verify.

Since your disassembled code is written for x86, it is worth pointing out that XOR will set/clear the Zero Flag whereas NOT will not (sometimes useful if you want to perform a bitwise operation without affecting jump conditions that rely on flags from previous operations). Now, considering you're not writing assembly directly, you really have no access to this flag in a meaningful way so I doubt this is the reason for favoring one over the other.

His comment made me curious whether the following code would produce the same assembly instructions for both tests:

#include <iostream>

int main()
{
    unsigned int val = 0;

    std::cout << "Enter a numeric value: ";
    std::cin >> val;

    if ( (val ^ ~0U) == 0)
    {
        std::cout << "Value inverted is zero" << std::endl;
    } else
    {
        std::cout << "Value inverted is not zero" << std::endl;
    }

    if ( (~val) == 0)
    {
        std::cout << "Value inverted is zero" << std::endl;
    } else
    {
        std::cout << "Value inverted is not zero" << std::endl;
    }

    return 0;
}

For the following two operations

if ( (val ^ ~0U) == 0 )

and

if ( (~val) == 0 )

The unoptimized build in Visual Studio 2010 gives the following disassembly:

    if ( (val ^ ~0U) == 0)
00AD1501  mov         eax,dword ptr [val]  
00AD1504  xor         eax,0FFFFFFFFh  
00AD1507  jne         main+86h (0AD1536h)  


    if ( (~val) == 0)
00AD1561  mov         eax,dword ptr [val]  
00AD1564  not         eax  
00AD1566  test        eax,eax  
00AD1568  jne         main+0E7h (0AD1597h)  

My question is about optimisation. Is it better to write

if ( (val ^ ~0U) == 0)

or

if ( (~val) == 0)

Solution

This depends on a lot of things, but mostly on what (if anything) you tell the compiler to optimize for.

If the compiler is set to optimize for size (smallest machine code), then sometimes it will use XOR in seemingly strange places. For instance, with the variable-length instruction encoding that x86 uses, a register can be set to 0 by XORing it with itself in fewer bytes of code than the equivalent MOV instruction would require.
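The size difference mentioned above can be made concrete with a small sketch. The opcode bytes below are the standard 32-bit x86 encodings of `xor eax, eax` and `mov eax, 0`; they do not appear in the original post, and the names `kXorEaxEax`, `kMovEaxZero`, and `zeroing_bytes_saved` are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical names; the byte values are the usual 32-bit x86 encodings.

// xor eax, eax  ->  31 C0 (opcode + ModRM byte)
constexpr std::array<std::uint8_t, 2> kXorEaxEax{0x31, 0xC0};

// mov eax, 0    ->  B8 00 00 00 00 (opcode + 32-bit immediate)
constexpr std::array<std::uint8_t, 5> kMovEaxZero{0xB8, 0x00, 0x00, 0x00, 0x00};

// Number of bytes saved by zeroing EAX with XOR instead of MOV.
constexpr std::size_t zeroing_bytes_saved() {
    return kMovEaxZero.size() - kXorEaxEax.size();
}
```

Under these encodings the XOR form is 2 bytes against 5 for the MOV form, which is why a size-optimizing compiler tends to prefer it.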

Consider the code that uses XOR:

if ( (val ^ ~0U) == 0 )  /* 3 bytes to invert and test (x86) */

    XOR eax,0FFFFFFFFh requires 3 bytes AND sets/clears the Zero Flag (ZF)

Now, consider the code that uses NOT:

if ( (~val) == 0)        /* 4 bytes to invert and test (x86) */

    NOT eax is encoded as a 2-byte instruction, but does not affect CPU flags.

    TEST eax,eax adds another 2 bytes, and is necessary to set/clear the Zero Flag (ZF)

NOT is also a simple instruction, but since it does not affect any CPU flags, a TEST instruction must follow it before the result can be used for branching, as seen in your disassembly. This produces larger machine code, so a smart compiler set to optimize for size would probably try to avoid using NOT here. How many cycles each instruction sequence takes to complete varies between CPU generations, and a smart compiler would also factor this into its decision when told to optimize for speed.
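As a rough illustration of the byte counts discussed above, the two sequences can be written out as raw bytes. The opcode values are the standard x86 encodings and are not taken from the original post, although their lengths agree with the address gaps in the question's disassembly (3 bytes for the XOR, 2 + 2 bytes for NOT followed by TEST). All identifiers are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// xor eax, 0FFFFFFFFh -> 83 F0 FF (the immediate is stored as one sign-extended byte)
constexpr std::array<std::uint8_t, 3> kXorEaxAllOnes{0x83, 0xF0, 0xFF};

// not eax             -> F7 D0 (leaves the flags untouched)
constexpr std::array<std::uint8_t, 2> kNotEax{0xF7, 0xD0};

// test eax, eax       -> 85 C0 (sets or clears ZF for the following jne)
constexpr std::array<std::uint8_t, 2> kTestEaxEax{0x85, 0xC0};

// Bytes needed before the conditional jump when the XOR form is used.
constexpr std::size_t xor_sequence_bytes() {
    return kXorEaxAllOnes.size();
}

// Bytes needed before the conditional jump when NOT + TEST is used.
constexpr std::size_t not_sequence_bytes() {
    return kNotEax.size() + kTestEaxEax.size();
}
```

This only counts code size for the unoptimized sequences shown in the question; it says nothing about speed, and an optimizing compiler may emit something different for either spelling.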


If you are not writing hand-tuned assembly, it is best to use whatever is clearest to a human and trust the compiler to choose the instructions, scheduling, etc. that optimize for size or speed as requested at compile time. Compilers use a sophisticated set of heuristics to select and schedule instructions, and they know more about the target CPU architecture than the average coder does.

If you find out later that this branch really is a bottleneck and there is no higher-level way around the problem, then you could do some low-level tuning. However, this is a trivial thing to focus on these days unless you are targeting something like a low-power embedded CPU or a memory-limited device. The only places I have ever squeezed out enough performance by hand-tuning to make it worthwhile were algorithms that benefited from data parallelism, where the compiler was not smart enough to make effective use of specialized instruction sets like MMX/SSE.
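To illustrate why readability can be the deciding factor, here is a minimal sketch showing that the two spellings from the question, plus a third spelling that compares against the maximum value, all compute the same predicate for `unsigned int`. The function names are hypothetical, and the third variant is a suggestion of this sketch rather than something from the original question.

```cpp
#include <limits>

// Form from the question: XOR with all ones, then compare with zero.
constexpr bool all_bits_set_xor(unsigned int val) {
    return (val ^ ~0U) == 0;
}

// Form from the question: bitwise NOT, then compare with zero.
constexpr bool all_bits_set_not(unsigned int val) {
    return (~val) == 0;
}

// Assumed alternative: state the intent directly.
constexpr bool all_bits_set_max(unsigned int val) {
    return val == std::numeric_limits<unsigned int>::max();
}
```

Since all three are logically equivalent, the compiler is free to lower them to the same instructions, so choosing the one that reads most clearly costs nothing in principle.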
