检查数字是否为偶数 [英] Check if a number is even

查看:73
本文介绍了检查数字是否为偶数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在通过低级别黑客进行工作,并且希望为每个程序编写一个汇编程序.这是我检查数字是否为偶的内容:

I'm working my way through low level bit hacks, and would like to write an assembly program for each. Here is what I have for checking if a number is even or not:

is_even:
    # check if an integer is even. 
    # This is the same as seeing if its a multiple of two, i.e., & 1<<n - 1
    # rdi stores the number
    xor %eax, %eax
    test $0b1, %rdi
    setz %al
    ret

_start:
    mov $5, %rdi
    call is_even

是否有任何方法可以改善上述内容或使其更具可读性?是否可以用2条指令(而不是3条指令)进行 is_even 检查,因为第一个 xor 和第二个 setz 似乎可能被转换为一个可能的话.

Are there any ways to improve the above or make it more readable? Is it possible to do the is_even check with 2 instructions instead of 3, as the first xor and second setz seems like it might be converted into one if possible.

推荐答案

TL:DR:保证低位加1翻转低位,因此您可以使用 lea /.见下文.

TL:DR: adding 1 flips the low bit, guaranteed, so you can use lea/and. See below.

您选择编写一个返回布尔整数的完整函数,而不仅仅是创建FLAGS条件(大多数代码需要这样做: test $ 1,%dil ,然后完成;分支或cmov或setnz或setz或您实际上想要做的任何事情都基于偶数).

You chose to write a whole function that returns a boolean integer instead of just creating a FLAGS condition (which is all most code needs: test $1, %dil and you're done; branch or cmov or setnz or setz or whatever you actually want to do based one the value being even).

如果您要返回整数,则实际上不需要将条件放入FLAGS并退出,特别是如果您想要宽"字元的话.返回值.x86 setcc 仅写低字节是一种不方便的设计,大多数情况下,您要创建更宽的0/1整数时,需要额外的xor-zeroing指令.(我希望AMD64对设计进行整理,并将该操作码在64位模式下的含义更改为 setcc r/m32 ,但没有.)

If you are going to return an integer, then you don't actually need to get the condition into FLAGS and back out, especially if you want a "wide" return value. x86 setcc only writing the low byte is an inconvenient design that requires an extra xor-zeroing instruction most times you want to create a wider 0 / 1 integer. (I wish AMD64 had tidied up the design and changed the meaning of that opcode for 64-bit mode to setcc r/m32 but they didn't.)

您选择了函数的语义以偶数返回 1 ;这与低位的值相反.(即 return(〜x)& 1; ),您还选择使用x86-64 System V调用约定来创建函数,从而使调用约定产生开销,从而在与寄存器不同的寄存器中使用arg您通过了.

You chose the semantics of your function to return 1 for even; that's opposite of the value of the low bit. (i.e. return (~x)&1;) You also chose to make a function using the x86-64 System V calling convention, imposing overhead from the calling convention taking an arg in a different register than you passed in.

此功能显然太简单了,不值得调用/返回开销;在现实生活中,您只需内联并将其优化到调用方中即可.因此,作为独立功能对其进行优化通常是一个愚蠢的练习,除了获得0/1与原件分开存放,而不破坏原件.

This function is obviously way too trivial to be worth the call/return overhead; in real life you'd just inline and optimize this into the caller. So optimizing it as a stand-alone function is mostly a silly exercise, except for the idea of getting a 0/1 in a separate register from the original without destroying it.

如果我在 https://codegolf.stackexchange.com/上写答案,我会关注

If I was writing an answer on https://codegolf.stackexchange.com/, I'd follow this code-golf tip and choose my calling convention to pass an arg in EAX and return a boolean in AL (like gcc -m32 -mregparm=3 would). Or return a FLAGS condition in ZF. Or if allowed, choose my return semantics such that AL=0 meant even, AL=1 meant odd. Then

# gcc 32-bit regparm calling convention
is_even:          # input in RAX, bool return value in AL
    not   %eax             # 2 bytes
    and   $1, %al          # 2 bytes
    ret

# custom calling convention:
is_even:   # input in RDI
           # returns in ZF.  ZF=1 means even
    test  $1, %dil         # 4 bytes.  Would be 2 for AL, 3 for DL or CL (or BL)
    ret


2条指令而不会破坏输入

is_even:
    lea   1(%rdi), %eax          # flip the low bit
    and   $1, %eax               # and isolate
    ret

XOR是不带进位的加法运算.当进位为零(除ADC之外,这保证了低位),给定位的结果与XOR和加法运算相同.检查真值表/等于门的1位"半加法器(无进位):输出实际上只是XOR,进位输出只是AND.

XOR is add without carry. When the carry-in is zero (guaranteed for the low bit except with ADC), the result for a given bit is the same for XOR and addition. Check the truth table / gate-equivalent for a 1-bit "half adder" (no carry in): the "sum" output is literally just XOR, the carry output is just AND.

(与1的XOR会发生一点翻转,与NOT相同).

(XOR with 1 flips a bit, same as NOT.)

在这种情况下,我们不关心进位或任何高位(因为我们要对与& 1 相同的那些位进行核对),因此我们可以使用LEA作为复制和添加来翻转低位.

In this case, we don't care about the carry-out or any of the high bits (because we're about to nuke those bits with & 1 are the same operation), so we can use LEA as a copy-and-add to flip the low bit.

使用XOR而不是ADD或SUB对于SIMD很有用,在此之前, pxor 可以在比 paddb psubb 更高的端口上运行.Skylake.如果要对 pcmpgtb 或其他内容进行无符号范围转换,请添加 -128 ,但这与翻转每个字节的高位相同

Using XOR instead of ADD or SUB is useful for SIMD, where pxor can run on more ports than paddb or psubb on CPUs before Skylake. When you want to range-shift unsigned to signed for pcmpgtb or something, you want to add -128, but that's the same thing as just flipping the high bit of each byte.

您可以使用它来翻转更高的位,例如 lea 8(%rdi),%eax 将翻转 1<< 3 位的位置(并可能携带到所有更高的位中).我们知道该位的进位为零,因为 x + 0 不进位,并且 8 的低3位全为0.

You could use this to flip a higher bit, e.g. lea 8(%rdi), %eax would flip the 1<<3 bit position (and potentially carry into all higher bits). We know the carry-in to that bit will be zero because x + 0 doesn't carry, and the low 3 bits of 8 are all 0.

(此想法对于 https://catonmat.net/low-level-bit-hacks )

这篇关于检查数字是否为偶数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆