How can I multiply two 128-bit hex numbers in assembly?

Views: 321 | Published: 2020/5/13 19:19:10 | Tags: algorithm assembly byte x86-64 multiplication


Question

I have two 128-bit numbers in memory, shown here as hexadecimal bytes in little-endian order:

x:0x12 0x45 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
y:0x36 0xa1 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

I have to perform an unsigned multiplication of these two numbers, so my new number will be:

z:0xcc 0xe3 0x7e 0x2b 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

Now, I'm aware that I can move half of x and half of y into the rax and rbx registers, do the mul operation, and then do the same with the other halves. The problem is that by doing so I lose the carry, and I have no idea how to avoid that. I've been stuck on this problem for about 4 hours, and the only solution I can see is converting to binary (and <-> shl,1).

Can you give me some input on this problem?
I think the best solution is to take one byte at a time.

Recommended answer

As usual, ask a compiler how to do something efficiently: GNU C on 64-bit platforms supports __int128_t and __uint128_t.

__uint128_t mul128(__uint128_t a, __uint128_t b) { return a*b; }

compiles to (gcc 6.2 -O3 on Godbolt):

    imul    rsi, rdx        # tmp94, b
    mov     rax, rdi  # tmp93, a
    imul    rcx, rdi        # tmp95, a
    mul     rdx       # b
    add     rcx, rsi  # tmp96, tmp94
    add     rdx, rcx  #, tmp96
    ret

Since this targets the x86-64 System V calling convention, a is in RSI:RDI (high:low) and b is in RCX:RDX. The result is returned in RDX:RAX.

Pretty nifty that it only takes one MOV instruction: since the result is truncated to 128 bits, gcc doesn't need the high-half result of a_upper * b_lower or vice versa. Because the high halves of the inputs are each used only once, gcc can overwrite them with the faster 2-operand form of IMUL.
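The same strategy can be written out in C to make the role of each multiply visible. This is only a sketch under stated assumptions: the `u128_parts` struct and the `mul128_parts` name are hypothetical helpers introduced for illustration, and the code relies on GNU C's `__uint128_t` for the one widening 64x64 multiply.

```c
#include <stdint.h>

/* Hypothetical helper type: a 128-bit value split into two 64-bit halves. */
typedef struct { uint64_t lo, hi; } u128_parts;

/* Low 128 bits of a*b, following the same steps as the compiler output. */
static u128_parts mul128_parts(u128_parts a, u128_parts b)
{
    /* One widening 64x64 -> 128 multiply of the low halves (the MUL). */
    __uint128_t low = (__uint128_t)a.lo * b.lo;

    /* Cross products: only their low 64 bits can reach the result,
       so plain 64-bit multiplies suffice (the two IMULs). */
    uint64_t cross = a.hi * b.lo + a.lo * b.hi;

    u128_parts r;
    r.lo = (uint64_t)low;                  /* low half of a.lo * b.lo       */
    r.hi = (uint64_t)(low >> 64) + cross;  /* its high half plus the crosses */
    return r;
}
```

The carry the question worries about is exactly the high half of `a.lo * b.lo`: MUL leaves it in RDX, and it is added to the cross products to form the upper 64 bits. Anything carried out of bit 127 is discarded, which matches the wraparound behavior of `__uint128_t`.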

With -march=haswell to enable BMI2, gcc uses MULX to avoid even the one MOV.

Sometimes compiler output isn't perfect, but very often the general strategy is a good starting point for optimizing by hand.

Of course, if what you really wanted in the first place was 128-bit multiplies in C, just use the compiler's built-in support for it. That lets the optimizer do its job, often giving better results than if you'd written a couple of parts in inline asm (https://gcc.gnu.org/wiki/DontUseInlineAsm).
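Applied to the byte arrays from the question, the built-in type might be used roughly as follows. This is a minimal sketch, assuming a little-endian host such as x86-64 and a compiler that provides `__uint128_t`; the function name `mul128_bytes` is made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Multiply two 128-bit numbers stored as 16 little-endian bytes each and
   store the low 128 bits of the product in z.
   Assumption: the host is little-endian, so memcpy preserves byte order. */
static void mul128_bytes(const uint8_t x[16], const uint8_t y[16],
                         uint8_t z[16])
{
    __uint128_t a, b, product;

    memcpy(&a, x, sizeof a);   /* load without alignment or aliasing issues */
    memcpy(&b, y, sizeof b);

    product = a * b;           /* unsigned multiply, wraps modulo 2^128 */

    memcpy(z, &product, sizeof product);
}
```

With x and y from the question (0x4512 and 0xa136 as values), z comes out starting with the bytes 0xcc 0xe3 0x7e 0x2b, followed by zeros.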
