2个64位整数的SSE乘法 [英] SSE multiplication of 2 64-bit integers

查看:341
本文介绍了2个64位整数的SSE乘法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如何将两个64位整数乘以另外2个64位整数? 我没有找到任何可以做得到的指令.

How to multiply two 64-bit integers by another 2 64-bit integers? I didn't find any instruction which can do it.

推荐答案

最新答案,但这是Barabas发布的更好的版本.

Late answer, but this is a better version of what Barabas posted.

如果您曾经使用过GCC或Clang的矢量扩展名,这就是他们使用的例程.

If you have ever used GCC or Clang's vector extensions, this is the routine they use.

这与长乘法和网格乘法使用的方法相同.

This uses the same method that long multiplication and grid multiplication use.

    65
  * 73
  ----
    15 //   (5 * 3)
   180 //   (6 * 3) * 10
   350 //   (5 * 7) * 10
+ 4200 // + (6 * 7) * 100
------
  4745

但是,它没有使用10位的每个单位,而是使用32位的每个单位,并且由于最后总是将其移到第64位,所以它省略了最后的乘法,就像您不会乘以6 * 7一样如果您要截断大于99的值.

However, instead of doing each unit of 10, it uses each unit of 32 bits, and it leaves out the last multiply because it will always be shifted past the 64th bit, just like how you wouldn't multiply 6*7 if you were truncating values greater than 99.

#include <emmintrin.h>

/*
 * Grid/long multiply two 64-bit SSE lanes.
 * Works for both signed and unsigned.
 *   ----------------.--------------.----------------.
 *  |                |   b >> 32    | a & 0xFFFFFFFF |
 *  |----------------|--------------|----------------|  
 *  | d >> 32        |   b*d << 64  |    a*d << 32   |
 *  |----------------|--------------|----------------|
 *  | c & 0xFFFFFFFF |   b*c << 32  |       a*c      |
 *  '----------------'--------------'----------------'
 *  Add all of them together to get the product.
 *
 *  Because we truncate the value to 64 bits, b*d << 64 will be zero,
 *  so we can leave it out.
 *
 *  We also can add a*d and b*c first and then shift because of the
 *  distributive property: (a << 32) + (b << 32) == (a + b) << 32.
 */

__m128i Multiply64Bit(__m128i ab, __m128i cd)
{
    /* ac = (ab & 0xFFFFFFFF) * (cd & 0xFFFFFFFF); */
    __m128i ac = _mm_mul_epu32(ab, cd);

    /* b = ab >> 32; */
    __m128i b = _mm_srli_epi64(ab, 32);

    /* bc = b * (cd & 0xFFFFFFFF); */
    __m128i bc = _mm_mul_epu32(b, cd);

    /* d = cd >> 32; */
    __m128i d = _mm_srli_epi64(cd, 32);

    /* ad = (ab & 0xFFFFFFFF) * d; */
    __m128i ad = _mm_mul_epu32(ab, d);

    /* high = bc + ad; */
    __m128i high = _mm_add_epi64(bc, ad);

    /* high <<= 32; */
    high = _mm_slli_epi64(high, 32);

    /* return ac + high; */
    return _mm_add_epi64(high, ac);
}

编译器资源管理器注意:下面还提供了GCC矢量扩展版本以进行比较.

Compiler Explorer Note: The GCC vector extension version was also included below for comparison.

这篇关于2个64位整数的SSE乘法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆