Is it possible to use SSE (v2) to make a 128-bit wide integer?

Question

I'm trying to understand SSE2's capabilities a little better, and would like to know whether one can build a 128-bit wide integer that supports addition, subtraction, XOR and multiplication. Thanks, Erkling.

Recommended answer

SSE2 has no carry flag, but the carry is easy to compute as carry = sum < a (or, equivalently, carry = sum < b), where the comparison is unsigned. Worse, SSE2 has no 64-bit integer comparison either, so the comparison itself needs a workaround built from 32-bit compares.
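To make the two ideas above concrete, here is a minimal scalar sketch of the same logic, written without any SSE2 so each step can be checked in isolation. The function names are hypothetical and do not come from the answer. It assumes the usual two's-complement behaviour when converting an out-of-range uint32_t to int32_t, which is implementation-defined in C but holds on mainstream compilers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned 32-bit a < b using only a signed comparison: flipping the
   sign bit maps unsigned order onto signed order. */
static bool lt_u32_via_signed(uint32_t a, uint32_t b)
{
    return (int32_t)(a ^ 0x80000000u) < (int32_t)(b ^ 0x80000000u);
}

/* Unsigned 64-bit a < b assembled from 32-bit halves: the high halves
   decide unless they are equal, in which case the low halves decide. */
static bool lt_u64_via_halves(uint64_t a, uint64_t b)
{
    uint32_t a_hi = (uint32_t)(a >> 32), a_lo = (uint32_t)a;
    uint32_t b_hi = (uint32_t)(b >> 32), b_lo = (uint32_t)b;
    if (a_hi != b_hi)
        return lt_u32_via_signed(a_hi, b_hi);
    return lt_u32_via_signed(a_lo, b_lo);
}

/* Carry out of a 64-bit addition: the wrapped sum is smaller than
   either operand exactly when the addition overflowed. */
static bool carry_out_u64(uint64_t a, uint64_t b)
{
    return lt_u64_via_halves(a + b, a);
}
```

The SSE2 version has to express the same decision with lane-wise compares and shuffles, because it cannot branch per lane.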

Here is some untested, unoptimized C code based on the idea above.

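#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdbool.h>

/* Unsigned less-than on the low 64-bit lanes of a and b. */
static inline bool lessthan(__m128i a, __m128i b){
    /* Flip each 32-bit sign bit so signed compares order the values as unsigned. */
    a = _mm_xor_si128(a, _mm_set1_epi32(0x80000000));
    b = _mm_xor_si128(b, _mm_set1_epi32(0x80000000));
    __m128i t = _mm_cmplt_epi32(a, b);
    __m128i u = _mm_cmpgt_epi32(a, b);
    /* (lo < lo or hi < hi) and not (hi > hi) */
    __m128i z = _mm_or_si128(t, _mm_shuffle_epi32(t, 177));
    z = _mm_andnot_si128(_mm_shuffle_epi32(u, 245),z);
    return _mm_cvtsi128_si32(z) & 1;
}

/* 128-bit addition: each argument holds one integer, low half in lane 0. */
static inline __m128i addi128(__m128i a, __m128i b)
{
    __m128i sum = _mm_add_epi64(a, b);
    /* lessthan() handles the unsigned bias itself, so pass the raw values. */
    if (lessthan(sum, a))   /* low half wrapped: carry into the high half */
    {
        __m128i ONE = _mm_set_epi64x(1, 0);   /* high lane = 1, low lane = 0 */
        sum = _mm_add_epi64(sum, ONE);
    }

    return sum;
}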
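The question also asks about subtraction and XOR, which the answer does not cover. Under the same approach, a branch-free sketch might look like the following. The names cmplt_epu64, sub128 and xor128 are hypothetical helpers introduced here for illustration, and the code assumes the 128-bit value is stored with its low half in lane 0, as in the addition code. It is a sketch, not a tuned implementation.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Per 64-bit lane: all-ones if a < b (unsigned), else zero.
   Built from 32-bit signed compares after flipping the sign bits. */
static inline __m128i cmplt_epu64(__m128i a, __m128i b)
{
    const __m128i bias = _mm_set1_epi32(INT32_MIN);
    a = _mm_xor_si128(a, bias);
    b = _mm_xor_si128(b, bias);
    __m128i lt = _mm_cmplt_epi32(a, b);
    __m128i gt = _mm_cmpgt_epi32(a, b);
    /* Broadcast the high-dword and low-dword results across each lane. */
    __m128i lt_hi = _mm_shuffle_epi32(lt, _MM_SHUFFLE(3, 3, 1, 1));
    __m128i lt_lo = _mm_shuffle_epi32(lt, _MM_SHUFFLE(2, 2, 0, 0));
    __m128i gt_hi = _mm_shuffle_epi32(gt, _MM_SHUFFLE(3, 3, 1, 1));
    /* hi < hi, or (hi not greater and lo < lo) */
    return _mm_or_si128(lt_hi, _mm_andnot_si128(gt_hi, lt_lo));
}

/* 128-bit subtraction a - b: a borrow occurs when a_lo < b_lo. */
static inline __m128i sub128(__m128i a, __m128i b)
{
    __m128i diff = _mm_sub_epi64(a, b);
    __m128i borrow = cmplt_epu64(a, b);
    /* Move lane 0's mask (0 or -1) into lane 1 and clear lane 0;
       adding -1 to the high half subtracts the borrow. */
    borrow = _mm_slli_si128(borrow, 8);
    return _mm_add_epi64(diff, borrow);
}

/* XOR needs no carry handling at all. */
static inline __m128i xor128(__m128i a, __m128i b)
{
    return _mm_xor_si128(a, b);
}
```

Values can be built with _mm_set_epi64x(hi, lo) and read back with _mm_storeu_si128 into a two-element uint64_t array, where element 0 is the low half.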

As you can see, the code needs many more instructions, and even after optimization it will probably still be much longer than the simple two-instruction add/adc sequence on x86_64 (or four instructions on 32-bit x86).
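For comparison, the scalar version that the add/adc remark refers to can be sketched as below. The u128 struct and add_u128 function are hypothetical names chosen for this illustration. Optimizing compilers targeting x86_64 typically lower this pattern to an add followed by an adc, although that is a code-generation detail and not something the C language guarantees.

```c
#include <stdint.h>

/* Hypothetical 128-bit unsigned integer held as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128;

static inline u128 add_u128(u128 a, u128 b)
{
    u128 r;
    r.lo = a.lo + b.lo;                  /* wraps modulo 2^64 */
    r.hi = a.hi + b.hi + (r.lo < a.lo);  /* add the carry out of the low half */
    return r;
}
```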
