C中的多字加法 [英] Multiword addition in C

查看:218
本文介绍了C中的多字加法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个C程序,它使用gcc的 __ uint128_t 这非常棒,但现在我的需求已经超越了它。我有什么选择196或256位的快速算术?我需要的唯一操作是加法(并且我不需要进位位,也就是说,我将使用mod 2 ^ 192或2 ^ 256)。



速度是重要的,所以我不想在一切可能的情况下转向一般的多重精准。 (事实上​​,我的代码在某些地方确实使用了多精度,但是这是关键循环,并且会运行数百亿次,到目前为止,多精度运算只需要运行数万次。)



也许这很简单,可以直接编码,或者我需要找到一些合适的库。您的建议是什么,O StackOverflow?



澄清:GMP对我的需求来说太慢了。虽然我实际上在我的代码中使用多精度,但它不在内部循环中,运行时间少于10 ^ 5次。热循环运行更像10 ^ 12次。当我改变我的代码(增加一个大小参数),使得多精度部分比单精度运行更频繁时,我有100倍的减速(主要是由于内存管理问题,我认为,而不是额外的μops)。我希望将其降低至4倍放缓或更好。

解决方案

256位版本

  __ uint128_t a [2],b [2],c [2]; // c = a + b 
c [0] = a [0] + b [0];
c [1] = a [1] + b [1] +(c [0]

如果您在循环中多次使用它,您应该考虑通过SIMD和多线程

编辑:192位版本。通过这种方式,您可以消除像@ harold所述的128位比较:

  struct __uint192_t {
__uint128_t H;
__uint64_t L;
} a,b,c; // c = a + b
c.L = a.L + b.L;
c.H = a.H + b.H +(c.L


I have a C program which uses gcc's __uint128_t which is great, but now my needs have grown beyond it. What are my options for fast arithmetic with 196 or 256 bits? The only operation I need is addition (and I don't need the carry bit, i.e., I will be working mod 2^192 or 2^256).

Speed is important, so I don't want to move to a general multiprecision if at all possible. (In fact my code does use multiprecision in some places, but this is in the critical loop and will run tens of billions of times. So far the multiprecision needs to run only tens of thousands of times.)

Maybe this is simple enough to code directly, or maybe I need to find some appropriate library. What is your advice, O great StackOverflow?

Clarification: GMP is too slow for my needs. Although I actually use multiprecision in my code it's not in the inner loop and runs less than 10^5 times. The hot loop runs more like 10^12 times. When I changed my code (increasing a size parameter) so that the multiprecision part ran more often vs. the single-precision, I had a 100-fold slowdown (mostly due to memory management issues, I think, rather than extra µops). I'd like to get that down to a 4-fold slowdown or better.

解决方案

256-bit version

__uint128_t a[2], b[2], c[2];  // c = a + b
c[0] = a[0] + b[0];
c[1] = a[1] + b[1] + (c[0] < a[0]);

If you use it many times in a loop you should consider make it parallel by SIMD and multithreading

Edit: 192-bit version. This way you can eliminate the 128-bit comparison like what @harold's stated:

struct __uint192_t {
    __uint128_t H;
    __uint64_t L;
} a, b, c;  // c = a + b
c.L = a.L + b.L;
c.H = a.H + b.H + (c.L < a.L);

这篇关于C中的多字加法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆