Fastest way to get a positive modulo in C/C++

Question

Often in my inner loops I need to index an array in a "wrap-around" way, so that if the array size is 100 and my code asks for element -2, it should be given element 98. In many high level languages such as Python, one can do this simply with my_array[index % array_size], but for some reason C's integer arithmetic (usually) rounds toward zero instead of consistently rounding down, and consequently its modulo operator returns a negative result when given a negative first argument.
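To make the behaviour concrete, here is a minimal sketch of what the built-in operator does with a negative left operand. The function name raw_remainder is hypothetical and exists only for illustration.

```c
/* Since C99 and C++11, integer division truncates toward zero,
   so the result of % takes the sign of the left operand.
   Assumes n != 0. */
static int raw_remainder(int i, int n) {
    return i % n;   /* raw_remainder(-2, 100) is -2, not 98 */
}
```

Using raw_remainder(-2, 100) directly as an array index would therefore read before the start of the array instead of wrapping to element 98.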

Often I know that index will not be less than -array_size, and in these cases I just do my_array[(index + array_size) % array_size]. However, sometimes this can't be guaranteed, and for those cases I would like to know the fastest way to implement an always-positive modulo function. There are several "clever" ways to do it without branching, such as

inline int positive_modulo(int i, int n) {
    return (n + (i % n)) % n;
}

inline int positive_modulo(int i, int n) {
    return (i % n) + (n * (i < 0));
}

Of course I can profile these to find out which is the fastest on my system, but I can't help worrying that I might have missed a better one, or that what's fast on my machine might be slow on a different one.

So is there a standard way to do this, or some clever trick that I've missed that's likely to be the fastest possible way?

Also, I know it's probably wishful thinking, but if there's a way of doing this that can be auto-vectorised, that would be amazing.

Recommended answer

The standard way I learned is

inline int positive_modulo(int i, int n) {
    return (i % n + n) % n;
}

This function is essentially your first variant without the abs (which, in fact, makes it return the wrong result). I wouldn't be surprised if an optimizing compiler could recognize this pattern and compile it to machine code that computes an "unsigned modulo".
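As a rough illustration of how the double-modulo form fits the wrap-around indexing use case from the question, here is a small sketch. The names wrap_index and get_wrapped are hypothetical, and the code assumes n is positive and no larger than INT_MAX / 2 so that the intermediate sum cannot overflow.

```c
/* Double-modulo form: the inner % yields a value in (-n, n),
   adding n shifts it into (0, 2n), and the outer % folds it into [0, n).
   Assumes 0 < n <= INT_MAX / 2. */
static int wrap_index(int i, int n) {
    return (i % n + n) % n;
}

/* Hypothetical helper: read element `index` of `array`, wrapping around
   in both directions. Assumes array_size > 0. */
static int get_wrapped(const int *array, int array_size, int index) {
    return array[wrap_index(index, array_size)];
}
```

With a 100-element array, get_wrapped(array, 100, -2) reads element 98, which is the behaviour the question asks for.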

Edit:

Moving on to your second variant: First of all, it contains a bug, too -- the n < 0 should be i < 0.

这变种可能不会看起来好像树枝,但在很多架构中, I - 的; 0 将编译成一个条件跳转。在任何情况下,这将是至少快更换(N *(I&小于0)) I&LT; 0? N:0 ,避免了乘法;此外,它的清洁剂,因为它避免了reinter preting的布尔为int。

This variant may not look as if it branches, but on a lot of architectures, the i < 0 will compile into a conditional jump. In any case, it will be at least as fast to replace (n * (i < 0)) with (i < 0 ? n : 0), which avoids the multiplication; in addition, it's "cleaner" because it avoids reinterpreting the bool as an int.
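One possible way to write the conditional form is sketched below. Note that it deliberately tests the remainder rather than i: testing i < 0 would add n even when i is a negative multiple of n, giving n instead of 0. The name positive_modulo_select is hypothetical, and n is assumed to be positive.

```c
/* Conditional form with a single % operation.
   The sign test is applied to the remainder r, not to i, so that
   negative multiples of n map to 0 rather than n.
   Assumes n > 0. */
static int positive_modulo_select(int i, int n) {
    int r = i % n;              /* r is in (-n, n) */
    return r < 0 ? r + n : r;   /* shift negative remainders into [0, n) */
}
```

Whether a given compiler turns the ternary into a conditional move or a branch depends on the target and optimization settings, so this is worth timing alongside the double-modulo form rather than assuming either is faster.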

As to which of these two variants is faster, that probably depends on the compiler and processor architecture -- time the two variants and see. I don't think there's a faster way than either of these two variants, though.
