Poor _rotl performance under MinGW


Problem description



I've got a program whose performance depends heavily on the rotate-left instruction.

Under MSVC, it works fairly well: just use the _rotl() intrinsic as the rotate-left primitive.

Under GCC for Linux, it also works well. There it is enough to define the equivalent software construction rotl32(x,r) = ((x << r) | (x >> (32 - r))). The compiler is clever enough to recognize this as a 32-bit rotate left and automatically replaces it with the corresponding rotate instruction (to be fair, MSVC is also able to detect this pattern).
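For reference, here is a minimal sketch of that software construction as a C function. The name rotl32 comes from the question; masking the shift count is my own addition, not part of the original macro, and is there because the plain expression shifts by 32 when r is 0, which is undefined behavior in C. Compilers that recognize the rotate idiom generally recognize this masked form too, but that is an assumption worth checking in the generated assembly for your toolchain. The rotl32_selfcheck helper is hypothetical and only illustrates usage.

```c
#include <assert.h>
#include <stdint.h>

/* Portable 32-bit rotate left.
 * The count is reduced modulo 32 so that r == 0 (or r >= 32)
 * never produces a shift by 32 bits, which C leaves undefined. */
static inline uint32_t rotl32(uint32_t x, unsigned r)
{
    r &= 31u;
    return (x << r) | (x >> ((32u - r) & 31u));
}

/* Hypothetical helper: a few sanity checks on known values. */
static inline void rotl32_selfcheck(void)
{
    assert(rotl32(0x80000001u, 1) == 0x00000003u);
    assert(rotl32(0x12345678u, 8) == 0x34567812u);
    assert(rotl32(0xDEADBEEFu, 0) == 0xDEADBEEFu);
}
```

Whether a given compiler turns this into a single rotate instruction can be checked by compiling with -O2 -S and looking for rol in the output.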

Under MinGW, not so much. This is all the more intriguing because MinGW is, at its core, GCC. MinGW can compile the Windows intrinsic _rotl, but apparently without emitting the corresponding rotate instruction. The software version does not seem to be detected either, although to be fair it is still faster than _rotl. The end result is a 10x reduction in performance, so it is definitely significant.

Note: the GCC version of the tested MinGW is 4.6.2.

Solution

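In case you're stuck with the slow intrinsic on Windows, here's a way to do the rotate with inline assembler on x86:

#include <stdint.h>

/* Rotate x left by r bits with the x86 ROL instruction; the count must be in CL. */
uint32_t rotl32_2(uint32_t x, uint8_t r) {
  __asm__("roll %1,%0" : "+r" (x) : "c" (r) : "cc");
  return x;
}

Tested with GCC on Ubuntu, but it should work equally well under MinGW.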
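To tie the three cases from the question together, one possible arrangement is a single wrapper that picks an implementation at compile time. This is only a sketch: the name rotl32_select and its self-check helper are hypothetical, and the choice of branches (the _rotl intrinsic under MSVC, the inline assembler above for GCC-compatible compilers targeting x86, and the portable shift-and-or form elsewhere) is my assumption about how one might combine them, not something the answer prescribes.

```c
#include <assert.h>
#include <stdint.h>
#if defined(_MSC_VER)
#include <stdlib.h>   /* declares _rotl under MSVC */
#endif

/* Hypothetical wrapper: select a rotate-left implementation per toolchain. */
static inline uint32_t rotl32_select(uint32_t x, uint8_t r)
{
#if defined(_MSC_VER)
    /* MSVC: use the intrinsic directly. */
    return _rotl(x, r);
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* GCC-compatible compilers on x86 (including MinGW): force ROL.
     * The "c" constraint places the count in CL, as ROL requires. */
    __asm__("roll %1,%0" : "+r" (x) : "c" (r) : "cc");
    return x;
#else
    /* Anything else: portable form with the count masked to 0..31. */
    unsigned n = r & 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
#endif
}

/* Hypothetical helper: the result should be identical on every branch. */
static inline void rotl32_select_selfcheck(void)
{
    assert(rotl32_select(0x80000001u, 1) == 0x00000003u);
    assert(rotl32_select(0x12345678u, 8) == 0x34567812u);
}
```

The inline assembler branch hides the operation from the optimizer, so it cannot be constant-folded; that trade-off only makes sense where the compiler fails to generate the rotate on its own, as reported for MinGW above.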
