最快的办法,找出3个数字的最低? [英] Fastest way to find out minimum of 3 numbers?

查看:274
本文介绍了最快的办法,找出3个数字的最低?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在一个程序我写的20%的时间被花费在一个内部循环找出最小3个数字的,在这个例程:

In a program I wrote, 20% of the time is being spent on finding out the minimum of 3 numbers in an inner loop, in this routine:

static inline unsigned int
min(unsigned int a, unsigned int b, unsigned int c)
{
    unsigned int m = a;
    if (m > b) m = b;
    if (m > c) m = c;
    return m;
}

有什么办法加快这?我行与装配code也适用于x86 / x86_64的。

Is there any way to speed this up? I am ok with assembly code too for x86/x86_64.

编辑:在回答一些评论的:结果
*编译器使用是gcc 4.3.3

*至于组装而言,我只是一个初学者在那里。我问在这里集会,以了解如何做到这一点。 :)

*我有一个四核Intel 64的运行,所以MMX / SSE等的支持。

*很难在这里发表的循环,但我可以告诉你们,这是Levenshtein算法的高度优化的实现。

In reply to some of the comments:
* Compiler being used is gcc 4.3.3
* As far as assembly is concerned, I am only a beginner there. I asked for assembly here, to learn how to do this. :)
* I have a quad-core Intel 64 running, so MMX/SSE etc. are supported.
* It's hard to post the loop here, but I can tell you it's a heavily optimized implementation of the levenshtein algorithm.

这是编译器给我的非内嵌版分钟:

This is what the compiler is giving me for the non-inlined version of min:

.globl min
    .type   min, @function
min:
    pushl   %ebp
    movl    %esp, %ebp
    movl    8(%ebp), %edx
    movl    12(%ebp), %eax
    movl    16(%ebp), %ecx
    cmpl    %edx, %eax
    jbe .L2
    movl    %edx, %eax
.L2:
    cmpl    %ecx, %eax
    jbe .L3
    movl    %ecx, %eax
.L3:
    popl    %ebp
    ret
    .size   min, .-min
    .ident  "GCC: (Ubuntu 4.3.3-5ubuntu4) 4.3.3"
    .section    .note.GNU-stack,"",@progbits

的内联版本中-O2优化code(甚至是我的标志MRK = 0xfefefefe,前后通话后,MIN())越来越优化掉海湾合作委员会,所以我不能弄个。

The inlined version is within -O2 optimized code (even my markers mrk = 0xfefefefe, before and after the call to min()) are getting optimized away by gcc, so I couldn't get hold of it.

更新:我测试由尼尔斯建议的修改,ephemient,但有没有明显的性能提升,我得到通过分的程序集版本()。但是,我编译带有-march = i686的节目,我的猜测是因为整个程序越来越的新的更快的指令GCC与此选项产生的收益获得了12.5%的提升。感谢您的帮助球员。

Update: I tested the changes suggested by Nils, ephemient, however there's no perceptible performance boost I get by using the assembly versions of min(). However, I get a 12.5% boost by compiling the program with -march=i686, which I guess is because the whole program is getting the benefits of the new faster instructions that gcc is generating with this option. Thanks for your help guys.

P.S。 - 我用红宝石Profiler来衡量业绩(我的C计划是一个Ruby程序加载的共享库),所以只能由Ruby程序称为顶级C函数,它最终调用分钟,我能得到所花费的时间( )跌堆栈。请参阅本<一个href=\"http://stackoverflow.com/questions/2047211/how-do-you-profile-a-c-shared-library\">question.

P.S. - I used the ruby profiler to measure performance (my C program is a shared library loaded by a ruby program), so I could get time spent only for the top-level C function called by the ruby program, which ends up calling min() down the stack. Please see this question.

推荐答案

请确保您使用的是相应的 -march 设置,第一关。 GCC默认为不使用,不支持的i386的任何指示 - 允许它使用新的指令集可以在时间有很大的不同!在 -march = core2的-O2 我得到:

Make sure you are using an appropriate -march setting, first off. GCC defaults to not using any instructions that were not supported on the original i386 - allowing it to use newer instruction sets can make a BIG difference at times! On -march=core2 -O2 I get:

min:
    pushl   %ebp
    movl    %esp, %ebp
    movl    8(%ebp), %edx
    movl    12(%ebp), %ecx
    movl    16(%ebp), %eax
    cmpl    %edx, %ecx
    leave
    cmovbe  %ecx, %edx
    cmpl    %eax, %edx
    cmovbe  %edx, %eax
    ret

在这里使用CMOV可能会帮助您避免延误分支 - 你只需在 -march 传递得到它没有任何内联汇编。当内联到一个更大的功能,这是很可能是更有效的,可能只有四个装配作业。如果你需要的东西比这快,看看你是否能获得上证所的向量操作在整个算法的情况下工作。

The use of cmov here may help you avoid branch delays - and you get it without any inline asm just by passing in -march. When inlined into a larger function this is likely to be even more efficient, possibly just four assembly operations. If you need something faster than this, see if you can get the SSE vector operations to work in the context of your overall algorithm.

这篇关于最快的办法,找出3个数字的最低?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆