Avoiding Calls to floor()


Problem description

I am working on a piece of code where I need to deal with uvs (2D texture coordinates) that are not necessarily in the 0 to 1 range. For example, sometimes I get a uv whose u component is 1.2. To handle this, I am implementing a wrap that produces tiling by doing the following:

u -= floor(u)
v -= floor(v)

Doing this turns 1.2 into 0.2, which is the desired result. It also handles negative values: -0.4 becomes 0.6.
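For reference, the wrap described above can be written as a small helper. This is only a sketch of the baseline being optimized: the `UV` struct and the name `wrap_uv` are hypothetical and not from the original code.

```cpp
#include <cmath>

// Hypothetical 2D texture coordinate type, used for illustration only.
struct UV { float u, v; };

// Baseline wrap: subtract the floor so each component lands in the 0 to 1 range.
inline UV wrap_uv(UV uv)
{
    uv.u -= std::floor(uv.u);   // 1.2 -> roughly 0.2
    uv.v -= std::floor(uv.v);   // -0.4 -> roughly 0.6
    return uv;
}
```

The results are only approximately 0.2 and 0.6 because of float rounding.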

However, these calls to floor are rather slow. I profiled my application with Intel VTune, and a huge number of cycles go to this floor operation alone.

After some background reading on the issue, I came up with the following function, which is a bit faster but still leaves a lot to be desired (I am still incurring type conversion penalties, etc.).

int inline fasterfloor( const float x ) { return x > 0 ? (int) x : (int) x - 1; }
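Note that fasterfloor as written is off by one for zero and for exact negative integers: it returns -1 for 0.0f and -3 for -2.0f. A possible variant that truncates first and then corrects is sketched below. The name `floor_via_trunc` is hypothetical, and the sketch assumes the input fits in an int.

```cpp
// Hypothetical variant of fasterfloor; assumes |x| is small enough to fit in an int.
inline int floor_via_trunc(const float x)
{
    const int i = static_cast<int>(x);               // conversion truncates toward zero
    return i - (x < static_cast<float>(i) ? 1 : 0);  // step down only if truncation rounded up
}
```

It still pays for the float-to-int conversion, so it addresses correctness rather than the conversion cost.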

I have seen a few tricks done with inline assembly, but nothing that seems to work exactly right or gives any significant speed improvement.

Does anyone know how to handle this situation?

Recommended answer

So you want a really fast float->int conversion? AFAIK int->float conversion is fast, but on MSVC++ at least, a float->int conversion invokes a small helper function, ftol(), which does some complicated work to ensure a standards-compliant conversion. If you don't need such a strict conversion, you can do some assembly hackery, assuming you're on an x86-compatible CPU.

Here's a fast float-to-int function in MSVC++ inline assembly syntax (it should give you the right idea anyway). It rounds according to the FPU's current rounding mode, which is round-to-nearest by default, not strictly downward:

inline int ftoi_fast(float f)
{
    int i;

    __asm
    {
        fld f       // push f onto the x87 FPU stack
        fistp i     // store it to i as an integer and pop the stack
    }

    return i;
}

On 64-bit MSVC++ you'll need an external .asm file, since the 64-bit compiler rejects inline assembly. The function simply uses the raw x87 FPU instructions to load a float (fld) and then store it as an integer (fistp). (A word of warning: you can change the rounding mode used here by directly tweaking registers on the CPU, but don't do that; you'll break a lot of stuff, including MSVC's implementation of sin and cos!)
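If inline assembly is not available, a portable way to get a conversion that honors the current rounding mode is C++11's std::lrint. The sketch below is an assumption about what a portable stand-in could look like, not part of the original answer, and the name `ftoi_current_mode` is hypothetical. Whether the compiler turns it into a single instruction depends on the compiler and flags.

```cpp
#include <cmath>

// Hypothetical portable stand-in for the fld/fistp pair.
// std::lrint rounds using the current floating-point rounding mode,
// which is round-to-nearest (ties to even) unless the program changes it.
inline int ftoi_current_mode(float f)
{
    return static_cast<int>(std::lrint(f));
}
```

Under the default mode, 2.5f converts to 2 and 3.5f converts to 4, so this is a rounding conversion rather than a floor.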

If you can assume SSE support on the CPU (or there's an easy way to add an SSE codepath), you can also try:

#include <xmmintrin.h>   // SSE1 intrinsics

inline int ftoi_sse1(float f)
{
    return _mm_cvtt_ss2si(_mm_load_ss(&f));     // SSE1 float->int, truncating toward zero
}

...which is basically the same (load the float, then convert it to an integer) but uses SSE instructions, which are a bit faster. Unlike fistp, this variant always truncates toward zero.
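Because the truncating conversion is not a floor for negative inputs, applying it to the uv wrap from the question needs a correction step. The following is a rough sketch that wraps four coordinates at once. It assumes SSE2 is available and that every input has a magnitude below 2^31; the names `wrap01_sse2` and `wrap01_array4` are hypothetical.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: computes x - floor(x) for four floats at once.
// Assumes each lane fits in a 32-bit int so the conversion cannot overflow.
inline __m128 wrap01_sse2(__m128 x)
{
    // Truncate toward zero, then convert back to float.
    const __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(x));
    // Lanes where truncation rounded up (negative, non-integer inputs).
    const __m128 rounded_up = _mm_cmplt_ps(x, t);
    // Subtract 1 in exactly those lanes to get the floor.
    const __m128 fl = _mm_sub_ps(t, _mm_and_ps(rounded_up, _mm_set1_ps(1.0f)));
    return _mm_sub_ps(x, fl);
}

// Convenience wrapper for plain arrays (unaligned loads and stores).
inline void wrap01_array4(const float in[4], float out[4])
{
    _mm_storeu_ps(out, wrap01_sse2(_mm_loadu_ps(in)));
}
```

Processing several coordinates per call also spreads the conversion cost across them, though any actual gain would need to be confirmed with a profiler.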

One of those should cover the expensive float-to-int case, and any int-to-float conversions should still be cheap. Sorry to be Microsoft-specific here, but this is where I've done similar performance work, and I got big gains this way. If portability or other compilers are a concern, you'll have to look at something else, but these functions compile to maybe two instructions taking <5 clocks, as opposed to a helper function that takes 100+ clocks.
