Fast float to int conversion and floating point precision on ARM (iPhone 3GS/4)


Question

I read http://www.stereopsis.com/FPU.html, which was mentioned in "What is the fastest way to convert float to int on x86". Does anyone know whether the claim that the simple cast is slow (see the snippet below) also applies to the ARM architecture?

inline int Convert(float x)
{
  int i = (int) x;
  return i;
}

To apply some of the tricks mentioned in the FPU article, you have to set the precision for floating point operations. How do I do that on ARM?

What is the fastest float to int conversion on the ARM architecture?

Thanks!

Recommended answer

Short version: "no".

That article is ancient and doesn't even apply to modern x86 systems, let alone ARM. A simple cast to integer is reasonably fast on ARMv7 (iPhone 3GS/4), though there is a modest stall when moving data from the VFP/NEON registers to the general purpose registers. However, given that your float data is probably coming from a computation done in VFP/NEON registers, you will have to pay for that move no matter how you do the conversion.

I don't think this is a profitable path for optimization unless you have profiler traces showing that it is a major bottleneck in your program. Even then, the fastest conversion is the one you don't do; you will almost always be better off finding algorithmic ways to eliminate conversions from your program.
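As one illustration of eliminating conversions algorithmically, here is a minimal sketch that assumes the conversions come from stepping through a table at a fractional stride. Both function names and the 16.16 fixed-point format are hypothetical choices made for this example, not something from the question. The second function converts the stride once and then stays in integer registers. Its results match the float version exactly only when the stride is representable in 16.16; otherwise the two can drift apart by one index.

```c
#include <stddef.h>
#include <stdint.h>

/* Baseline: one float multiply and one float->int conversion per sample. */
void sample_indices_float(float step, int32_t *idx, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        idx[i] = (int32_t)((float)i * step);
}

/* Sketch: convert the step to 16.16 fixed point once, then use integers only.
   Assumes step >= 0 and n * step < 65536 so the accumulator cannot overflow. */
void sample_indices_fixed(float step, int32_t *idx, size_t n)
{
    uint32_t step_fx = (uint32_t)(step * 65536.0f); /* the only conversion */
    uint32_t pos = 0;                               /* position in 16.16 */

    for (size_t i = 0; i < n; ++i) {
        idx[i] = (int32_t)(pos >> 16);              /* integer part */
        pos += step_fx;
    }
}
```

Whether this is a win depends on the surrounding code; it only helps if the float values are not needed for anything else inside the loop.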

If you genuinely need to optimize conversions, look into the vcvt.s32.f32 instruction, which converts a vector of two or four floating point numbers to a vector of two or four integers without moving the data out of the NEON registers (and therefore without incurring the stall that I mentioned). Of course, you will need to do your subsequent integer computations on the NEON unit for this to be a profitable optimization.
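A minimal sketch of the vector conversion using the NEON intrinsics from `arm_neon.h`, where `vcvtq_s32_f32` is the four-lane signed form of the conversion. The helper name `convert_f32_to_i32` is hypothetical, and the scalar fallback is included only so that the code also builds on targets without NEON. Note that this sketch stores the results to memory; to keep the benefit described above, the following integer work would have to stay in NEON registers as well.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: convert n floats to int32, truncating toward zero. */
void convert_f32_to_i32(const float *src, int32_t *dst, size_t n)
{
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Four lanes at a time; the values never leave the NEON registers
       between the load and the store. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t f = vld1q_f32(src + i);
        int32x4_t v = vcvtq_s32_f32(f); /* rounds toward zero, like a C cast */
        vst1q_s32(dst + i, v);
    }
#endif

    /* Scalar tail, and the whole loop on targets without NEON. */
    for (; i < n; ++i)
        dst[i] = (int32_t)src[i];
}
```

For in-range inputs the vector path gives the same results as the plain cast in `Convert`, so it can be swapped in only where profiling shows the conversion matters.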

Question: What are you really trying to do? Why do you think you need a faster float->int conversion?
