Maximum speed from iOS/iPad/iPhone

Problem description

I built a compute-intensive app using OpenCV for iOS. Of course it was slow, but it was something like 200 times slower than my PC prototype, so I set about optimizing it. Starting from 15 seconds, I got it down to 0.4 seconds. I wonder whether I have found everything, and what others may want to share. What I did:


  1. Replaced the "double" data type inside OpenCV with "float". A double is 64 bits wide and a 32-bit CPU cannot handle it easily, so float gave me some speed. OpenCV uses double very often.

  2. Added "-mfpu=neon" to the compiler options. A side effect is that builds for the simulator no longer work, so everything can be tested on native hardware only.

  3. Replaced the sin() and cos() implementations with 90-value lookup tables. The speedup was huge! This is somewhat the opposite of the PC, where such optimizations do not give any speedup. The code worked in degrees and converted each value to radians for sin() and cos(). That conversion was removed too, and the lookup tables did the job.

  4. Enabled "thumb optimizations". Some blog posts recommend exactly the opposite, but that is because Thumb usually makes things slower on armv6. armv7 is free of these problems, and Thumb just makes things faster and smaller there.

  5. To make sure the thumb optimizations and -mfpu=neon work at their best and do not introduce crashes, I removed the armv6 target completely. All my code is compiled for armv7, and this is also listed as a requirement in the App Store. This means the minimum iPhone is the 3GS. I think it is OK to drop older ones. Older devices have slower CPUs anyway, and a CPU-intensive app gives a bad user experience when installed on an old device.

  6. Of course I use the -O3 flag.

  7. Deleted "dead code" from OpenCV. When optimizing OpenCV I often see code that is clearly not needed for my project. For example, there is often an extra "if()" that checks whether the pixel size is 8-bit or 32-bit, and I know that I need 8-bit only. Removing it shrinks the code and gives the optimizer a better chance to remove more or to substitute constants. The code also fits better into the cache.
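The dead-code removal in item 7 can be illustrated with a minimal sketch. This is not OpenCV code: `invert_generic` and `invert_u8` are hypothetical names, and the inversion operation is only a stand-in for whatever per-pixel work the real function does. The point is that the specialized version has no depth check left for the compiler to reason about.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generic routine: it checks the pixel depth on every call,
   similar in spirit to the depth checks described above. */
static void invert_generic(void *pixels, size_t count, int depth_bits) {
    if (depth_bits == 8) {
        uint8_t *p = (uint8_t *)pixels;
        for (size_t i = 0; i < count; ++i)
            p[i] = (uint8_t)(255 - p[i]);
    } else if (depth_bits == 32) {
        float *p = (float *)pixels;
        for (size_t i = 0; i < count; ++i)
            p[i] = 1.0f - p[i];
    }
}

/* Hypothetical specialized routine: the app only ever handles 8-bit pixels,
   so the depth check and the 32-bit path are deleted entirely. */
static void invert_u8(uint8_t *pixels, size_t count) {
    for (size_t i = 0; i < count; ++i)
        pixels[i] = (uint8_t)(255 - pixels[i]);
}
```

Both functions give the same result for 8-bit data. The specialized one is simply smaller and has a single loop body to optimize.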

Any other tricks and ideas? For me, enabling Thumb and replacing the trigonometry with lookups gave the biggest boosts, which surprised me. Maybe you know something more that makes apps fly?

Recommended answer

If you are doing a lot of floating-point calculations, it would benefit you greatly to use Apple's Accelerate framework. It is designed to use the floating-point hardware to do calculations on vectors in parallel.
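As a rough sketch of what using Accelerate can look like, the helper below scales a float buffer with `vDSP_vsmul`, one of the vector routines in the framework's vDSP API. The helper name `scale_buffer` is hypothetical, and the plain loop in the `#else` branch is only there so the sketch also compiles on platforms without Accelerate. On Apple platforms the program has to be linked against the Accelerate framework.

```c
#include <stddef.h>
#ifdef __APPLE__
#include <Accelerate/Accelerate.h>
#endif

/* Hypothetical helper: dst[i] = src[i] * gain for i in [0, count). */
static void scale_buffer(const float *src, float gain, float *dst, size_t count) {
#ifdef __APPLE__
    /* vDSP_vsmul multiplies a vector by a scalar; the strides of 1 mean
       that consecutive elements are read and written. */
    vDSP_vsmul(src, 1, &gain, dst, 1, (vDSP_Length)count);
#else
    /* Portable fallback so the sketch builds without Accelerate. */
    for (size_t i = 0; i < count; ++i)
        dst[i] = src[i] * gain;
#endif
}
```

For trigonometry on whole arrays, the vForce part of Accelerate offers `vvsinf` and `vvcosf`, which follow the same idea of processing a buffer per call rather than one value at a time.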

I will also address your points one by one:

1) This is not because of the CPU word size. It is because, as of the armv7 era, the fast floating-point path on these devices (NEON) handles only 32-bit floats (Apple changed the floating-point hardware). 64-bit doubles go through a much slower path instead. In exchange, 32-bit operations got much faster.

2) NEON is the name of the SIMD instruction set provided by the new floating-point hardware.

3) Yes, this is a well-known method. An alternative is to use the Apple framework I mentioned above. It provides sin and cos functions that calculate 4 values in parallel. The algorithms are fine-tuned in assembly and NEON, so they give maximum performance while using minimal battery.

4) The armv7 implementation of Thumb doesn't have the drawbacks of the armv6 one. The recommendation to disable it applies only to armv6.

5) Yes. Considering that 80% of users are on iOS 5.0 or above now (support for armv6 devices ended at 4.2.1), that is perfectly acceptable for most situations.

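6) Optimization is enabled automatically when you build in release mode, although Xcode's default release level is -Os rather than -O3, so check the setting if you specifically want -O3.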

7) Yes, though this won't have as large an effect as the methods above.
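The lookup-table method from point 3 can be sketched as follows, assuming angles arrive as whole degrees as in the question. The names `sin_lut`, `sin_deg`, and `cos_deg` are hypothetical. The table covers 0 to 90 degrees and the other quadrants are derived by symmetry. To keep the sketch free of any trig library call, the table is filled once with the angle-addition recurrence; a real project might just as well generate the values offline and paste them in as constants.

```c
#define LUT_SIZE 91  /* one entry per whole degree, 0..90 inclusive */

static float sin_lut[LUT_SIZE];
static int lut_ready = 0;

/* Fill the table once: sin((n+1)a) = sin(na)cos(a) + cos(na)sin(a), a = 1 degree. */
static void init_sin_lut(void) {
    const double s1 = 0.017452406437283512; /* sin(1 degree) */
    const double c1 = 0.99984769515639124;  /* cos(1 degree) */
    double s = 0.0, c = 1.0;                /* sin(0), cos(0) */
    for (int i = 0; i < LUT_SIZE; ++i) {
        sin_lut[i] = (float)s;
        double ns = s * c1 + c * s1;
        double nc = c * c1 - s * s1;
        s = ns;
        c = nc;
    }
    sin_lut[90] = 1.0f; /* pin the endpoint to its exact value */
    lut_ready = 1;
}

/* Sine of a whole-degree angle, using quadrant symmetry around the table. */
static float sin_deg(int deg) {
    if (!lut_ready) init_sin_lut();
    deg %= 360;
    if (deg < 0) deg += 360;                 /* normalize to 0..359 */
    if (deg <= 90)  return sin_lut[deg];
    if (deg <= 180) return sin_lut[180 - deg];
    if (deg <= 270) return -sin_lut[deg - 180];
    return -sin_lut[360 - deg];
}

/* Cosine via the identity cos(x) = sin(x + 90 degrees). */
static float cos_deg(int deg) {
    return sin_deg(deg % 360 + 90);
}
```

Because the table is indexed by degrees, no degree-to-radian conversion is needed at the call site, which matches the conversion code the question says was removed.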

My recommendation is to check out Accelerate. That way you can make sure you are leveraging the full power of the floating-point processor.
