Compiling Eigen library for iPhone with vectorisation


Problem description

I am struggling to compile the Eigen library for the iPhone 4, which has an ARM processor with the armv7 instruction set. Everything works fine as long as I specify the preprocessor define EIGEN_DONT_VECTORIZE. But because of some performance issues I would like to use armv7-optimised code.

Regardless of which compiler I use, LLVM-GCC 4.2 or LLVM Clang 2.0, I always run into compilation errors. I figured out (or at least I think so) that LLVM-GCC 4.2 is the only way to get access to these ARM NEON specific instructions.

When I do not set EIGEN_DONT_VECTORIZE (and pass -mfloat-abi=softfp -mfpu=neon to gcc), I get the following gcc compiler error:

src/m3CoreLib/Eigen/src/Core/arch/NEON/PacketMath.h:89: error: expected unqualified-id before '__extension__'

I have read about issues with the "old" gcc 4.2 and the recommendation to use a newer version of gcc. I am not sure, but I believe this is not an option because of App Store approval. Is there anything else I can do to get it compiled for iPhone? Has anybody out there solved this?

Thanks, Kay

Solution

After fiddling around with different compiler settings for hours and hours, I found a solution that satisfies me and came to the following conclusion.

There is a surprisingly huge difference between debug and release settings with Eigen's template library approach: release settings with the usual optimisation flags enabled let the application run 20 to 40 times faster than debug. I have never seen such a difference before in any language; in my experience the factor is usually 1.5 to 3.
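To check the debug-versus-release gap on your own build, a small timing harness along these lines may help. This is only a sketch: `multiply_checksum`, `asserts_disabled` and `time_ms` are hypothetical helper names, and the naive matrix product is a stand-in for whatever Eigen expression you actually care about. It relies only on the standard library.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Eigen work: naive n x n product of two constant
// matrices (all 1.0 and all 2.0), returning the sum of all result entries.
inline double multiply_checksum(std::size_t n) {
    assert(n > 0 && "matrix size must be positive");  // compiled out under NDEBUG
    std::vector<double> a(n * n, 1.0), b(n * n, 2.0), c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
    double sum = 0.0;
    for (double v : c) sum += v;
    return sum;
}

// Reports whether this translation unit was built with NDEBUG, which is what
// typical release configurations define.
inline bool asserts_disabled() {
#ifdef NDEBUG
    return true;
#else
    return false;
#endif
}

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_ms([] { multiply_checksum(64); })` once in a debug build and once in a release build, and printing `asserts_disabled()` alongside the result, shows which configuration produced which number. The ratio you see depends on the compiler and flags and will not necessarily match the factor reported above.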

Although I still cannot force vectorisation, i.e. the code compiles only with EIGEN_DONT_VECTORIZE defined, the resulting performance fits my needs now.
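When experimenting with the flags, it can be useful to confirm what the preprocessor actually sees. The sketch below uses a hypothetical helper, `neon_status`, that inspects EIGEN_DONT_VECTORIZE and the `__ARM_NEON__` / `__ARM_NEON` macros that ARM compilers predefine when NEON is enabled (for example via -mfpu=neon). It assumes those macro names apply to your toolchain and does not include any Eigen header, so it only reports compiler state, not what Eigen finally selects.

```cpp
#include <cstring>

// Hypothetical diagnostic: describes the SIMD-related macros visible in this
// translation unit. Compile it with the same flags as the Eigen-using code.
inline const char* neon_status() {
#if defined(EIGEN_DONT_VECTORIZE)
    return "EIGEN_DONT_VECTORIZE is defined: Eigen vectorisation is switched off";
#elif defined(__ARM_NEON__) || defined(__ARM_NEON)
    return "NEON is enabled by the compiler";
#else
    return "NEON is not enabled by the compiler";
#endif
}

// Convenience wrapper: true only when NEON is enabled and vectorisation has
// not been disabled through EIGEN_DONT_VECTORIZE.
inline bool neon_path_possible() {
    return std::strcmp(neon_status(), "NEON is enabled by the compiler") == 0;
}
```

Printing `neon_status()` from the app at start-up shows whether the NEON flags reached the compiler for that target at all, which helps separate a build-settings problem from a compiler incompatibility in PacketMath.h.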
