GCC ARM performance drop


Question

I stumbled upon a very strange issue with GCC: a 25% drop in performance. Here is the story.

I have a piece of software that is fp32 compute-intensive (neural networks compiled with TVM). I compile it for ARM (an rk3399 device). Here is the relevant information:

gcc -v

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/5/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.12' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-armhf/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-armhf --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-armhf --with-arch-directory=arm --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --enable-multilib --disable-sjlj-exceptions --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb --disable-werror --enable-multilib --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.12)

uname -a

Linux FriendlyELEC 4.4.143 #1 SMP Tue Nov 20 11:10:11 CST 2018 aarch64 aarch64 aarch64 GNU/Linux

lscpu

Architecture:          aarch64 
Byte Order:            Little Endian
CPU(s):                6
On-line CPU(s) list:   0-5
Thread(s) per core:    1
Core(s) per socket:    3
Socket(s):             2
Model name:            ARMv8 Processor rev 2 (v8l)
CPU max MHz:           1800.0000
CPU min MHz:           408.0000
Hypervisor vendor:     horizontal
Virtualization type:   full

The code was initially "slow" and compiled as C++11, so I decided to try C++17 and C++14. C++17 was not supported, but C++14 was. I switched to C++14 and, voila, performance improved by around 25%. I tested this carefully to make sure the improvement was real and not a measurement mistake. The improvement lasted for a week; then my device rebooted and it was gone!

It may sound crazy, but I am very sure of my code and of the measurements I took. I did not set any explicit compile flags before this experiment. Now I am trying to work out which GCC compile flags would reclaim what was lost, but I do not have much experience with GCC. What could be the issue here? Which flags can affect performance that much?

The code uses .so files compiled with LLVM and GCC:

llvm -device=arm_cpu -target=armv8l-linux-gnueabihf -mattr=+neon,fp-armv8

Recommended answer

This is not GCC's fault; it is a CPU frequency scaling problem. My device is an ARM board running Linux (Ubuntu), and the strange behavior and inconsistent benchmark results were caused by the way the OS governs the CPU frequency.
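One way to check for this before trusting a benchmark is to look at the cpufreq governor and the current clock of each core. The following is only a minimal sketch, assuming the kernel exposes the standard Linux cpufreq sysfs files (`scaling_governor`, `scaling_cur_freq`, `scaling_max_freq`) under `/sys/devices/system/cpu`. The function names `read_cpufreq` and `cpus_below_max` and the 0.95 tolerance are hypothetical choices made for illustration, not part of any tool.

```python
from pathlib import Path


def read_cpufreq(root="/sys/devices/system/cpu"):
    """Return {cpu_name: {"governor": str, "cur_khz": int, "max_khz": int}}.

    Assumes the standard Linux cpufreq sysfs layout; entries that cannot
    be read or parsed are simply left out of the per-CPU dictionary.
    """
    fields = (
        ("governor", "scaling_governor"),
        ("cur_khz", "scaling_cur_freq"),
        ("max_khz", "scaling_max_freq"),
    )
    result = {}
    for cpufreq_dir in sorted(Path(root).glob("cpu[0-9]*/cpufreq")):
        entry = {}
        for key, filename in fields:
            try:
                text = (cpufreq_dir / filename).read_text().strip()
                entry[key] = text if key == "governor" else int(text)
            except (OSError, ValueError):
                continue
        result[cpufreq_dir.parent.name] = entry
    return result


def cpus_below_max(info, tolerance=0.95):
    """List CPUs whose current frequency is below tolerance * max frequency."""
    slow = []
    for name, entry in info.items():
        cur, top = entry.get("cur_khz"), entry.get("max_khz")
        if cur is not None and top and cur < tolerance * top:
            slow.append(name)
    return slow


if __name__ == "__main__":
    info = read_cpufreq()
    for name, entry in info.items():
        print(name, entry)
    print("running below max frequency:", cpus_below_max(info))
```

If the governor is something other than `performance` and cores sit well below their maximum clock while the benchmark runs, timing differences of the size described in the question can come from frequency scaling alone. Sampling these values during the benchmark, not only while the system is idle, gives a more meaningful picture.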
