Why does my code run slower with multiple threads than with a single thread when it is compiled for profiling (-pg)?


Question



I'm writing a ray tracer.

Recently, I added threading to the program to exploit the additional cores on my i5 Quad Core.

In a weird turn of events the debug version of the application is now running slower, but the optimized build is running faster than before I added threading.

I'm passing the "-g -pg" flags to gcc for the debug build and the "-O3" flag for the optimized build.

Host system: Ubuntu Linux 10.04 AMD64.

I know that debug symbols add significant overhead to the program, but the relative performance has always been maintained, i.e. a faster algorithm will always run faster in both debug and optimized builds.

Any idea why I'm seeing this behavior?

Debug version is compiled with "-g3 -pg". Optimized version with "-O3".

Optimized no threading:        0m4.864s
Optimized threading:           0m2.075s

Debug no threading:            0m30.351s
Debug threading:               0m39.860s
Debug threading after "strip": 0m39.767s

Debug no threading (no-pg):    0m10.428s
Debug threading (no-pg):       0m4.045s

This convinces me that "-g3" is not to blame for the odd performance delta, but that it's rather the "-pg" switch. It's likely that the "-pg" option adds some sort of locking mechanism to measure thread performance.

Since "-pg" is broken on threaded applications anyway, I'll just remove it.

Answer

What do you get without the -pg flag? That's not debugging symbols (which don't affect the code generation), that's for profiling (which does).

It's quite plausible that profiling in a multithreaded process requires additional locking which slows the multithreaded version down, even to the point of making it slower than the non-multithreaded version.
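To make the idea concrete: with -pg, gcc inserts a call to a profiling routine (mcount) at the entry of each compiled function, so every call in every thread ends up touching the profiler's shared bookkeeping. The sketch below does not reproduce gprof's internals. It is only an illustration under that assumption, using hypothetical names (profiling_hook, work, run_instrumented) and a single shared atomic counter as a stand-in for that bookkeeping. Even without an explicit lock, all threads write to the same memory location on every call, which is the kind of contention that could plausibly cancel the benefit of extra cores.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for profiler bookkeeping: one counter shared by
   all threads and updated on every instrumented call. */
static atomic_long shared_calls;

static void profiling_hook(void) {
    atomic_fetch_add_explicit(&shared_calls, 1, memory_order_relaxed);
}

/* A tiny "instrumented" function: hook on entry, then trivial work. */
static long work(long x) {
    profiling_hook();
    return x + 1;
}

struct job {
    long iterations; /* number of calls this thread makes */
    long result;     /* keeps the loop's result observable */
};

static void *worker(void *arg) {
    assert(arg != NULL);
    struct job *j = arg;
    long acc = 0;
    for (long i = 0; i < j->iterations; i++)
        acc = work(acc);
    j->result = acc;
    return NULL;
}

/* Runs nthreads workers (1..MAX_THREADS), each making `iterations` calls.
   Returns the number of hook calls recorded, or -1 on error. */
long run_instrumented(int nthreads, long iterations) {
    enum { MAX_THREADS = 64 };
    pthread_t tid[MAX_THREADS];
    struct job jobs[MAX_THREADS];

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    atomic_store(&shared_calls, 0);

    int started = 0;
    for (; started < nthreads; started++) {
        jobs[started].iterations = iterations;
        jobs[started].result = 0;
        if (pthread_create(&tid[started], NULL, worker, &jobs[started]) != 0)
            break;
    }
    /* Join whatever was started, even if a later create failed. */
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    if (started != nthreads)
        return -1;
    return atomic_load(&shared_calls);
}
```

Calling run_instrumented(1, n) and run_instrumented(4, n) from a small main and timing each run is one way to see how per-call shared state scales on a given machine. The result will depend on the hardware, and it says nothing definitive about what gprof's runtime does internally.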
