Why does my code run slower with multiple threads than with a single thread when it is compiled for profiling (-pg)?

Views: 200  Published: 2018/4/20 17:23:56  Tags: linux, performance, multithreading, gcc, gprof

Problem description

I'm writing a ray tracer.

Recently, I added threading to the program to exploit the additional cores on my i5 Quad Core.

In a weird turn of events the debug version of the application is now running slower, but the optimized build is running faster than before I added threading.

I'm passing the "-g -pg" flags to gcc for the debug build and the "-O3" flag for the optimized build.

Host system: Ubuntu Linux 10.4 AMD64.

I know that debug symbols add significant overhead to the program, but the relative performance has always been maintained. I.e. a faster algorithm will always run faster in both debug and optimization builds.

Any idea why I'm seeing this behavior?

Debug version is compiled with "-g3 -pg". Optimized version with "-O3".

Optimized no threading:          0m4.864s
Optimized threading:             0m2.075s
Debug no threading:              0m30.351s
Debug threading:                 0m39.860s
Debug threading after "strip":   0m39.767s
Debug no threading (no-pg):      0m10.428s
Debug threading (no-pg):         0m4.045s

This convinces me that "-g3" is not to blame for the odd performance delta, but that it's rather the "-pg" switch. It's likely that the "-pg" option adds some sort of locking mechanism to measure thread performance.

Since "-pg" is broken on threaded applications anyway, I'll just remove it.
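
The timings in the question can be reduced to speedup ratios (unthreaded time divided by threaded time), which makes the pattern easier to see: threading helps in the "-O3" build and in the debug build without "-pg", and only hurts when "-pg" is present. Below is a small C sketch of that arithmetic. The function names speedup and print_speedups are made up for illustration; the numbers are the ones reported in the question.

```c
#include <stdio.h>

/* Speedup of the threaded run over the unthreaded run.
 * A value above 1 means threading helped; below 1 means it hurt. */
double speedup(double unthreaded_s, double threaded_s)
{
    return unthreaded_s / threaded_s;
}

/* Print the three ratios for the timings reported in the question. */
void print_speedups(void)
{
    printf("optimized (-O3):      %.2fx\n", speedup(4.864, 2.075));
    printf("debug (-g3 -pg):      %.2fx\n", speedup(30.351, 39.860));
    printf("debug (-g3, no -pg):  %.2fx\n", speedup(10.428, 4.045));
}
```

Calling print_speedups() from main shows a ratio below 1 only for the "-g3 -pg" build, which supports the conclusion that "-pg", not "-g3", is responsible for the slowdown.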

Solution

What do you get without the -pg flag? That flag does not add debugging symbols (which don't affect code generation); it enables profiling (which does).

It's quite plausible that profiling in a multithreaded process requires additional locking which slows the multithreaded version down, even to the point of making it slower than the non-multithreaded version.
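
To get a feel for the kind of effect described above, here is a minimal C sketch in which every "function call" first runs a hook that takes one process-wide lock. This is an assumption-laden stand-in, not gprof's actual implementation: profiling_hook, traced_work and run_hooked_workers are hypothetical names, and the mutex is only one possible form of shared bookkeeping. The point is that when the real work per call is tiny, threads spend most of their time contending for the shared state instead of running in parallel.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-call profiling hook: each call takes
 * one process-wide lock to bump a shared counter. Not gprof's real code. */
static pthread_mutex_t hook_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long shared_calls = 0;

static void profiling_hook(void)
{
    pthread_mutex_lock(&hook_lock);
    shared_calls++;
    pthread_mutex_unlock(&hook_lock);
}

/* A tiny unit of "real" work, preceded by the hook. */
static unsigned long traced_work(unsigned long x)
{
    profiling_hook();
    return x * 2654435761UL + 1UL;
}

struct job {
    unsigned long iterations; /* calls this thread should make */
    unsigned long result;     /* keeps the loop from being optimized away */
};

static void *worker(void *arg)
{
    struct job *j = arg;
    unsigned long acc = 0;
    for (unsigned long i = 0; i < j->iterations; i++)
        acc = traced_work(acc);
    j->result = acc;
    return NULL;
}

/* Split `total` hooked calls across `nthreads` threads.
 * Returns the number of hook calls recorded (equal to `total` if every
 * thread started), or 0 on invalid input or allocation failure. */
unsigned long run_hooked_workers(int nthreads, unsigned long total)
{
    if (nthreads < 1)
        return 0;

    pthread_t *tids = malloc(sizeof *tids * (size_t)nthreads);
    struct job *jobs = malloc(sizeof *jobs * (size_t)nthreads);
    if (!tids || !jobs) {
        free(tids);
        free(jobs);
        return 0;
    }

    shared_calls = 0;
    int started = 0;
    for (int i = 0; i < nthreads; i++) {
        /* Thread 0 also takes the remainder of the division. */
        jobs[i].iterations = total / (unsigned long)nthreads
                           + (i == 0 ? total % (unsigned long)nthreads : 0);
        jobs[i].result = 0;
        if (pthread_create(&tids[i], NULL, worker, &jobs[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    unsigned long calls = shared_calls; /* safe: all workers have joined */
    free(tids);
    free(jobs);
    return calls;
}
```

Built with gcc -pthread, calling run_hooked_workers(1, n) and run_hooked_workers(4, n) from main for a large n and timing each call gives a rough comparison. How large the gap is depends on the machine, so no figures are claimed here.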
