How to profile OpenMP bottlenecks
Question
I have a loop that has been parallelized with OpenMP, but due to the nature of the task it contains 4 critical sections.
What would be the best way to profile the speedup and find out which of the critical sections (or maybe the non-critical code!) takes up the most time inside the loop?
I use Ubuntu 10.04 with g++ 4.4.3.
Recommended answer
OpenMP includes the functions omp_get_wtime() and omp_get_wtick() for measuring timing performance (docs here); I would recommend using these.
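As a rough illustration of the omp_get_wtime() approach, here is a minimal sketch that times one named critical section and the non-critical work around it. The loop body, the Timings struct, and the profile_loop function are hypothetical stand-ins for the real code, not anything from the question. Each thread accumulates its own totals and merges them once at the end, so the bookkeeping adds no extra contention inside the loop. The time between leaving the non-critical work and the first timestamp inside the critical section is treated as lock wait time.

```cpp
#include <cassert>
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>
#else
#include <chrono>
// Serial fallback so the sketch also builds without -fopenmp.
static double omp_get_wtime() {
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
}
#endif

// Time totals in seconds, summed over all threads (hypothetical names).
struct Timings {
    double work = 0.0;    // non-critical part of the loop body
    double wait_a = 0.0;  // waiting to enter critical section "a"
    double hold_a = 0.0;  // executing inside critical section "a"
};

// Hypothetical stand-in for the real loop: private work followed by a
// shared update guarded by a named critical section.
Timings profile_loop(int n, long long* sum_out) {
    assert(n >= 0 && sum_out != nullptr);
    Timings total;
    long long sum = 0;

    #pragma omp parallel
    {
        Timings mine;  // private to each thread

        #pragma omp for
        for (int i = 0; i < n; ++i) {
            double t0 = omp_get_wtime();
            long long local = 0;
            for (int k = 0; k < 1000; ++k) local += (i + k) % 7;
            double t1 = omp_get_wtime();

            double t2 = 0.0, t3 = 0.0;
            #pragma omp critical(a)
            {
                t2 = omp_get_wtime();  // lock has just been acquired
                sum += local;
                t3 = omp_get_wtime();  // about to release the lock
            }

            mine.work += t1 - t0;
            mine.wait_a += t2 - t1;
            mine.hold_a += t3 - t2;
        }

        // Merge the per-thread totals once, outside the timed loop.
        #pragma omp critical(merge_timings)
        {
            total.work += mine.work;
            total.wait_a += mine.wait_a;
            total.hold_a += mine.hold_a;
        }
    }

    *sum_out = sum;
    return total;
}

// Print the totals so the sections can be compared side by side.
void print_report(const Timings& t) {
    std::printf("work:   %.6f s\n", t.work);
    std::printf("wait a: %.6f s\n", t.wait_a);
    std::printf("hold a: %.6f s\n", t.hold_a);
}
```

Build with g++ -fopenmp and call profile_loop and print_report from main. With 4 critical sections, the same pattern can be repeated with one wait/hold pair per section. A large wait total relative to hold suggests contention on that lock; a large work total suggests the bottleneck is in the non-critical code. The timed regions should be much longer than the timer resolution reported by omp_get_wtick(), otherwise the numbers are dominated by measurement noise.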
Otherwise, try a profiler. I prefer the Google CPU profiler, which can be found here.
See also this answer.