Post process `objdump --disassemble` with ARM cycle counts

Views: 224 Published: 2016/5/29 14:27:52 Tags: gcc open-source arm objdump


Question

Is there a script available for post-processing `objdump --disassemble` output to annotate it with cycle counts, especially for the ARM family? Most of the time this would just be a pattern match with a table lookup for the count. Annotations like `+5M` for five memory cycles might be needed. Perl, Python, bash, C, etc. are all fine. I think this can be done generically, but I am interested in ARM, which has an orthogonal instruction set. There is a thread about doing the same thing for the 68HC11. The script would need a CPU model option to select the appropriate cycle counts; I think these counts already exist in the GCC machine descriptions.
I don't think there is an `objdump` switch for this, but an RTFM pointer would be great if one exists.
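The table-lookup idea described in the question can be sketched roughly as follows. This is only a minimal illustration, not an existing tool: the `demo-core` table and every cycle value in it are hypothetical placeholders, the names `CYCLE_TABLES`, `lookup`, and `annotate` are invented for this example, and the regular expression assumes 32-bit ARM-mode lines in the usual `objdump` layout (address, 8-digit encoding, mnemonic).

```python
import re

# Hypothetical, illustrative costs only: (core cycles, memory cycles).
# Real numbers must come from the reference manual of the chosen core.
CYCLE_TABLES = {
    "demo-core": {
        "mov": (1, 0), "add": (1, 0), "sub": (1, 0), "cmp": (1, 0),
        "mul": (2, 0), "ldr": (1, 1), "str": (1, 1), "b": (3, 0), "bx": (3, 0),
    },
}

CONDS = ("eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
         "hi", "ls", "ge", "lt", "gt", "le", "al")
# Suffixes that may follow a base mnemonic: flag-setting "s" and/or a condition.
SUFFIXES = ({"", "s"} | set(CONDS)
            | {c + "s" for c in CONDS} | {"s" + c for c in CONDS})

# Matches e.g. "    8004:\te5912000 \tldr\tr2, [r1]" and captures the mnemonic.
INSN_RE = re.compile(r"^\s*[0-9a-f]+:\s+[0-9a-f]{8}\s+(\S+)")


def lookup(mnemonic, table):
    """Return the cost of the longest known base mnemonic, or None."""
    for cut in range(len(mnemonic), 0, -1):
        if mnemonic[:cut] in table and mnemonic[cut:] in SUFFIXES:
            return table[mnemonic[:cut]]
    return None


def annotate(lines, cpu):
    """Append '; +N' or '; +N +KM' to every instruction line."""
    table = CYCLE_TABLES[cpu]
    out = []
    for line in lines:
        line = line.rstrip("\n")
        match = INSN_RE.match(line)
        if not match:
            out.append(line)  # labels, blank lines, section headers
            continue
        cost = lookup(match.group(1), table)
        if cost is None:
            note = "+?"  # mnemonic missing from the table
        else:
            core, mem = cost
            note = f"+{core}" + (f" +{mem}M" if mem else "")
        out.append(f"{line}\t; {note}")
    return out


sample = [
    "00008000 <main>:",
    "    8000:\te3a00000 \tmov\tr0, #0",
    "    8004:\te5912000 \tldr\tr2, [r1]",
]
print("\n".join(annotate(sample, "demo-core")))
```

In practice the disassembly could be piped in from `arm-none-eabi-objdump --disassemble` (or whichever cross `objdump` is installed) and fed to `annotate` line by line, with the `cpu` argument playing the role of the CPU model option.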
Edit: To clarify, assumptions such as a best-case memory subsystem, as is the case when the code executes from cache, are fine. The goal is not a 100% accurate cycle count for some particular running machine. A reasonable estimate must be possible; otherwise compiler design would be impossible.
As DWelch points out, a simple running total is not possible on deeply pipelined architectures such as the more recent Cortex chips. The `objdump` post-processing would have to look at the surrounding opcodes. A GCC plug-in is more likely to be able to accomplish this, and as plug-in support is new (4.5+), I don't think such a thing exists. A script for the ARM926 is certainly possible and fairly simple.
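As a rough illustration of what "looking at the surrounding opcodes" could mean for a simple in-order core, the sketch below flags a load-use dependency: an instruction that reads a register written by the immediately preceding `ldr`. The one-cycle penalty is an assumption chosen for the example and should be checked against the documentation of the actual core. The function name `load_use_stalls` and the rule "first register operand is the destination" are simplifications invented here.

```python
import re

# Captures mnemonic and operand text from an ARM-mode objdump line.
INSN_RE = re.compile(r"^\s*[0-9a-f]+:\s+[0-9a-f]{8}\s+(\S+)\s*(.*)$")
REG_RE = re.compile(r"\b(r\d+|sl|fp|ip|sp|lr|pc)\b")

# Instructions whose register operands are all sources (no destination).
NO_DEST = ("str", "cmp", "cmn", "tst", "teq", "bx")


def load_use_stalls(lines, stall=1):
    """Return (line, extra_cycles) pairs; `stall` is an assumed penalty."""
    pending = None  # register written by the immediately preceding ldr
    result = []
    for line in lines:
        match = INSN_RE.match(line)
        if not match:
            result.append((line, 0))  # not an instruction line
            continue
        mnemonic, operands = match.groups()
        operands = operands.split(";")[0]  # drop objdump's trailing comment
        regs = REG_RE.findall(operands)
        # Simplification: the first register is the destination unless the
        # instruction is a store, compare, or bx.
        sources = regs if mnemonic.startswith(NO_DEST) else regs[1:]
        result.append((line, stall if pending in sources else 0))
        pending = regs[0] if mnemonic.startswith("ldr") and regs else None
    return result


sample = [
    "    8000:\te5912000 \tldr\tr2, [r1]",
    "    8004:\te0823003 \tadd\tr3, r2, r3",
    "    8008:\te0834004 \tadd\tr4, r3, r4",
]
for text, extra in load_use_stalls(sample):
    print(text, f"; stall +{extra}" if extra else "")
```

A real tool would need a proper operand model (load/store multiple, writeback, shifted operands) and a per-core latency table; this only shows why a per-instruction lookup alone is not enough.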
The memory latency doesn't matter. The memory controller is like another CPU: it goes about its business while the CPU is doing arithmetic and so on. A well-tuned algorithm will overlap the memory accesses with the computations. By counting loads/stores and cycles you can determine how much parallelism is achieved when you actively profile with a timer. The pipeline is significant because of interlocks between registers, but a cycle count for basic blocks can be reliably calculated and used even on modern ARM processors, although that is too complex for a simple script.
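Counting loads and stores per basic block, as described above, can be sketched as follows. This is a simplified illustration under stated assumptions: block boundaries are taken from direct branch targets printed by `objdump` and from the instruction following any branch (calls included), indirect jumps such as `mov pc, ...` are ignored, and the names `basic_block_stats`, `LOADS`, and `STORES` are invented for this example.

```python
import re

INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s+[0-9a-f]{8}\s+(\S+)\s*(.*)$")
# b, bl, bx, blx with an optional condition code (so "bic" is not a branch).
BRANCH_RE = re.compile(
    r"^bl?x?(eq|ne|cs|cc|mi|pl|vs|vc|hi|ls|ge|lt|gt|le|al)?$")
# objdump prints direct branch targets as "8014 <f+0x14>".
TARGET_RE = re.compile(r"^([0-9a-f]+) <")

LOADS = ("ldr", "ldm", "pop")
STORES = ("str", "stm", "push")


def basic_block_stats(lines):
    """Split a disassembly into basic blocks and count loads and stores."""
    insns = []
    for line in lines:
        match = INSN_RE.match(line)
        if match:
            addr, mnemonic, operands = match.groups()
            insns.append((int(addr, 16), mnemonic, operands))

    # A leader is the first instruction, any direct branch target,
    # and any instruction that follows a branch.
    leaders = {insns[0][0]} if insns else set()
    for i, (_, mnemonic, operands) in enumerate(insns):
        if BRANCH_RE.match(mnemonic):
            target = TARGET_RE.match(operands)
            if target:
                leaders.add(int(target.group(1), 16))
            if i + 1 < len(insns):
                leaders.add(insns[i + 1][0])

    blocks = []
    for addr, mnemonic, _ in insns:
        if addr in leaders:
            blocks.append({"start": addr, "insns": 0, "loads": 0, "stores": 0})
        block = blocks[-1]
        block["insns"] += 1
        if mnemonic.startswith(LOADS):
            block["loads"] += 1
        elif mnemonic.startswith(STORES):
            block["stores"] += 1
    return blocks


sample = [
    "00008000 <f>:",
    "    8000:\te5912000 \tldr\tr2, [r1]",
    "    8004:\te3520000 \tcmp\tr2, #0",
    "    8008:\t0a000001 \tbeq\t8014 <f+0x14>",
    "    800c:\te5802000 \tstr\tr2, [r0]",
    "    8010:\te2822001 \tadd\tr2, r2, #1",
    "    8014:\te12fff1e \tbx\tlr",
]
for block in basic_block_stats(sample):
    print(hex(block["start"]), block)
```

The per-block load/store counts could then be compared with timer-based measurements of the same blocks to judge how well memory traffic overlaps with computation.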
Recommended answer
There is an online tool that estimates cycle counts on the Cortex-A8. However, this CPU is quite old, and programs optimized for it might be suboptimal on newer CPUs.
AFAIK, ARM also provides cycle-accurate emulators for the Cortex-A9 and Cortex-A5 in its RVDS software, but it is quite expensive.
