Instruction-Level-Parallelism Exploration
Question
I am wondering whether there are any useful tools that would help me exploit instruction-level parallelism (ILP) in some algorithms. More specifically, I have a set of algorithms from the multimedia domain, and I wonder what the best way to exploit ILP in these algorithms is. All of them are implemented in C, so ideally I would give them as input to a tool that tells me which instructions could be executed in parallel.
Any pointers are much appreciated!
Robert
Recommended answer
The problem is that deciding whether an instruction will be executed in parallel is quite difficult, considering how many different processor types there are. A good understanding of the CPU architecture you are targeting will give you a good starting point for this sort of work. No software will beat a human mind with the right knowledge.
In general, though, so much of this work is done by the compiler and by hardware such as out-of-order execution engines that it is abstracted away from you as much as possible. Even if you understand all of this fully, it's unlikely you'll get more than a few percent speed improvement.
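To make concrete what the compiler and the out-of-order engine are working with, here is a minimal sketch of a dependency chain and a hand-unrolled variant. The function names `sum_serial` and `sum_ilp4` are made up for this illustration and are not from any library. Whether the second version is actually faster depends on your compiler, optimization flags and CPU; an optimizing compiler may already produce something similar from the first version.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: one accumulator.
 * Each addition needs the result of the previous one,
 * so the adds form a single serial dependency chain. */
uint32_t sum_serial(const uint8_t *p, size_t n)
{
    uint32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += p[i];
    return s;
}

/* Hypothetical example: four independent accumulators.
 * The four adds in the loop body do not depend on each other,
 * so a superscalar CPU is free to issue them in the same cycle. */
uint32_t sum_ilp4(const uint8_t *p, size_t n)
{
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += p[i];
        s1 += p[i + 1];
        s2 += p[i + 2];
        s3 += p[i + 3];
    }
    /* Handle the remaining 0..3 elements. */
    for (; i < n; i++)
        s0 += p[i];

    return s0 + s1 + s2 + s3;
}
```

Both functions return the same value for the same input; only the shape of the dependency graph differs. Measuring both on your target CPU is the only reliable way to tell whether the manual version is worth keeping.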
If you want to see serious speed improvements, you are far better off rewriting the algorithm to take advantage of multiple processors and the available SIMD operations. SIMD alone can give serious speed improvements, especially for the many "multimedia algorithms" that can process multiple elements of the data simultaneously.
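As an illustration of the SIMD point, here is a minimal sketch of a typical multimedia kernel: adding two 8-bit pixel buffers with saturation. It assumes an x86 target with SSE2 and uses the intrinsics from `<emmintrin.h>`; on other targets it falls back to the scalar loop. The function name `add_saturate_u8` is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics (assumes an x86 target) */
#endif

/* Hypothetical example: dst[i] = min(a[i] + b[i], 255) for n bytes. */
void add_saturate_u8(uint8_t *dst, const uint8_t *a,
                     const uint8_t *b, size_t n)
{
    size_t i = 0;

#if defined(__SSE2__)
    /* Process 16 pixels per iteration with one saturating add. */
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(va, vb));
    }
#endif

    /* Scalar tail, and the full loop when SSE2 is not available. */
    for (; i < n; i++) {
        unsigned s = (unsigned)a[i] + (unsigned)b[i];
        dst[i] = (uint8_t)(s > 255u ? 255u : s);
    }
}
```

The unaligned load and store intrinsics are used so the sketch works with any buffer alignment. Before writing intrinsics by hand, it is worth checking whether your compiler already auto-vectorizes the plain scalar loop.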