Fast Interleaving of Data
Problem description
I'm working with a piece of hardware (the hardware itself is not important), and I need to split some blocks of data into separate pieces to make things run faster.
So I have, for example, a contiguous block of memory X words long. For visualization, I'm arranging it into 50-word lines below:
001 002 003 004 005 006 007 ...
051 052 053 054 055 056 057 ...
101 102 103 104 105 106 107 ...
151 152 153 154 155 156 157 ...
I need a fast way of splitting these into four separate blocks:
Block1
001 003 005 007 ...
101 103 105 107 ...
Block2
002 004 006 ...
102 104 106 ...
Block3
051 053 055 057 ...
151 153 155 157 ...
Block4
052 054 056 ...
152 154 156 ...
Or, basically:
Block1 Block2 Block1 Block2 ...
Block3 Block4 Block3 Block4 ...
Block1 Block2 Block1 Block2 ...
Block3 Block4 Block3 Block4 ...
Doing this is as simple as using for-loops, but what is a more optimized/parallel way of doing it? (No MPI stuff; this happens in an app running on the desktop.)
So, summing it up, just to be clear:
- I have data as shown above.
- I'm sending this data to several devices (outside the PC). It needs to be sent down the wire as 4 separate blocks (to the separate devices).
Recommended answer
This is a prime example of where SSE can help you. It's very good at data shuffling, as well as at streaming data from memory and back. On some non-x86 architectures, similar ISA extensions are available (e.g. AltiVec).
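As a rough illustration of the SSE approach, here is a minimal sketch in C. It rests on several assumptions that are not in the question: a "word" is taken to be 32 bits, each line has an even number of words, and the function names `split_line` and `split_blocks` are made up for this example. Two shuffles pull the even-position and odd-position words out of each group of eight, and a plain loop handles the leftover words (and the whole line when SSE2 is not available).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE header */
#define HAVE_SSE2 1
#endif

/* Hypothetical helper: split one line of n words (n must be even) into
   its even-position words and its odd-position words.
   Assumption: a "word" is 32 bits wide. */
void split_line(const uint32_t *src, size_t n,
                uint32_t *even, uint32_t *odd)
{
    size_t i = 0;
    assert(n % 2 == 0);
#ifdef HAVE_SSE2
    /* Process 8 input words per iteration: two unaligned 128-bit loads. */
    for (; i + 8 <= n; i += 8) {
        __m128 a = _mm_castsi128_ps(
            _mm_loadu_si128((const __m128i *)(src + i)));
        __m128 b = _mm_castsi128_ps(
            _mm_loadu_si128((const __m128i *)(src + i + 4)));
        /* e = a0 a2 b0 b2, o = a1 a3 b1 b3 */
        __m128 e = _mm_shuffle_ps(a, b, _MM_SHUFFLE(2, 0, 2, 0));
        __m128 o = _mm_shuffle_ps(a, b, _MM_SHUFFLE(3, 1, 3, 1));
        _mm_storeu_si128((__m128i *)(even + i / 2), _mm_castps_si128(e));
        _mm_storeu_si128((__m128i *)(odd + i / 2), _mm_castps_si128(o));
    }
#endif
    /* Scalar tail; also the complete fallback without SSE2. */
    for (; i < n; i += 2) {
        even[i / 2] = src[i];
        odd[i / 2] = src[i + 1];
    }
}

/* Hypothetical driver: lines 0, 2, 4, ... feed blk[0] and blk[1]
   (Block1/Block2); lines 1, 3, 5, ... feed blk[2] and blk[3]
   (Block3/Block4). Each output buffer must hold at least
   ((lines + 1) / 2) * (n / 2) words. */
void split_blocks(const uint32_t *src, size_t lines, size_t n,
                  uint32_t *blk[4])
{
    size_t half = n / 2;
    for (size_t r = 0; r < lines; ++r) {
        size_t pair = (r & 1) * 2;      /* 0 for even lines, 2 for odd lines */
        size_t off = (r / 2) * half;    /* row offset inside the output block */
        split_line(src + r * n, n, blk[pair] + off, blk[pair + 1] + off);
    }
}
```

With the 4 x 50 layout from the question, `split_blocks(src, 4, 50, blk)` would fill four 50-word buffers. Whether this beats a simple loop depends on the compiler and on memory bandwidth, so it is worth measuring before committing to it.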