算法切片平面(代替)从RGB值的阵列的 [英] Algorithm for slicing planes (in place) out of an array of RGB values

查看:165
本文介绍了算法切片平面(代替)从RGB值的阵列的的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经有了一个字节的RGB值的平面数组,即进入 R1 G1 B1 R2 G2 B2 R3 G3 B3 ...... Rn的GN BN 。所以,我的数据是这样的:

I've got a flat array of byte RGB values that goes R1 G1 B1 R2 G2 B2 R3 G3 B3 ... Rn Gn Bn. So my data looks like:

char imageData[WIDTH * HEIGHT * 3];

但我想一个宽*高数组传递到现有的C库,预计这一数据的单一平面。这将是一个序列的只是的R值(或只是对G,或者只是在B)的

But I want to pass a WIDTH*HEIGHT array to an existing C library that expects a single plane of this data. That would be a sequence of just the R values (or just the G, or just the B).

这是很容易分配一个新的数组和复制数据(杜)。但图像非常大。如果不是一个C库,但采取了某种迭代接口巧妙的切片穿越的,那将是巨大的。但我不能编辑code我打电话......它想一个普通的老指向的连续内存块。

It's easy enough to allocate a new array and copy the data (duh). But the images are very large. If it weren't a C library but took some kind of iteration interface to finesse the "slicing" traversal, that would be great. But I can't edit the code I'm calling...it wants a plain old pointer to a block of sequential memory.

不过我有写权限此阵。这是可行的创建一个例行程序,将整理成彩色面。我还会需要一个逆变换,将放回去,但根据定义该排序成平面相同的方法可以适用于不排序它

HOWEVER I have write access to this array. It is viable to create a routine that would sort it into color planes. I'd also need a reverse transformation that would put it back, but by definition the same method that sorted it into planes could be applied to unsort it.

如何高效能我(的地方),把这个数组 R1 R2 R3 ... Rn的G1 G2 G3 ... GN B1,B2,B3,...... BN 和再回来吗?任何非幼稚的算法?

How efficiently can I (in place) turn this array into R1 R2 R3 ... Rn G1 G2 G3 ... Gn B1 B2 B3 ... Bn and then back again? Any non-naive algorithms?

推荐答案

本文一个简单的就地算法在洗牌介绍了如何转2 * N矩阵,并给出了一个提示如何做到这一点的其他情况,所以3 * N可能也是可能的。 这个回答另一个问题表明,它的确是可能的。

This paper "A Simple In-Place Algorithm for In-Shuffle" describes how to transpose matrix of 2*N and gives a hint how to do it for other cases, so 3*N may be also possible. This answer to other question shows that it is indeed possible.

或者,使用的算法将每个值到它的转置的地方,然后执行相同的由该地方的值,依此类推,直到周期相连接。标志位向量处理后的值。而继续下去,直到这个载体是全1。

Or use an algorithm which writes each value to its transposed place, then does the same for the value from that place, and so on until cycle is connected. Flag processed values in a bit vector. And continue until this vector is all 1s.

这两种算法都没有缓存友好。也许一些巧妙的运用preFETCH指令可以改善这一点。

Both algorithms are not cache-friendly. Probably some clever use of PREFETCH instruction can improve this.

编辑:

C ++,RGB单架飞机,而不是优化的:

C++, RGB to single planes, not optimized:

#include <iostream>
#include <bitset>
#include <vector>

enum {N = 8};

void transpose(std::vector<char>& a)
{
  std::bitset<3*N> b;

  for (int i = 1; i < 3*N; ++i)
  {
    if (b[i])
      continue;

    int ptr = i;
    int next;
    char nextVal = a[i];

    do {
      next = ptr/3 + N*(ptr%3);
      char prevVal = nextVal;
      nextVal = a[next];
      a[next] = prevVal;
      ptr = next;
      b[ptr] = true;
    }
    while (ptr != i);
  }
}

int main()
{
  std::vector<char> a(3*N);

  for (int i = 0; i != 3*N; ++i)
    a[i] = i;

  transpose(a);

  for (int i = 0; i != 3*N; ++i)
    std::cout << (int)a[i] << std::endl;

  return 0;
}

我的原意是使用尺寸宽*高,中位向量赋予宽*高/ 8的开销。但它始终是可能牺牲速度为空间。该位向量可以是大小宽度或高度,或任何需要的价值,甚至是0。关键是要保持一个指针到单元格中,所有值都转前已。位向量为细胞,从该指针开始。后它是全1,它被移动到下一个位置,则所有的算法步骤,除了实际数据移动来执行。而位向量是准备继续调换。这种变体是O(N ^ 2),而不是O(N)。

My original intent is to use a bit vector of size WIDTH*HEIGHT, which gives overhead of WIDTH*HEIGHT/8. But it is always possible to sacrifice speed for space. The bit vector may be of size WIDTH or HEIGHT or any desirable value, even 0. The trick is to maintain a pointer to the cell, before which all values are transposed. The bit vector is for cells, starting from this pointer. After it is all 1s, It is moved to next position, then all the algorithm steps are performed except actual data movement. And the bit vector is ready to continue transposing. This variant is O(N^2) instead of O(N).

EDIT2:

preFITCH优化并不难实现:刚才计算指标,调用preFETCH,并把索引到一个队列(ringbuffer),然后从队列中获取索引和移动数据

PREFITCH optimization is not difficult to implement: just calculate indexes, invoke PREFETCH, and put indexes to a queue (ringbuffer), then get indexes from the queue and move data.

EDIT3:

等算法的思想,这是O(1)的大小,O(N *日志(N))的时间,是高速缓存友好,可能比循环算法(图像尺寸和LT; 1GB)的速度更快:

The idea of other algorithm, which is O(1) size, O(N*log(N)) time, is cache-friendly and may be faster than "cycle" algorithm (for image sizes < 1Gb):

  • 在拆分N * 3矩阵字符数3 * 3的矩阵转置到
  • 拆分结果的char [3]的3 * 3的矩阵转置到
  • 在继续,而矩阵大小小于数组大小
  • 现在,我们有多达3 * 2 * log3中(N)订购件。加入他们的行列。
  • 在第一次加入相同大小的块。长度为4的非常简单的循环可以被使用。
  • 加入不等大小的块反向(反向(一),反向(B))

这篇关于算法切片平面(代替)从RGB值的阵列的的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆