OpenMP bitmap-matrix computation


Question

Hello world! I am trying to find a good way to convert between a bitmap and a matrix in both directions. I use LockBits on the bitmap, and that is already fairly fast. I recently learned about OpenMP, which got me wondering whether it can be used for a transfer like this one. When I tried "omp for" as shown below, I got random colors in the imported/exported images. I understand the theory of parallel computing, but I struggle with it in real code. Can anybody help with a working OpenMP implementation for this case?

(This is a Windows Forms C++ application, currently handling 24-bit bitmaps.)

int i,j,pos;
int width=bitmap->Width;
int height=bitmap->Height;

float **mat;
mat=new float*[width];
for(i=0;i<width;i++)
    mat[i]=new float [height];

BitmapData ^bitmapData;
bitmapData=gcnew BitmapData();
System::Drawing::Rectangle a(0,0,width,height);
bitmap->LockBits(a, System::Drawing::Imaging::ImageLockMode::ReadOnly, bitmap->PixelFormat, bitmapData);
unsigned char *ptr=(unsigned char*)bitmapData->Scan0.ToPointer();

int stride=bitmapData->Stride;
//#pragma omp parallel for private(j)
for (int i=0;i<width;i++)
    for(int j=0;j<height;j++)
    {
        pos=(j*stride)+(i*3);
        mat[i][j]=(float)(ptr[pos]+ptr[pos+1]+ptr[pos+2])/3;
    }
bitmap->UnlockBits(bitmapData);
return mat;

Recommended answer

You're sharing the single 'pos' variable among all of your threads.

You'll get random values, because every thread reads and writes the same memory location. Who knows what value it will hold by the time a thread gets to the next line? Even within that line, the value can change mid-execution.

Try this.

int pos=(j*stride)+(i*3);
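
To make the one-line fix concrete, here is a minimal sketch of the whole loop with `pos` declared inside the parallel region. It assumes the pixel data is reachable through a plain byte pointer; the function name `toGrayMatrix` and the use of `std::vector` instead of `new float*[]` are illustrative choices, not part of the original code, and `ptr` and `stride` stand in for `bitmapData->Scan0` and `bitmapData->Stride`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the thread): averages the three channel
// bytes of every 24-bit pixel into mat[column][row].
std::vector<std::vector<float>> toGrayMatrix(const unsigned char* ptr,
                                             int width, int height, int stride)
{
    assert(ptr != nullptr && stride >= width * 3);
    std::vector<std::vector<float>> mat(width, std::vector<float>(height));

    // i is the parallel loop variable, so OpenMP makes it private automatically.
    #pragma omp parallel for
    for (int i = 0; i < width; i++)
    {
        // j and pos are declared inside the parallel loop body,
        // so every thread works on its own copies.
        for (int j = 0; j < height; j++)
        {
            int pos = (j * stride) + (i * 3);
            mat[i][j] = (float)(ptr[pos] + ptr[pos + 1] + ptr[pos + 2]) / 3;
        }
    }
    return mat;
}
```

Each thread writes only to its own `mat[i]` column, so no further synchronization should be needed. If the compiler is run without OpenMP support, the pragma is ignored and the loop simply runs serially.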



You might also get some speed improvement by doing it like this:

#pragma omp parallel for private(i)
for (int i=0;i<width;i++)
{
    unsigned char * LinePtr = ptr + (j * stride);

    for(int j=0;j<height;j++)
    {
        mat[i][j]=(float)((int) *(LinePtr++) + *(LinePtr++) + *(LinePtr++))/3;
    }
}



There is some overhead in spawning the threads. So rather than spawn a thread for each pixel, spawn one for each line. They'll then get more calculations done between the spawn and the destruction.

By precalculating the line, we divide the number of multiplications by height.
Increments are also faster than adds. And by removing the j variable from the inner loop, it gives the compiler more room to optimize. Now it only has two variables to manipulate, 'mat' and 'LinePtr'.

Also notice that I cast the pixels to an int before adding. Adding unsigned chars can lead to value wrapping at the 255 boundary.
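
The line-pointer idea can be sketched as a self-contained function. This is a hedged variant rather than the answer's exact code: the outer loop runs over image rows so that the row start is computed once per row, the function name `toGrayMatrixByRow` and the `std::vector` storage are assumptions made for the example, and the three bytes are read by index followed by a single `linePtr += 3`. The last choice is deliberate, because modifying the same pointer several times inside one expression is unsequenced in C++. The sum is held in an `int`; in C++ `unsigned char` operands are promoted to `int` before the addition, so it cannot wrap at 255.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the thread); ptr and stride stand in for
// bitmapData->Scan0 and bitmapData->Stride of a 24-bit bitmap.
std::vector<std::vector<float>> toGrayMatrixByRow(const unsigned char* ptr,
                                                  int width, int height, int stride)
{
    assert(ptr != nullptr && stride >= width * 3);
    std::vector<std::vector<float>> mat(width, std::vector<float>(height));

    #pragma omp parallel for
    for (int y = 0; y < height; y++)
    {
        // One multiplication per row; linePtr is local to the loop body,
        // so each thread has its own copy.
        const unsigned char* linePtr = ptr + (y * stride);

        for (int x = 0; x < width; x++)
        {
            // Read the three channel bytes by index, then advance once.
            int sum = linePtr[0] + linePtr[1] + linePtr[2];
            mat[x][y] = (float)sum / 3;
            linePtr += 3;
        }
    }
    return mat;
}
```

Whether this is actually faster than the indexed version depends on the compiler and the image size, so it is worth measuring both.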


Thanks, you cleared that up very well. I am new to unmanaged code too, but there seems to be a small error in your fix. In case somebody reads this later, I believe it should look like the code below. The idea of increments and line pointers is brilliant, though; thanks a lot.
#pragma omp parallel for private(i)
for (i=0;i<height;i++)
{
    unsigned char * LinePtr = ptr + (i * stride);

    for(int j=0;j<width;j++)
    {
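        // Read the three channel bytes by index and advance once:
        // several LinePtr++ in a single expression are unsequenced in C++.
        mat[j][i]=(float)((int)LinePtr[0] + (int)LinePtr[1] + (int)LinePtr[2])/3;
        LinePtr += 3;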
    }
}


Thanks for the typecasting hint. About that dereferencing ...

float * arrptr=mat[i];
for(int j=0;j<width;j++)
{
    // Index the three bytes and advance LinePtr once per pixel.
    *(arrptr++)=(float)((int)LinePtr[0] + (int)LinePtr[1] + (int)LinePtr[2])/3;
    LinePtr += 3;
}



I was thinking something like that would be a good idea, but in that case I would presumably have to change the internal dimensions of the float ** matrix to [width][height]. Do you have a better suggestion?

I am doing things like Gaussian blur and Laplace filtering on this matrix, so keeping it in a one-dimensional array would really complicate things. But my code has a lot of double and even triple for loops that dereference mat[i][j][u][v] and so on. I consider that slow, but do I have a better choice?
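
One possible direction, offered as a sketch rather than as an answer from the thread, is to keep the data in a single contiguous buffer and hide the index arithmetic behind a two-argument accessor, so filter code can still be written in terms of (x, y). The class `Matrix2D` and the function `laplacian4` below are hypothetical names introduced for this example; the filter is a plain 4-neighbour discrete Laplacian that leaves the border pixels at 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper (not from the thread): one contiguous row-major
// buffer, indexed as m(x, y).
class Matrix2D
{
public:
    Matrix2D(int width, int height)
        : width_(width), height_(height),
          data_(static_cast<std::size_t>(width) * height, 0.0f) {}

    float& operator()(int x, int y) { return data_[index(x, y)]; }
    float operator()(int x, int y) const { return data_[index(x, y)]; }
    int width() const { return width_; }
    int height() const { return height_; }

private:
    std::size_t index(int x, int y) const
    {
        assert(x >= 0 && x < width_ && y >= 0 && y < height_);
        return static_cast<std::size_t>(y) * width_ + x;
    }

    int width_;
    int height_;
    std::vector<float> data_;
};

// 4-neighbour discrete Laplacian of the interior pixels; borders stay 0.
Matrix2D laplacian4(const Matrix2D& src)
{
    const int w = src.width();
    const int h = src.height();
    Matrix2D out(w, h);

    // Each row of the output is written by exactly one thread.
    #pragma omp parallel for
    for (int y = 1; y < h - 1; y++)
    {
        for (int x = 1; x < w - 1; x++)
        {
            out(x, y) = src(x - 1, y) + src(x + 1, y)
                      + src(x, y - 1) + src(x, y + 1)
                      - 4.0f * src(x, y);
        }
    }
    return out;
}
```

With this layout a whole image row is adjacent in memory, which matches the row-by-row order in which the bitmap bytes are read. Whether it beats the `float **` version in practice would need to be measured.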

