CUDA, NPP Filters


Problem description

The CUDA NPP library supports image filtering with the nppiFilter_8u_C1R function, but I keep getting errors when I call it. I have no problem getting the boxFilterNPP sample code up and running:

eStatusNPP = nppiFilterBox_8u_C1R(oDeviceSrc.data(), oDeviceSrc.pitch(), 
                                  oDeviceDst.data(), oDeviceDst.pitch(), 
                                  oSizeROI, oMaskSize, oAnchor);

But if I change it to use nppiFilter_8u_C1R instead, eStatusNPP returns error -24 (NPP_TEXTURE_BIND_ERROR). The code below shows the changes I made to the original boxFilterNPP sample.

NppiSize oMaskSize = {5,5};
npp::ImageCPU_32s_C1 hostKernel(5,5);

for(int x = 0 ; x < 5; x++){
    for(int y = 0 ; y < 5; y++){
        hostKernel.pixels(x,y)[0].x = 1;
    }
}

npp::ImageNPP_32s_C1 pKernel(hostKernel);

Npp32s nDivisor = 1;

eStatusNPP = nppiFilter_8u_C1R(oDeviceSrc.data(), oDeviceSrc.pitch(), 
                               oDeviceDst.data(), oDeviceDst.pitch(), 
                               oSizeROI, 
                               pKernel.data(),
                               oMaskSize, oAnchor,
                               nDivisor);

I have tried this on CUDA 4.2 and 5.0, with the same result.

The code runs with the expected result when oMaskSize = {1,1}.

Solution

I had the same problem when I stored my kernel as an ImageCPU/ImageNPP.

A good solution is to store the kernel as a traditional 1D array on the device. I tried this, and it gave me good results (and none of those unpredictable or garbage images).

Thanks to Frank Jargstorff, who suggested the 1D approach in a StackOverflow post.

NppiSize oMaskSize = {5,5};
Npp32s hostKernel[5*5];

for(int x = 0 ; x < 5; x++){
    for(int y = 0 ; y < 5; y++){
        hostKernel[x*5+y] = 1;
    }
}

Npp32s* pKernel; //just a regular 1D array on the GPU
cudaMalloc((void**)&pKernel, 5 * 5 * sizeof(Npp32s));
cudaMemcpy(pKernel, hostKernel, 5 * 5 * sizeof(Npp32s), cudaMemcpyHostToDevice);
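
One plausible reading of why the packed 1D array helps (an assumption on my part, not something confirmed in the original posts): ImageNPP allocates pitched memory, so each kernel row may be followed by padding, while the filter reads the coefficients as one contiguous run of width * height values. The CPU-only sketch below illustrates that mismatch. The function names, the pitch of 128 elements, and the -1000 padding value are all hypothetical and chosen only for illustration; the real pitch is whatever the allocator returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a width x height kernel of ones in a buffer whose rows start
// pitchElems elements apart. pitchElems == width means densely packed;
// a larger value leaves padding (set to `padding`) after each row.
// (Hypothetical helper, not part of NPP.)
std::vector<std::int32_t> makeKernelBuffer(int width, int height,
                                           std::size_t pitchElems,
                                           std::int32_t padding)
{
    assert(pitchElems >= static_cast<std::size_t>(width));
    std::vector<std::int32_t> buf(pitchElems * height, padding);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            buf[y * pitchElems + x] = 1;
    return buf;
}

// Sum the first width * height elements, i.e. read the buffer the way a
// consumer that expects a packed array would read it.
// (Hypothetical helper, not part of NPP.)
long long sumAsPacked(const std::vector<std::int32_t>& buf,
                      int width, int height)
{
    long long sum = 0;
    for (int i = 0; i < width * height; ++i)
        sum += buf[i];
    return sum;
}
```

Read this way, a packed 5x5 buffer yields the 25 ones that were written, whereas a pitched one yields the first row followed by padding. A 1x1 kernel reads the same single value in both layouts, which would be consistent with the observation that oMaskSize = {1,1} worked in the question.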


With the 1D kernel array, your code gives me a properly blurred result from the original image.

Other parameters that I used:

Npp32s nDivisor = 25;
NppiPoint oAnchor = {4, 4};
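
To see why the divisor matters for a kernel of 25 ones, here is a small CPU reference for a single output pixel. It is only a sketch under stated assumptions: truncating integer division, saturation to the 0..255 range, and a window addressed by its top-left corner. NPP's exact rounding, coefficient ordering, and anchor handling are not modeled, and filterPixel is a hypothetical name, not an NPP function.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Weighted sum of the kW x kH window whose top-left corner is (x0, y0),
// divided by nDivisor and clamped to the 8-bit range.
// The caller must keep the whole window inside the image.
// (Hypothetical CPU reference, not part of NPP.)
std::uint8_t filterPixel(const std::vector<std::uint8_t>& img, int imgWidth,
                         int x0, int y0,
                         const std::vector<std::int32_t>& kernel,
                         int kW, int kH, std::int32_t nDivisor)
{
    assert(nDivisor != 0);
    long long acc = 0;
    for (int ky = 0; ky < kH; ++ky)
        for (int kx = 0; kx < kW; ++kx)
            acc += static_cast<long long>(kernel[ky * kW + kx]) *
                   img[(y0 + ky) * imgWidth + (x0 + kx)];
    long long v = acc / nDivisor;   // assumed: truncating division
    if (v < 0) v = 0;               // assumed: saturate to 0..255
    if (v > 255) v = 255;
    return static_cast<std::uint8_t>(v);
}
```

On a uniform patch of value 200, a divisor of 25 returns 200 (a plain average), while a divisor of 1 leaves the raw sum of 5000, which clamps to 255 under these assumptions.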
