CUDA, NPP Filters

Views: 745. Published: 2016/10/23 19:56:21. Tags: c++ image-processing cuda convolution npp

Problem description

The CUDA NPP library supports image filtering through the nppiFilter_8u_C1R function, but I keep getting errors when I call it. I have no problem getting the boxFilterNPP sample code up and running:
eStatusNPP = nppiFilterBox_8u_C1R(oDeviceSrc.data(), oDeviceSrc.pitch(), 
                                  oDeviceDst.data(), oDeviceDst.pitch(), 
                                  oSizeROI, oMaskSize, oAnchor);
But if I change it to use nppiFilter_8u_C1R instead, eStatusNPP returns error -24 (NPP_TEXTURE_BIND_ERROR). The code below shows the changes I made to the original boxFilterNPP sample.
NppiSize oMaskSize = {5,5};
npp::ImageCPU_32s_C1 hostKernel(5,5);

for(int x = 0 ; x < 5; x++){
    for(int y = 0 ; y < 5; y++){
        hostKernel.pixels(x,y)[0].x = 1;
    }
}

npp::ImageNPP_32s_C1 pKernel(hostKernel);

Npp32s nDivisor = 1;

eStatusNPP = nppiFilter_8u_C1R(oDeviceSrc.data(), oDeviceSrc.pitch(), 
                               oDeviceDst.data(), oDeviceDst.pitch(), 
                               oSizeROI, 
                               pKernel.data(),
                               oMaskSize, oAnchor,
                               nDivisor);
This has been tried on CUDA 4.2 and 5.0, with the same result.

The code runs with the expected result when oMaskSize = {1,1}.
Solution
I had the same problem when I stored my kernel as an ImageCPU/ImageNPP. 

A good solution is to store the kernel as a traditional 1D array on the device. I tried this, and it gave me good results (and none of those unpredictable or garbage images).

Thanks to Frank Jargstorff in this StackOverflow post for the 1D idea.
NppiSize oMaskSize = {5,5};
Npp32s hostKernel[5*5];

for(int x = 0 ; x < 5; x++){
    for(int y = 0 ; y < 5; y++){
        hostKernel[x*5+y] = 1;
    }
}

Npp32s* pKernel; //just a regular 1D array on the GPU
cudaMalloc((void**)&pKernel, 5 * 5 * sizeof(Npp32s));
cudaMemcpy(pKernel, hostKernel, 5 * 5 * sizeof(Npp32s), cudaMemcpyHostToDevice);
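A likely explanation for why the flat array helps (my reading, not something confirmed in this thread): ImageNPP allocates pitched 2D memory, so each row of the kernel is followed by padding, while the filter presumably reads the coefficients as one densely packed array. The CPU-only sketch below imitates that mismatch. The helper names and the 64-byte pitch are hypothetical and chosen for illustration; the real pitch depends on the device and the allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Build a width x height image of 32-bit ones whose rows are padded out to
// pitchBytes, imitating the layout of a pitched 2D allocation.
// Padding bytes are filled with 0xFF so they are easy to spot.
// (Hypothetical helper, not part of NPP.)
std::vector<unsigned char> makePitched(int width, int height, std::size_t pitchBytes) {
    assert(pitchBytes >= width * sizeof(std::int32_t));
    std::vector<unsigned char> buf(pitchBytes * height, 0xFF);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::int32_t one = 1;
            std::memcpy(&buf[y * pitchBytes + x * sizeof(one)], &one, sizeof(one));
        }
    }
    return buf;
}

// Read the first `count` 32-bit values as if the buffer were densely packed,
// which is how a consumer expecting a flat coefficient array would read it.
std::vector<std::int32_t> readAsPacked(const std::vector<unsigned char>& buf, int count) {
    std::vector<std::int32_t> out(count);
    std::memcpy(out.data(), buf.data(), count * sizeof(std::int32_t));
    return out;
}

// Count how many of the values read this way differ from the intended coefficient 1.
int countWrong(const std::vector<std::int32_t>& values) {
    int wrong = 0;
    for (std::int32_t v : values) {
        if (v != 1) ++wrong;
    }
    return wrong;
}
```

Under these assumptions, a 5x5 kernel stored with padded rows yields padding values in place of coefficients when read as a flat array, whereas a 1x1 kernel reads correctly because only the first element is touched. That would be consistent with the {1,1} mask working in the question.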




Using this original image, here's the blurred result that I get from your code with the 1D kernel array:


Other parameters that I used:
Npp32s nDivisor = 25;
NppiPoint oAnchor = {4, 4};
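To see why I set nDivisor to 25 rather than 1, here is a small CPU-only sketch of an integer filter with a divisor. It is not NPP's implementation: filterAt is a hypothetical helper, it ignores anchors and borders, and it assumes the result is saturated to the 0..255 range as is usual for 8-bit output.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Weighted sum of the maskW x maskH window whose top-left corner is (x0, y0),
// divided by `divisor` and clamped to 0..255.
// No border handling: the caller must keep the window inside the image.
// (Hypothetical helper for illustration, not an NPP function.)
int filterAt(const std::vector<std::uint8_t>& img, int imgW,
             const std::vector<std::int32_t>& kernel, int maskW, int maskH,
             int x0, int y0, int divisor) {
    assert(divisor != 0);
    long sum = 0;
    for (int ky = 0; ky < maskH; ++ky) {
        for (int kx = 0; kx < maskW; ++kx) {
            sum += static_cast<long>(kernel[ky * maskW + kx]) *
                   img[(y0 + ky) * imgW + (x0 + kx)];
        }
    }
    long result = sum / divisor;
    if (result < 0) result = 0;      // saturate low (assumed behavior)
    if (result > 255) result = 255;  // saturate high (assumed behavior)
    return static_cast<int>(result);
}
```

With a 5x5 kernel of ones, the weights sum to 25, so dividing by 25 gives the window mean (a box blur). With a divisor of 1 the sum of 25 pixels overflows the 8-bit range for all but very dark regions, so the output would mostly be white.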


                        