CUDA Bayer/CFA demosaicing example

Views: 415 | Posted: 2017/3/4 15:17:55 | Tags: performance, image, cuda


Problem description

I've written a CUDA 4 Bayer demosaicing routine, but it's slower than single-threaded CPU code when running on a 16-core GTS250.
The block size is (16,16) and the image dimensions are a multiple of 16, but changing this doesn't improve it.

Am I doing anything obviously stupid?

--------------- calling routine ------------------

    uchar4 *d_output;
    size_t num_bytes;

    cudaGraphicsMapResources(1, &cuda_pbo_resource, 0);
    cudaGraphicsResourceGetMappedPointer((void **)&d_output, &num_bytes, cuda_pbo_resource);

    // Do the conversion, leave the result in the PBO for display
    kernel_wrapper(imageWidth, imageHeight, blockSize, gridSize, d_output);

    cudaGraphicsUnmapResources(1, &cuda_pbo_resource, 0);

--------------- cuda -------------------------------

    texture<uchar, 2, cudaReadModeElementType> tex;
    cudaArray *d_imageArray = 0;

    __global__ void convertGRBG(uchar4 *d_output, uint width, uint height)
    {
        uint x = __umul24(blockIdx.x, blockDim.x) + threadIdx.x;
        uint y = __umul24(blockIdx.y, blockDim.y) + threadIdx.y;
        uint i = __umul24(y, width) + x;

        // input is GR/BG, output is BGRA
        if ((x < width) && (y < height)) {
            if (y & 0x01) {
                if (x & 0x01) {
                    // odd row, odd col = G in B row
                    d_output[i].x = (tex2D(tex, x+1, y) + tex2D(tex, x-1, y)) / 2;  // B
                    d_output[i].y = (tex2D(tex, x, y));                             // G in B
                    d_output[i].z = (tex2D(tex, x, y+1) + tex2D(tex, x, y-1)) / 2;  // R
                } else {
                    // odd row, even col = B
                    d_output[i].x = (tex2D(tex, x, y));  // B
                    d_output[i].y = (tex2D(tex, x+1, y) + tex2D(tex, x-1, y) + tex2D(tex, x, y+1) + tex2D(tex, x, y-1)) / 4;  // G
                    d_output[i].z = (tex2D(tex, x+1, y+1) + tex2D(tex, x+1, y-1) + tex2D(tex, x-1, y+1) + tex2D(tex, x-1, y-1)) / 4;  // R
                }
            } else {
                if (x & 0x01) {
                    // even row, odd col = R
                    d_output[i].x = (tex2D(tex, x+1, y+1) + tex2D(tex, x+1, y-1) + tex2D(tex, x-1, y+1) + tex2D(tex, x-1, y-1)) / 4;  // B
                    d_output[i].z = (tex2D(tex, x, y));  // R
                    d_output[i].y = (tex2D(tex, x+1, y) + tex2D(tex, x-1, y) + tex2D(tex, x, y+1) + tex2D(tex, x, y-1)) / 4;  // G
                } else {
                    // even row, even col = G in R row
                    d_output[i].x = (tex2D(tex, x, y+1) + tex2D(tex, x, y-1)) / 2;  // B
                    d_output[i].y = (tex2D(tex, x, y));                             // G in R
                    d_output[i].z = (tex2D(tex, x+1, y) + tex2D(tex, x-1, y)) / 2;  // R
                }
            }
        }
    }

    void initTexture(int imageWidth, int imageHeight, uchar *imagedata)
    {
        // single-channel, 8-bit unsigned texture holding the raw Bayer data
        cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc(8, 0, 0, 0, cudaChannelFormatKindUnsigned);
        cutilSafeCall(cudaMallocArray(&d_imageArray, &channelDesc, imageWidth, imageHeight));
        uint size = imageWidth * imageHeight * sizeof(uchar);
        cutilSafeCall(cudaMemcpyToArray(d_imageArray, 0, 0, imagedata, size, cudaMemcpyHostToDevice));
        cutFree(imagedata);

        // bind array to texture reference with point sampling
        tex.addressMode[0] = cudaAddressModeClamp;
        tex.addressMode[1] = cudaAddressModeClamp;
        tex.filterMode = cudaFilterModePoint;
        tex.normalized = false;

        cutilSafeCall(cudaBindTextureToArray(tex, d_imageArray));
    }
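For checking the kernel's output (and as a timing baseline), the same bilinear GR/BG interpolation can be sketched as plain single-threaded C++. This is only an illustration, not the CPU code referred to in the question: the names `Bgra`, `sample`, and `demosaicGRBGReference` are hypothetical, the alpha value of 255 is an assumption (the kernel never writes `.w`), and borders are clamped to the nearest pixel in the spirit of `cudaAddressModeClamp`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical output pixel, laid out like the kernel's uchar4 (x=B, y=G, z=R, w=A).
struct Bgra { std::uint8_t b, g, r, a; };

// Clamped read of the raw Bayer image (nearest-pixel clamp at the borders).
static int sample(const std::vector<std::uint8_t>& raw, int width, int height, int x, int y) {
    x = std::max(0, std::min(x, width - 1));
    y = std::max(0, std::min(y, height - 1));
    return raw[static_cast<std::size_t>(y) * width + x];
}

// Bilinear demosaic of a GR/BG mosaic: even rows are G R G R..., odd rows are B G B G...
std::vector<Bgra> demosaicGRBGReference(const std::vector<std::uint8_t>& raw, int width, int height) {
    assert(width > 0 && height > 0);
    assert(raw.size() == static_cast<std::size_t>(width) * height);
    std::vector<Bgra> out(raw.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            auto at = [&](int dx, int dy) { return sample(raw, width, height, x + dx, y + dy); };
            const int centre = at(0, 0);
            const int horiz  = (at(1, 0) + at(-1, 0)) / 2;
            const int vert   = (at(0, 1) + at(0, -1)) / 2;
            const int cross  = (at(1, 0) + at(-1, 0) + at(0, 1) + at(0, -1)) / 4;
            const int diag   = (at(1, 1) + at(1, -1) + at(-1, 1) + at(-1, -1)) / 4;
            int b, g, r;
            if (y & 1) {
                if (x & 1) { b = horiz;  g = centre; r = vert;   }  // G in B row
                else       { b = centre; g = cross;  r = diag;   }  // B
            } else {
                if (x & 1) { b = diag;   g = cross;  r = centre; }  // R
                else       { b = vert;   g = centre; r = horiz;  }  // G in R row
            }
            out[static_cast<std::size_t>(y) * width + x] =
                Bgra{static_cast<std::uint8_t>(b), static_cast<std::uint8_t>(g),
                     static_cast<std::uint8_t>(r), 255};  // alpha = 255 is an assumption
        }
    }
    return out;
}
```

One way to use it is to copy `d_output` back to the host and compare it with this result. Because the kernel computes `x-1` and `y-1` on unsigned values, pixels at `x = 0` or `y = 0` may not match this clamped version, so comparing interior pixels is the safer check.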
