Unhandled exception with cudaMemcpy2D

Question

I am new to C++ (as well as CUDA and OpenCV), so I apologize for any mistakes on my side. I have existing code that uses CUDA. Until recently it took a decoded .png as input, but now I use a camera to generate live images, and these images are the new input for the code. Here it is:

using namespace cv;

INT height = 2160;
INT width = 3840;
Mat image(height, width, CV_8UC3);
size_t pitch;
uint8_t* image_gpu;

// capture image
VideoCapture camera(0);
camera.set(CAP_PROP_FRAME_WIDTH, width);
camera.set(CAP_PROP_FRAME_HEIGHT, height);
camera.read(image);

// here I checked that image is definitely still a CV_8UC3 Mat with the initial height and width, and it is

cudaMallocPitch(&image_gpu, &pitch, width * 4, height);

// here I use cv::Mat::data to get the pointer to the data of the image:
cudaMemcpy2D(image_gpu, pitch, image.data, width*4, width*4, height, cudaMemcpyHostToDevice);

The code compiles, but I get an "Exception thrown" at the last line (cudaMemcpy2D) with the following error: Exception thrown at 0x00007FFE838D6660 (nvcuda.dll) in realtime.exe: 0xC0000005: Access violation reading location 0x000001113AE10000.

Google did not give me an answer, and I do not know how to proceed from here.

Thanks for any hints!

Recommended answer

A fairly generic way to copy an OpenCV Mat to device memory allocated with cudaMallocPitch is to use the step member of the Mat object, which holds the number of bytes in one row of the host image (including any padding). When allocating device memory, it also helps to keep a clear picture of how that memory is laid out and how the Mat will be copied into it. Here is a simple example demonstrating the procedure for a video frame captured using VideoCapture.

#include<iostream>
#include<cuda_runtime.h>
#include<opencv2/opencv.hpp>

using std::cout;
using std::endl;

size_t getPixelBytes(int type)
{
    switch(type)
    {
        case CV_8UC1:
        case CV_8UC3:
            return sizeof(uint8_t);
            break;
        case CV_16UC1:
        case CV_16UC3:
            return sizeof(uint16_t);
            break;
        case CV_32FC1:
        case CV_32FC3:
            return sizeof(float);
            break;
        case CV_64FC1:
        case CV_64FC3:
            return sizeof(double);
            break;
        default:
            return 0;
    }
}

int main()
{
    cv::VideoCapture cap(0);
    cv::Mat frame;

    if(cap.grab())
    {
        cap.retrieve(frame);
    }
    else
    {
        cout<<"Cannot read video"<<endl;
        return -1;
    }

    uint8_t* gpu_image;
    size_t gpu_pitch;

    // Bytes per channel element. VideoCapture usually returns CV_8UC3 frames, so this is normally 1.
    size_t pixelBytes = getPixelBytes(frame.type());

    // Number of actual data bytes in one row (excluding any row padding).
    size_t frameRowBytes = frame.cols * frame.channels() * pixelBytes;

    // Allocate pitched linear memory on the device.
    cudaMallocPitch(&gpu_image, &gpu_pitch, frameRowBytes, frame.rows);

    // Copy the frame to device memory; frame.step is the host row pitch in bytes.
    cudaMemcpy2D(gpu_image, gpu_pitch, frame.ptr(), frame.step, frameRowBytes, frame.rows, cudaMemcpyHostToDevice);

   //Rest of the code ...
   return 0;
}

Disclaimer: the code was written in the browser and has not been tested. Please add CUDA error checking as required.
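
One way to see why the call in the question faults is to count the bytes a pitched copy reads from the host buffer. The sketch below is plain C++ with no CUDA or OpenCV calls. The helper names bytesTouched, hostBufferBytes, and copyFitsInHostBuffer are hypothetical, and it assumes the Mat is continuous (no row padding), so a CV_8UC3 image holds cols * rows * 3 bytes.

```cpp
#include <cassert>
#include <cstddef>

// End offset (exclusive) of the bytes a pitched 2D copy reads from its source:
// every row except the last advances by a full pitch, the last row reads rowBytes.
std::size_t bytesTouched(std::size_t pitch, std::size_t rowBytes, std::size_t rows)
{
    assert(rowBytes <= pitch);  // a pitched copy needs width-in-bytes <= pitch
    return rows == 0 ? 0 : (rows - 1) * pitch + rowBytes;
}

// Size of a continuous (unpadded) host image buffer, in bytes.
std::size_t hostBufferBytes(std::size_t cols, std::size_t rows, std::size_t bytesPerPixel)
{
    return cols * rows * bytesPerPixel;
}

// True if a copy with the given source pitch and row width stays inside the buffer.
bool copyFitsInHostBuffer(std::size_t bufferBytes, std::size_t pitch,
                          std::size_t rowBytes, std::size_t rows)
{
    return bytesTouched(pitch, rowBytes, rows) <= bufferBytes;
}
```

For the 3840 x 2160 frame in the question, hostBufferBytes(3840, 2160, 3) is 24,883,200 bytes, while a copy whose source pitch and width are both 3840 * 4 bytes reads 33,177,600 bytes, about 8.3 MB past the end of image.data. With 3840 * 3 as the pitch and width, the copy fits exactly, which is what using frame.step and frameRowBytes achieves in the example above.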
