gpu::morphologyEx is slower than morphologyEx on the CPU?


Problem description

I am writing C++ code to compare the performance of OpenCV's morphologyEx method in its CPU and GPU versions. Here is my code:

#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>
#include <sys/time.h>       
#include <ctime>
using namespace cv;
using namespace std;


double start_timer()
{
     double start_time = (double) getTickCount();
     return start_time;
}

double end_timer(double start_time,int num_tests)
{
    double time = (1000 * ((double) getTickCount() - start_time)/ getTickFrequency());
    cout << "Average time of " << num_tests  << " frames is: " << time/num_tests <<  " ms" << endl;
    return time;
}


int main()
{
    Mat cpuSrc;
    cv::gpu::GpuMat src_gpu, dst_gpu;
    Mat dst;
    Mat element;
    int element_shape = MORPH_RECT;
    element = getStructuringElement(element_shape, Size(10, 10 ), Point(-1, -1) );
    cpuSrc = imread("images.jpeg",CV_LOAD_IMAGE_ANYDEPTH);

    if (!cpuSrc.data)
    {
        cerr << "Cannot read the data" << endl;
        return -1;
    }


    cout << "Starting calculating time for CPU ....." << endl;
    double start_time = start_timer();
    int d = 0;
    while(d<100)
    {
        cv::morphologyEx(cpuSrc, dst, CV_MOP_OPEN, element,Point(-1,-1),1);
        d++; // advance the frame counter; without this the loop never ends
    }

    double total_time_cpu = end_timer(start_time,d);



//--------------------------------------------------------------
    cout << "Starting calculating time for GPU ....." << endl;
    d = 0;
    cv::gpu::GpuMat buf1, buf2;
    gpu::Stream stream;
    double start_time_1 = start_timer();

    while(d<100)
    {
        stream.enqueueUpload(cpuSrc, src_gpu);
        cv::gpu::morphologyEx(src_gpu,dst_gpu,CV_MOP_OPEN,element,
                   buf1,buf2,Point(-1,-1),1,stream);
        stream.enqueueDownload(dst_gpu, dst);
        d++; // advance the frame counter; without this the loop never ends
    }
    stream.waitForCompletion();
    double total_time_gpu = end_timer(start_time_1,d);

    cout << "Gain is: " << total_time_cpu / total_time_gpu << endl;
    return 0;
}

I am using a loop to simulate a video that contains 100 frames. My hardware is an NVIDIA Corporation GF110 [GeForce GTX 570] and an Intel Corporation Xeon E5/Core i7 DMI2. I also timed the upload and download separately: both are very slow for the first frame, but after that they are negligible, roughly 0.02 ms per frame for the upload and 0.1 ms for the download. Most of the time is spent in the morphologyEx operation itself.


The timing results for this simulation are as follows:

For the CPU morphology version, the average time over 100 frames is 0.027349 ms; for the GPU version it is 18.0128 ms.

Could you please help me figure out what might cause such unexpected performance?

Thank you in advance.

Solution

During initialization, you should call:

cv::gpu::setDevice(0);

It will speed up initialization.
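
Since the question reports that the first frame is far slower than the rest, one way to check whether one-time initialization is skewing the average is to run a few untimed warm-up calls before starting the clock. The following is only a minimal sketch of that idea: the helper name `average_ms` and its parameters are hypothetical, it is not part of OpenCV, and it uses `std::chrono` (C++11) in place of `getTickCount()`.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not an OpenCV function): calls `work` `warmup` times
// without timing it, then `iters` times with timing, and returns the mean
// time per timed call in milliseconds.
double average_ms(const std::function<void()>& work, int warmup, int iters)
{
    typedef std::chrono::steady_clock clock;

    // Untimed calls: any one-time setup cost is paid here.
    for (int i = 0; i < warmup; ++i)
        work();

    if (iters <= 0)
        return 0.0; // nothing to average

    const clock::time_point t0 = clock::now();
    for (int i = 0; i < iters; ++i)
        work();
    const clock::time_point t1 = clock::now();

    const std::chrono::duration<double, std::milli> elapsed = t1 - t0;
    return elapsed.count() / iters;
}
```

To apply it to the code in the question, `work` could be a lambda that wraps one frame of processing (the upload, the `cv::gpu::morphologyEx` call, and the download). With an asynchronous `gpu::Stream`, the lambda would presumably also need to call `stream.waitForCompletion()`, otherwise the timer may only measure how long it takes to enqueue the work.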
