Why is my PCL CUDA code running on the CPU instead of the GPU?

Question

I have some code that uses the pcl::gpu namespace:

pcl::gpu::Octree::PointCloud clusterCloud;
clusterCloud.upload(cloud_filtered->points);

pcl::gpu::Octree::Ptr octree_device (new pcl::gpu::Octree);
octree_device->setCloud(clusterCloud);
octree_device->build();

/*tree->setCloud (clusterCloud);*/

// Create the cluster extractor object for the planar model and set all the parameters
std::vector<pcl::PointIndices> cluster_indices;
pcl::gpu::EuclideanClusterExtraction ec;
ec.setClusterTolerance (0.1);
ec.setMinClusterSize (2000);
ec.setMaxClusterSize (250000);
ec.setSearchMethod (octree_device);
ec.setHostCloud (cloud_filtered);

ec.extract (cluster_indices);

I have installed CUDA and included the required pcl/gpu ".hpp" headers. The code compiles (in a catkin workspace with ROS), but it runs very slowly. According to nvidia-smi, my code is running only on the CPU. I don't know why, or how to fix it.

This code is an implementation of the gpu/segmentation example here: pcl/seg.cpp

Recommended answer

(Posting this as an answer because it is too long for a comment.)

I don't know PCL, but this may be because you are passing a host-side std::vector rather than data that resides on the device.

What are "host side" and "device side", you ask? And what is std?

Well, std is simply the namespace used by the C++ standard library. std::vector is a class template in that library which dynamically allocates memory for the elements you put in it.
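To make that concrete, here is a minimal sketch in plain C++. The helper names is_contiguous_host_buffer and payload_bytes are made up for illustration and are not part of PCL or CUDA. It only shows that a std::vector owns one contiguous block of ordinary host memory, reachable through data(), and how many bytes a copy of its contents would have to transfer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (illustration only): returns true when the elements of v
// sit back to back starting at v.data(), i.e. in one contiguous host allocation.
bool is_contiguous_host_buffer(const std::vector<float>& v) {
    const float* base = v.data();  // raw pointer into host (CPU) memory
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (&v[i] != base + i) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper (illustration only): number of bytes that a copy of the
// vector's contents to another memory space would need to transfer.
std::size_t payload_bytes(const std::vector<float>& v) {
    return v.size() * sizeof(float);
}
```

The pointer returned by data() is a host address, so code running on the GPU cannot simply dereference it. The contents have to be copied into device memory first.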

The point is that the memory a std::vector uses is your main system memory (RAM), which has nothing to do with the GPU. Your PCL library, however, likely requires that you pass data that is in GPU memory, which the data in a std::vector is not. You would need to allocate device-side memory and copy your data there from host-side memory.

See also:

Why don't we have access to device memory on the host side?

Also consult the CUDA Programming Guide on how to perform this allocation and copying (at least at the lowest possible level; PCL may have its own facilities for this).
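For reference, the lowest-level version of that allocation and copying might look roughly like the sketch below. It uses cudaMalloc, cudaMemcpy and cudaFree from the CUDA runtime API. The function name round_trip_through_device and the SKETCH_HAVE_CUDA guard are my own assumptions for illustration: the guard only lets the file compile on a machine without the CUDA toolkit, in which case the function simply reports failure. This is not how PCL does it internally, just the bare pattern.

```cpp
#include <cstddef>
#include <vector>

// Assumption for this sketch: use the CUDA runtime only when its header is
// available, so the file still compiles without the CUDA toolkit installed.
#if defined(__has_include)
#  if __has_include(<cuda_runtime.h>)
#    include <cuda_runtime.h>
#    define SKETCH_HAVE_CUDA 1
#  endif
#endif
#ifndef SKETCH_HAVE_CUDA
#  define SKETCH_HAVE_CUDA 0
#endif

// Hypothetical helper: copies host_in into device memory and back into
// host_out. Returns true only if every CUDA call succeeded.
bool round_trip_through_device(const std::vector<float>& host_in,
                               std::vector<float>& host_out) {
#if SKETCH_HAVE_CUDA
    if (host_in.empty()) {  // nothing to transfer
        host_out.clear();
        return true;
    }
    const std::size_t bytes = host_in.size() * sizeof(float);
    float* device_ptr = nullptr;

    // 1. Allocate device-side memory.
    if (cudaMalloc(reinterpret_cast<void**>(&device_ptr), bytes) != cudaSuccess) {
        return false;  // e.g. no usable GPU or driver
    }

    // 2. Copy the data from host memory to device memory.
    bool ok = cudaMemcpy(device_ptr, host_in.data(), bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;

    // (A kernel or GPU library call would work on device_ptr here.)

    // 3. Copy the data back from device memory to host memory.
    host_out.resize(host_in.size());
    ok = ok && cudaMemcpy(host_out.data(), device_ptr, bytes,
                          cudaMemcpyDeviceToHost) == cudaSuccess;

    // 4. Release the device allocation.
    ok = (cudaFree(device_ptr) == cudaSuccess) && ok;
    return ok;
#else
    (void)host_in;
    (void)host_out;
    return false;  // built without the CUDA toolkit
#endif
}
```

When built against the CUDA toolkit this needs to be linked with the CUDA runtime library (for example by compiling with nvcc). A false return value means the data never reached the GPU, which is worth checking before concluding anything about performance.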
