Cuda virtual class


Problem description

I would like to execute some virtual methods in a Cuda kernel, but instead of creating the object in the same kernel I would like to create it on the host and copy it to GPU memory.

I am successfully creating objects in a kernel and calling a virtual method. The problem arises when copying the object, which makes sense because the virtual function pointer is obviously bogus. What happens is simply "Cuda grid launch failed", at least that is what Nsight reports. But when having a look at the SASS it crashes on the dereference of the virtual function pointer, which also makes sense.
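For illustration, here is a minimal sketch of the failing pattern (the class and variable names are mine, not from the original question): the object is constructed on the host, its bytes are copied to the GPU, and the kernel then dereferences a vtable pointer that refers to host code.

#include <cstdio>
#include <cuda_runtime.h>

struct Base
{
    __host__ __device__ virtual int value() const { return 0; }
};

struct Derived : public Base
{
    __host__ __device__ virtual int value() const { return 42; }
};

__global__ void callVirtual(Base* obj, int* out)
{
    *out = obj->value();   // dereferences the host vtable pointer -> grid launch failure
}

int main()
{
    Derived hostObj;                      // constructed on the host, so its vtable pointer is a host address
    Derived* devObj;
    int* devOut;
    cudaMalloc((void**)&devObj, sizeof(Derived));
    cudaMalloc((void**)&devOut, sizeof(int));
    cudaMemcpy(devObj, &hostObj, sizeof(Derived), cudaMemcpyHostToDevice);  // copies the bogus vtable pointer along
    callVirtual<<<1, 1>>>(devObj, devOut);
    printf("%s\n", cudaGetErrorString(cudaDeviceSynchronize()));
    cudaFree(devObj);
    cudaFree(devOut);
    return 0;
}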

I am of course using Cuda 4.2 as well as compiling with "compute_30" on a fitting card.
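For reference, the corresponding compile line would look roughly like this (the file name is just a placeholder):

nvcc -gencode arch=compute_30,code=sm_30 -o app app.cu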

So what is the recommended way to go? Or is this feature simply not supported?

I had the idea to run a different kernel first which creates dummy objects, and to extract the virtual function pointer to "patch" my objects before copying them. Sadly this is not really working (I haven't figured it out yet), and it would be an ugly solution anyway.

P.S. This is actually a rerun of this question, which sadly was never fully answered.

EDIT:

So I found a way to do what I wanted. But just to be clear: this is not at all an answer or solution, the answer was already provided. This is only a hack, just for fun.

So first let's see what Cuda is doing when calling a virtual method; below is the debug SASS:

//R0 is the address of our object
LD.CG R0, [R0];        // load the vtable pointer stored at the start of the object
IADD R0, R0, 0x4;      // advance 4 bytes into the vtable
NOP;
MOV R0, R0;
LD.CG R0, [R0];        // load that vtable entry: an index into constant bank 2
...
IADD R0, RZ, R9;       // RZ is the zero register, so this just copies the index out of R9
MOV R0, R0;
LDC R0, c[0x2][R0];    // fetch the actual branch target from constant memory
...
BRX R0 - 0x5478        // indexed branch to the virtual method

So assuming that "c[0x2][INDEX]" is constant for all kernels, we can get the index for a class by just running a kernel and doing this, where obj is a newly created object of the class we are looking at:

unsigned int index = *(unsigned int*)(*(unsigned int*)obj + 4);
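For context, this line would run inside a kernel along the lines of the following sketch. Foo is a placeholder for one of your classes with virtual methods, and the pointer arithmetic assumes 32-bit device pointers, as in the snippet above.

struct Foo                                   // placeholder class with a virtual device method
{
    __device__ virtual int value() const { return 42; }
};

__global__ void harvestIndex(unsigned int* out)
{
    Foo* obj = new Foo();                                // dummy object created on the device, so its vtable is valid there
    *out = *(unsigned int*)(*(unsigned int*)obj + 4);    // read the entry at vtable offset 4, i.e. the index into c[0x2]
    delete obj;
}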

And then use a struct like this:

struct entry
{
    unsigned int vfptr;   // := &vfref, that's our value to store in an object
    int dummy;            // := 1234, great for debugging
    unsigned int vfref;   // := &dummy
    unsigned int index;
    char ClassName[256];  // use it as a key for a dict
};

Store this in host as well as device memory (the memory locations are device ones), and on the host you can use the ClassName as a lookup for an object to "patch".
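A rough sketch of how that host-side patch might look, under the same assumptions (32-bit device pointers, an entry filled in on the device and copied back to the host; the function and variable names are mine):

#include <cstring>
#include <cuda_runtime.h>

// Overwrite the object's vtable pointer with the fake device-side one, then upload it.
void patchAndUpload(void* hostObj, size_t size, const entry& e, void* devObj)
{
    std::memcpy(hostObj, &e.vfptr, sizeof(e.vfptr));              // first word of the object := e.vfptr (device address of e.vfref)
    cudaMemcpy(devObj, hostObj, size, cudaMemcpyHostToDevice);    // copy the patched object into device memory
}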

But again: I would not use this in anything serious, because performance-wise virtual functions are not great at all.

Recommended answer

What you are trying to do is not supported, currently, by the CUDA compiler and runtime (as of CUDA 5.0). Section D.2.6.3 of the CUDA C Programming Guide v5.0 reads:

D.2.6.3 Virtual Functions

When a function in a derived class overrides a virtual function in a base class, the execution space qualifiers (i.e., __host__, __device__) on the overridden and overriding functions must match.

It is not allowed to pass as an argument to a __global__ function an object of a class with virtual functions.

The virtual function table is placed in global or constant memory by the compiler.

What I recommend is that you encapsulate the data of your class separately from the functionality of the class. For example, store the data in a struct. If you plan to operate on arrays of these objects, store the data in a structure of arrays (for performance -- outside the scope of this question). Allocate the data structures on the host using cudaMalloc, and then pass the data to the kernel as arguments, rather than passing the class with virtual methods.

Then construct your objects with virtual methods on the device. The constructor of your class with virtual methods would take the device pointer kernel parameters as arguments. The virtual device methods could then operate on the device data.
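A minimal sketch of what this recommendation might look like in practice (all names are illustrative, and the data layout is a single array rather than a full structure of arrays): the kernel receives a plain data struct as its argument, and the object with virtual methods is constructed on the device.

#include <cuda_runtime.h>

struct ParticleData                 // plain data struct: no virtual functions, safe to pass to a kernel
{
    float* position;                // device pointer allocated on the host with cudaMalloc
    int    count;
};

class Integrator                    // base class with virtual device methods
{
public:
    __device__ Integrator(ParticleData d) : data(d) {}
    __device__ virtual ~Integrator() {}
    __device__ virtual void step(int i) = 0;
protected:
    ParticleData data;
};

class EulerIntegrator : public Integrator
{
public:
    __device__ EulerIntegrator(ParticleData d) : Integrator(d) {}
    __device__ virtual void step(int i) { data.position[i] += 1.0f; }
};

__global__ void integrate(ParticleData data)    // the data struct, not the class, is the kernel argument
{
    EulerIntegrator integrator(data);           // constructed on the device, so its vtable pointer is valid
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < data.count)
        integrator.step(i);
}

int main()
{
    ParticleData data;
    data.count = 256;
    cudaMalloc((void**)&data.position, data.count * sizeof(float));
    cudaMemset(data.position, 0, data.count * sizeof(float));
    integrate<<<1, 256>>>(data);
    cudaDeviceSynchronize();
    cudaFree(data.position);
    return 0;
}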

The same approach would work to enable allocating the data in one kernel on the device, and accessing it in another kernel on the device (since, again, classes with virtual functions can't be parameters to the kernels).
