Accessing CUDA device memory while the kernel is running


Problem description

I have allocated memory on the device using cudaMalloc and have passed it to a kernel function. Is it possible to access that memory from the host before the kernel finishes executing?

Recommended answer

The only way I can think of to get a memcpy to kick off while the kernel is still executing is by submitting an asynchronous memcpy in a different stream than the kernel. (If you use the default APIs for either kernel launch or asynchronous memcpy, the NULL stream will force the two operations to be serialized.)
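A minimal sketch of what this looks like, assuming a placeholder kernel myKernel and buffer size N (neither is from the original answer): the kernel is launched in one stream and the device-to-host copy is issued in another, so the copy is not serialized behind the kernel on the NULL stream. Note that, exactly as the answer warns, nothing here orders the copy after the kernel's writes.

#include <cuda_runtime.h>
#include <cstdio>

__global__ void myKernel(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = i;                 // device-side writes
}

int main() {
    const int N = 1 << 20;
    int *d_data = nullptr;
    int *h_data = nullptr;

    cudaMalloc((void **)&d_data, N * sizeof(int));
    cudaMallocHost((void **)&h_data, N * sizeof(int));  // pinned host buffer so the copy can overlap

    cudaStream_t kernelStream, copyStream;
    cudaStreamCreate(&kernelStream);
    cudaStreamCreate(&copyStream);

    // Kernel runs in kernelStream ...
    myKernel<<<(N + 255) / 256, 256, 0, kernelStream>>>(d_data, N);

    // ... while the copy is submitted to copyStream and may overlap with it.
    // RACE: the copy may read elements the kernel has not written yet.
    cudaMemcpyAsync(h_data, d_data, N * sizeof(int), cudaMemcpyDeviceToHost, copyStream);

    cudaDeviceSynchronize();
    printf("h_data[0] = %d\n", h_data[0]);

    cudaStreamDestroy(kernelStream);
    cudaStreamDestroy(copyStream);
    cudaFree(d_data);
    cudaFreeHost(h_data);
    return 0;
}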

But because there is no way to synchronize a kernel's execution with a stream, that code would be subject to a race condition: the copy engine might pull from memory that has not yet been written by the kernel.

The person who alluded to mapped pinned memory is onto something: if the kernel writes to mapped pinned memory, it is effectively "copying" data to host memory as it finishes processing it. This idiom works nicely, provided the kernel will not be touching the data again.
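A minimal sketch of that idiom, under the same assumptions as above (writeResults and N are placeholder names, not from the answer): the host buffer is allocated as mapped pinned memory, the kernel writes through the device alias of that buffer, and the results land directly in host-visible memory as the kernel produces them.

#include <cuda_runtime.h>
#include <cstdio>

__global__ void writeResults(int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i * 2;              // each write goes straight to the pinned host buffer
}

int main() {
    const int N = 1 << 20;

    cudaSetDeviceFlags(cudaDeviceMapHost);  // enable mapped pinned memory before context creation

    int *h_out = nullptr;
    cudaHostAlloc((void **)&h_out, N * sizeof(int), cudaHostAllocMapped);

    int *d_out = nullptr;
    cudaHostGetDevicePointer((void **)&d_out, h_out, 0);  // device alias of the same buffer

    writeResults<<<(N + 255) / 256, 256>>>(d_out, N);

    // The host could inspect h_out while the kernel is still running (with the
    // usual visibility caveats); here we simply wait for completion.
    cudaDeviceSynchronize();
    printf("h_out[0] = %d, h_out[%d] = %d\n", h_out[0], N - 1, h_out[N - 1]);

    cudaFreeHost(h_out);
    return 0;
}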
