Default Pinned Memory vs. Zero-Copy Memory


Question

In CUDA, we can use pinned (page-locked) host memory to copy data from the host to the GPU more efficiently than from the default pageable memory allocated with malloc. However, there are two types of pinned memory: default pinned memory and zero-copy pinned memory.

Default pinned memory copies data from the host to the GPU about twice as fast as a normal transfer from pageable memory, so there is definitely an advantage (provided we have enough host memory to page-lock).
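
To make the comparison concrete, here is a minimal sketch that times one host-to-device copy from a malloc buffer and one from a cudaMallocHost buffer. The CHECK macro, the time_h2d_copy helper, and the 64 MiB buffer size are assumptions made for illustration, not part of the original question, and the measured ratio will vary with the GPU, the bus, and the driver.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a CUDA call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s\n",                  \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical helper: time one host-to-device copy, in milliseconds.
static float time_h2d_copy(void *dst, const void *src, size_t bytes) {
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));
    CHECK(cudaEventRecord(start));
    CHECK(cudaMemcpy(dst, src, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));
    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    return ms;
}

int main() {
    const size_t bytes = 64u << 20;  // 64 MiB, an arbitrary test size
    void *pageable = std::malloc(bytes);
    if (pageable == nullptr) return EXIT_FAILURE;
    void *pinned = nullptr;
    void *device = nullptr;
    CHECK(cudaMallocHost(&pinned, bytes));  // page-locked host memory
    CHECK(cudaMalloc(&device, bytes));
    std::memset(pageable, 1, bytes);        // touch every page once
    std::memset(pinned, 1, bytes);

    // Warm-up copy so one-time initialization cost is not measured.
    CHECK(cudaMemcpy(device, pinned, bytes, cudaMemcpyHostToDevice));

    const float ms_pageable = time_h2d_copy(device, pageable, bytes);
    const float ms_pinned = time_h2d_copy(device, pinned, bytes);
    std::printf("pageable: %.3f ms, pinned: %.3f ms\n",
                ms_pageable, ms_pinned);

    CHECK(cudaFree(device));
    CHECK(cudaFreeHost(pinned));
    std::free(pageable);
    return 0;
}
```

A single timed copy is noisy, so averaging several runs would give a more trustworthy number.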

With the other kind of pinned memory, zero-copy memory, we don't need to copy the data from the host into the GPU's DRAM at all. The kernels read the data directly from host memory.
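
A minimal sketch of the zero-copy path, assuming a device that reports canMapHostMemory: the host buffer is allocated with cudaHostAlloc and the cudaHostAllocMapped flag, and the kernel works on the pointer returned by cudaHostGetDevicePointer, with no cudaMemcpy anywhere. The scale kernel, the CHECK macro, and the zero_copy_scale helper are hypothetical names chosen for this example.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a CUDA call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s\n",                  \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Illustrative kernel: multiply every element by `factor` in place.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: scales n floats that live in mapped host memory.
// Returns true if every element ends up equal to 2.0f.
static bool zero_copy_scale(int n) {
    float *host = nullptr;
    float *device_view = nullptr;
    CHECK(cudaHostAlloc((void **)&host, n * sizeof(float),
                        cudaHostAllocMapped));
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    // Device-side address of the same host allocation.
    CHECK(cudaHostGetDevicePointer((void **)&device_view, host, 0));

    const int block = 256;
    const int grid = (n + block - 1) / block;
    scale<<<grid, block>>>(device_view, n, 2.0f);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());  // results are now visible on the host

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (host[i] != 2.0f) ok = false;
    }
    CHECK(cudaFreeHost(host));
    return ok;
}

int main() {
    // Older toolkits required this flag before any other CUDA call;
    // recent runtimes enable host mapping implicitly.
    CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    if (!prop.canMapHostMemory) {
        std::fprintf(stderr, "device cannot map host memory\n");
        return EXIT_FAILURE;
    }
    const bool ok = zero_copy_scale(1 << 20);
    std::printf("zero-copy result %s\n", ok ? "matches" : "differs");
    return ok ? 0 : EXIT_FAILURE;
}
```

Every access the kernel makes here goes to host memory, which is why the trade-off depends on how often the data is touched.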

My question is: which of these pinned-memory types is the better programming practice?

Recommended answer

I think it depends on your application (otherwise, why would they provide both?).

Mapped pinned memory (zero-copy) is useful when:

  • The GPU has no memory of its own and uses host RAM anyway.

  • You load the data exactly once, but you have a lot of computation to perform on it and want to hide memory transfer latency behind that computation.

  • The host side wants to change or add data, or read results, while the kernel is still running (e.g. for communication).

  • The data does not fit into GPU memory.

Note that you can also use multiple streams to copy data and run kernels in parallel.
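
The multi-stream approach can be sketched roughly as follows. The buffer is split into chunks, and each chunk gets its own stream for the upload, the kernel, and the download, so the copy for one chunk can overlap with the kernel for another. The host buffer has to be pinned for cudaMemcpyAsync to run asynchronously. The stream count of 4, the scale kernel, and the pipelined_scale helper are assumptions for illustration, and the helper assumes n is a multiple of the stream count.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a CUDA call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s\n",                  \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Illustrative kernel: multiply every element by `factor` in place.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: processes n floats in chunks, one stream per chunk.
// Assumes n is a multiple of kStreams. Returns true if all results are 2.0f.
static bool pipelined_scale(int n) {
    const int kStreams = 4;
    const int chunk = n / kStreams;
    const size_t chunk_bytes = chunk * sizeof(float);

    float *host = nullptr;
    float *device = nullptr;
    CHECK(cudaMallocHost((void **)&host, n * sizeof(float)));  // pinned
    CHECK(cudaMalloc((void **)&device, n * sizeof(float)));
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    cudaStream_t streams[kStreams];
    for (int s = 0; s < kStreams; ++s) CHECK(cudaStreamCreate(&streams[s]));

    const int block = 256;
    const int grid = (chunk + block - 1) / block;
    for (int s = 0; s < kStreams; ++s) {
        const int off = s * chunk;
        // Upload, compute, and download are ordered within one stream,
        // but different streams are free to overlap with each other.
        CHECK(cudaMemcpyAsync(device + off, host + off, chunk_bytes,
                              cudaMemcpyHostToDevice, streams[s]));
        scale<<<grid, block, 0, streams[s]>>>(device + off, chunk, 2.0f);
        CHECK(cudaGetLastError());
        CHECK(cudaMemcpyAsync(host + off, device + off, chunk_bytes,
                              cudaMemcpyDeviceToHost, streams[s]));
    }
    CHECK(cudaDeviceSynchronize());

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (host[i] != 2.0f) ok = false;
    }
    for (int s = 0; s < kStreams; ++s) CHECK(cudaStreamDestroy(streams[s]));
    CHECK(cudaFree(device));
    CHECK(cudaFreeHost(host));
    return ok;
}

int main() {
    const bool ok = pipelined_scale(1 << 20);
    std::printf("pipelined result %s\n", ok ? "matches" : "differs");
    return ok ? 0 : EXIT_FAILURE;
}
```

How much overlap you actually get depends on the device's copy engines and on the chunk size, so the stream count is worth tuning.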

Pinned but unmapped memory is better when:

  • You load or store the data multiple times. For example, you have several successive kernels performing the work in steps, so there is no need to load the data from the host every time.

  • There is not much computation to perform, so load latencies would not be hidden well.
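
The first case above might look like the sketch below: the data is copied from a pinned host buffer into device memory once, several kernels then work on the device-resident copy in steps, and the result is copied back once. The scale and add_offset kernels, the CHECK macro, and the multi_step helper are hypothetical names made up for this example.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a CUDA call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s\n",                  \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Illustrative step 1 and 3: multiply every element by `factor`.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Illustrative step 2: add `offset` to every element.
__global__ void add_offset(float *data, int n, float offset) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += offset;
}

// Hypothetical helper: one upload, three kernels, one download.
// Starting from 1.0f, each element should become (1 * 2 + 1) * 2 = 6.0f.
static bool multi_step(int n) {
    const size_t bytes = n * sizeof(float);
    float *host = nullptr;
    float *device = nullptr;
    CHECK(cudaMallocHost((void **)&host, bytes));  // pinned, not mapped
    CHECK(cudaMalloc((void **)&device, bytes));
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    // The data crosses the bus once in each direction.
    CHECK(cudaMemcpy(device, host, bytes, cudaMemcpyHostToDevice));

    const int block = 256;
    const int grid = (n + block - 1) / block;
    // Every kernel reads and writes GPU DRAM, not host memory.
    scale<<<grid, block>>>(device, n, 2.0f);
    add_offset<<<grid, block>>>(device, n, 1.0f);
    scale<<<grid, block>>>(device, n, 2.0f);
    CHECK(cudaGetLastError());

    CHECK(cudaMemcpy(host, device, bytes, cudaMemcpyDeviceToHost));

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (host[i] != 6.0f) ok = false;
    }
    CHECK(cudaFree(device));
    CHECK(cudaFreeHost(host));
    return ok;
}

int main() {
    const bool ok = multi_step(1 << 20);
    std::printf("multi-step result %s\n", ok ? "matches" : "differs");
    return ok ? 0 : EXIT_FAILURE;
}
```

With mapped memory, each of the three kernels would instead fetch its input from the host and write its output back to the host, which is the repeated load the bullet warns about.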
