GPU - System memory mapping

Question

How is system memory (RAM) mapped for GPU access? I am clear about how virtual memory works for the CPU, but I am not sure how that works when the GPU accesses GPU-mapped system (host) memory. Basically, something related to how data is copied from system memory to host memory and vice versa. Can you provide explanations backed by reference articles, please?

Recommended answer

I found the following slideset quite useful: http://developer.amd.com/afds/assets/presentations/1004_final.pdf

MEMORY SYSTEM ON FUSION APUS
The Benefits of Zero Copy
Pierre Boudier, AMD Fellow of OpenGL/OpenCL
Graham Sellers, AMD Manager of OpenGL

AMD Fusion Developer Summit, June 2011

Be aware, however, that this is a fast-moving area. It is not so much a matter of developing new concepts as of (finally) applying concepts like virtual memory to GPUs. Let me summarize.

In the old days, say prior to 2010, GPUs were usually separate PCI or PCI Express cards or boards. They had some DRAM on board the GPU card, and this on-board DRAM was pretty fast. They could also access DRAM on the CPU side, typically via DMA copy engines across PCI. GPU access to CPU memory like this was usually quite slow.

The GPU memory was not paged. For that matter, the GPU memory was usually uncached, except for the software-managed caches inside the GPU, like the texture caches. "Software managed" means these caches are not cache coherent and must be manually flushed.
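To make "not cache coherent, must be manually flushed" concrete, here is a minimal Python sketch of a toy read cache. It is not a model of any real GPU; the class name SoftwareManagedCache, the address 0x10, and the string values are all made up for illustration. For a read-only cache like a texture cache, the "flush" amounts to invalidating the cached lines.

```python
class SoftwareManagedCache:
    """Toy read cache with no coherence: it never notices writes made behind its back."""

    def __init__(self, memory):
        self.memory = memory   # backing store, also written directly by the "CPU"
        self.lines = {}        # address -> cached value

    def read(self, addr):
        # A hit returns the cached copy, even if memory has changed since the fill.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def invalidate(self):
        # The explicit step that software has to take; nothing does it automatically.
        self.lines.clear()


memory = {0x10: "old texel"}                 # hypothetical address and contents
texture_cache = SoftwareManagedCache(memory)

first_read = texture_cache.read(0x10)        # fills the cache from memory
memory[0x10] = "new texel"                   # the CPU updates memory directly
stale_read = texture_cache.read(0x10)        # still served from the cached copy
texture_cache.invalidate()                   # manual flush/invalidate
fresh_read = texture_cache.read(0x10)        # refetched from memory

print(first_read, stale_read, fresh_read, sep=" | ")
# prints: old texel | old texel | new texel
```

The second read is the bug you get when the flush is forgotten: the data in memory is new, but the GPU-side cache keeps handing back the old copy.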

Typically, only a small section of the CPU DRAM was accessed by the GPU - an aperture. Typically, it was pinned - not subject to paging. Usually it was not even subject to virtual address translation - typically virtual address = physical address, plus maybe some offset.
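The "virtual address = physical address, plus maybe some offset" scheme can be sketched in a few lines of Python. This is only an illustration: the Aperture class, the base address 0x8000_0000, and the 256 MiB size are hypothetical values, not taken from any real hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aperture:
    cpu_base: int  # hypothetical CPU physical address where the pinned window starts
    size: int      # window size in bytes

    def to_cpu_physical(self, gpu_addr: int) -> int:
        # No page-table walk: the GPU address is simply an offset into the window.
        if not 0 <= gpu_addr < self.size:
            raise ValueError(f"GPU address {gpu_addr:#x} is outside the aperture")
        return self.cpu_base + gpu_addr


# Assumed example values: a 256 MiB window starting at 2 GiB.
aperture = Aperture(cpu_base=0x8000_0000, size=256 * 1024 * 1024)

print(hex(aperture.to_cpu_physical(0x1000)))
# prints: 0x80001000
```

Because the translation is a fixed add, the pages behind the window have to stay where they are, which is why the aperture is pinned.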

(Of course, the rest of CPU memory is proper virtual memory: paged, certainly translated, and cached. It's just that the GPU cannot access this safely, because the GPU does (did) not have access to the virtual memory subsystem and the cache coherence system.)

Now, the above works, but it's a pain. Operating on something first inside the CPU and then inside the GPU is slow and error prone. It is also a great security risk: user-provided GPU code often could access (slowly and unsafely) all CPU DRAM, so it could be used by malware.

AMD has announced a goal of more tightly integrating GPUs and CPUs. One of the first steps was to create the "Fusion" APUs, chips containing both CPUs and GPUs. (Intel has done something similar with Sandy Bridge; I expect ARM to do so as well.)

AMD has also announced that they intend to have the GPU use the virtual memory subsystem, and use caches.

A step in the direction of having the GPU use virtual memory is the AMD IOMMU. Intel has something similar. That said, IOMMUs are oriented more towards virtual machines than towards virtual memory for non-virtual-machine OSes.
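As a rough picture of what an IOMMU adds over a raw aperture, here is a toy Python sketch of a per-device translation table with a permission check. It does not follow the AMD or Intel table formats; ToyIOMMU, IOMMUFault, the 4 KiB page size, and the frame number 0x1234 are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch


class IOMMUFault(Exception):
    """Raised when a device access is not covered by the translation table."""


class ToyIOMMU:
    def __init__(self):
        self.table = {}  # device page number -> (physical frame number, writable)

    def map_page(self, device_page, frame, writable=False):
        self.table[device_page] = (frame, writable)

    def translate(self, device_addr, write=False):
        page, offset = divmod(device_addr, PAGE_SIZE)
        entry = self.table.get(page)
        if entry is None:
            # Unmapped: the device cannot reach this memory at all.
            raise IOMMUFault(f"device page {page:#x} is not mapped")
        frame, writable = entry
        if write and not writable:
            raise IOMMUFault(f"device page {page:#x} is read-only")
        return frame * PAGE_SIZE + offset


iommu = ToyIOMMU()
iommu.map_page(0, frame=0x1234)  # hypothetical read-only mapping

print(hex(iommu.translate(0x10)))
# prints: 0x1234010
```

With a table like this between the device and DRAM, GPU code only reaches the pages the OS or hypervisor has explicitly mapped, instead of potentially all of CPU DRAM.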

Systems where the CPU and GPU are inside the same chip typically have the CPU and GPU access the same DRAM chips. So there is no longer a distinction between "on-GPU-board" DRAM and "off-GPU" CPU DRAM.

But there usually still is a split, a partition, of the DRAM on the system motherboard into memory mainly used by the CPU and memory mainly used by the GPU. Even though the memory may live inside the same DRAM chips, typically a big chunk is "graphics". In the paper above it is called "Local" memory, for historical reasons. CPU and graphics memory may be tuned differently - typically the GPU memory is lower priority, except for video refresh, and has longer bursts.

In the paper I refer you to, there are different internal buses: "Onion" for "system" memory, and "Garlic" for faster access to the graphics memory partition. Garlic memory is typically uncached.

The paper I refer to talks about how the CPU and GPU have different page tables. Its subtitle, "The Benefits of Zero Copy", refers to mapping a CPU data structure into the GPU page tables, so that you don't need to copy it.
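The difference between copying and zero copy can be sketched with two toy page tables in Python. This is a simplified illustration, not the mechanism from the slides: the page numbers, frame numbers, and helper names (load, store, copy_to_gpu, map_zero_copy) are hypothetical, and caches and coherence are ignored entirely.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch

physical_frames = {7: bytearray(PAGE_SIZE)}  # frame number -> contents
cpu_page_table = {0x40: 7}                   # CPU virtual page -> frame
gpu_page_table = {}                          # GPU virtual page -> frame


def load(page_table, vaddr):
    page, offset = divmod(vaddr, PAGE_SIZE)
    return physical_frames[page_table[page]][offset]


def store(page_table, vaddr, value):
    page, offset = divmod(vaddr, PAGE_SIZE)
    physical_frames[page_table[page]][offset] = value


def copy_to_gpu(cpu_page, gpu_page, new_frame):
    # Copy path: duplicate the bytes into a second frame, then map that frame.
    physical_frames[new_frame] = bytearray(physical_frames[cpu_page_table[cpu_page]])
    gpu_page_table[gpu_page] = new_frame


def map_zero_copy(cpu_page, gpu_page):
    # Zero-copy path: point the GPU page table at the frame the CPU already uses.
    gpu_page_table[gpu_page] = cpu_page_table[cpu_page]


cpu_addr = 0x40 * PAGE_SIZE + 3
store(cpu_page_table, cpu_addr, 99)          # CPU writes the data structure
copy_to_gpu(0x40, 0x10, new_frame=8)         # GPU page 0x10 gets a private copy
map_zero_copy(0x40, 0x11)                    # GPU page 0x11 shares the CPU frame
store(cpu_page_table, cpu_addr, 100)         # CPU updates the data afterwards

copied_view = load(gpu_page_table, 0x10 * PAGE_SIZE + 3)
shared_view = load(gpu_page_table, 0x11 * PAGE_SIZE + 3)
print(copied_view, shared_view)
# prints: 99 100
```

The copied page is a snapshot that goes stale as soon as the CPU writes again, while the zero-copy mapping sees the update because both page tables resolve to the same physical frame.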

Etc., etc.

This area of the system is evolving rapidly, so the 2011 paper is already almost obsolete. But you should note the trends:

(a) Software WANTS uniform access to CPU and GPU memory - virtual memory and cacheable.

(b) Although hardware tries to provide (a), special graphics memory features nearly always make dedicated graphics memory, even if it is just a partition of the same DRAMs, significantly faster or more power efficient.

The gap may be narrowing, but every time you think it is about to go away, another hardware trick can be played.
