Why does the Cuda runtime reserve 80 GiB virtual memory upon initialization?


Question


I was profiling my Cuda 4 program and it turned out that at some stage the running process used over 80 GiB of virtual memory. That was a lot more than I would have expected. After examining the evolution of the memory map over time and correlating it with the line of code being executed, it turned out that after these simple instructions the virtual memory usage bumped up to over 80 GiB:

  int deviceCount;
  cudaError_t err = cudaGetDeviceCount(&deviceCount);
  if (err != cudaSuccess || deviceCount == 0) {
    // perror() reports errno, which CUDA does not set; report the CUDA error instead.
    fprintf(stderr, "No devices supporting CUDA: %s\n", cudaGetErrorString(err));
  }


Clearly, this is the first Cuda call, thus the runtime got initialized. After this the memory map looks like (truncated):

Address           Kbytes     RSS   Dirty Mode   Mapping
0000000000400000   89796   14716       0 r-x--  prg
0000000005db1000      12      12       8 rw---  prg
0000000005db4000      80      76      76 rw---    [ anon ]
0000000007343000   39192   37492   37492 rw---    [ anon ]
0000000200000000    4608       0       0 -----    [ anon ]
0000000200480000    1536    1536    1536 rw---    [ anon ]
0000000200600000 83879936       0       0 -----    [ anon ]


Note the huge memory area that is now mapped into the virtual address space.


Okay, it's maybe not a big problem, since reserving/allocating memory in Linux doesn't do much unless you actually write to that memory. But it's really annoying because, for example, MPI jobs have to be submitted with the maximum amount of vmem usable by the job specified, and 80 GiB is then just a lower bound for Cuda jobs - all the other stuff has to be added on top.


I can imagine that it has to do with the so-called scratch space that Cuda maintains: a kind of memory pool for kernel code that can dynamically grow and shrink. But that's speculation, and that space is allocated in device memory anyway.

Any insights?

Answer


Nothing to do with scratch space; it is the result of the addressing system that allows unified addressing and peer-to-peer access between the host and multiple GPUs. The CUDA driver registers all the GPU memory plus host memory in a single virtual address space using the kernel's virtual memory system. It isn't actually memory consumption, per se; it is just a "trick" to map all the available address spaces into a linear virtual space for unified addressing.
