How much memory can I actually allocate on a CUDA card


Question


I'm writing a server process that performs calculations on a GPU using CUDA. I want to queue up incoming requests until enough memory is available on the device to run the job, but I'm having a hard time figuring out how much memory I can allocate on the device. I have a pretty good estimate of how much memory a job requires (at least how much will be allocated by cudaMalloc()), but I get a device out-of-memory error long before I've allocated the total amount of global memory available.

Is there some kind of formula to compute, from the total global memory, the amount I can allocate? I can play with it until I get an estimate that works empirically, but I'm concerned my customers will deploy different cards at some point and my jerry-rigged numbers won't work very well.

Solution

The size of your GPU's DRAM is an upper bound on the amount of memory you can allocate through cudaMalloc, but there's no guarantee that the CUDA runtime can satisfy a request for all of it in a single large allocation, or even a series of small allocations.

The constraints of memory allocation vary depending on the details of the underlying driver model of the operating system. For example, if the GPU in question is the primary display device, then it's possible that the OS has also reserved some portion of the GPU's memory for graphics. Other implicit state the runtime uses (such as the heap) also consumes memory resources. It's also possible that the memory has become fragmented and no contiguous block large enough to satisfy the request exists.
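Because the reported free memory is only advisory, a server like the one in the question can treat an allocation failure as a normal scheduling event rather than a fatal error. The sketch below is an assumption about how that "filter, then try, then queue" pattern might look; `try_reserve` and `cuda_malloc_stub` are made-up names, and `malloc` stands in for `cudaMalloc` so the sketch runs without a GPU (a real server would call `cudaMemGetInfo` and `cudaMalloc`, and re-queue the job on `cudaErrorMemoryAllocation`).

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for cudaMalloc: returns nullptr on failure,
// the way cudaMalloc returns cudaErrorMemoryAllocation.
static void* cuda_malloc_stub(size_t bytes) {
    return std::malloc(bytes);
}

// Try to reserve memory for a job. Returns false when the job should
// stay queued. reported_free models the value from cudaMemGetInfo.
bool try_reserve(size_t bytes, size_t reported_free, void** out) {
    *out = nullptr;
    if (bytes > reported_free) return false;  // cheap first filter
    *out = cuda_malloc_stub(bytes);           // may still fail even if
    return *out != nullptr;                   // the filter passed
}
```

The key point is that the free-memory check is only a first filter: the actual allocation attempt is the real test, since fragmentation and driver/OS reservations can make a request fail even when enough memory appears to be free.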

The CUDART API function cudaMemGetInfo reports the free and total amount of memory available. As far as I know, there's no similar API call which can report the size of the largest satisfiable allocation request.
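Since no API reports the largest satisfiable request, one way to get the empirical estimate the question mentions is to probe for it with a binary search over allocation attempts. This is a sketch, not documented CUDA behavior; `largest_alloc` is a hypothetical helper, and `malloc`/`free` stand in for wrappers around `cudaMalloc`/`cudaFree` so it runs and can be tested without a device.

```cpp
#include <cstddef>
#include <cstdlib>

// Allocator hooks so the probe works with any malloc-like pair;
// a CUDA build would pass thin wrappers around cudaMalloc/cudaFree.
typedef void* (*alloc_fn)(size_t);
typedef void  (*free_fn)(void*);

// Binary-search the largest request in [0, hi] the allocator can
// satisfy right now. Each probe allocates and immediately frees.
size_t largest_alloc(size_t hi, alloc_fn alloc, free_fn dealloc) {
    size_t lo = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;  // round up so lo advances
        void* p = alloc(mid);
        if (p) { dealloc(p); lo = mid; }      // mid fits: search upward
        else   { hi = mid - 1; }              // mid too big: search down
    }
    return lo;
}
```

The answer is only valid at the instant of the probe (other work may allocate or free memory in between), so it is best treated as a starting heuristic rather than a guarantee.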

