GPU Memory not freeing itself after CUDA script execution
Question
I am having an issue with my graphics card retaining memory after the execution of a CUDA script (even with the use of cudaFree()).
On boot, the total used memory is about 128 MB, but after the script runs it runs out of memory mid-execution.
nvidia-smi output:
+------------------------------------------------------+
| NVIDIA-SMI 340.29     Driver Version: 340.29         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 660 Ti  Off  | 0000:01:00.0     N/A |                  N/A |
| 10%   43C    P0    N/A /  N/A |  2031MiB /  2047MiB  |     N/A      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+
Is there any way to free this memory back up without rebooting, perhaps a terminal command?
Also, is this normal behaviour if I am not managing my memory correctly in a CUDA script, or should this memory be freed automatically when the script stops / is quit?
Thanks.
Answer
The CUDA runtime API automatically registers a teardown function which will destroy the CUDA context and release any GPU resources the application was using. As long as the application implicitly or explicitly calls exit(), no further user action is required to free resources like GPU memory.
If you do find that memory doesn't seem to be released when running CUDA code, the usual suspects are suspended or background instances of that code (or other code) which never called exit() and therefore never destroyed their context. That was the cause in this case.
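On Linux, one way to hunt down such stale processes is to check what still has the NVIDIA device files open. This is a sketch, assuming the fuser utility (from the psmisc package) is available and that the device nodes live under /dev/nvidia*:

```shell
# List processes that still hold the NVIDIA device files open;
# these are the likely candidates keeping GPU memory allocated.
fuser -v /dev/nvidia* 2>/dev/null || true

# Once a stale process is identified, terminate it by PID, e.g.:
# kill -9 <PID>
```

After the offending process exits, its context is destroyed and the driver reclaims the memory, which you can confirm by re-running nvidia-smi.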
NVIDIA do provide an API function, cudaDeviceReset, which will initiate context destruction at the time of the call. It shouldn't usually be necessary to use this function in well-designed CUDA code; rather, you should try to ensure that there is a clean exit() or return path from main() in your program. This ensures that the context destruction handler registered by the runtime library is called and resources are freed.
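A minimal sketch of this clean-exit pattern (illustrative only; the 1 MiB buffer size is arbitrary, and the file would be compiled with nvcc):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    float *d_buf = nullptr;

    // Allocate 1 MiB on the device, checking the return status.
    if (cudaMalloc(&d_buf, 1 << 20) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // ... kernel launches and other work would go here ...

    cudaFree(d_buf);     // release the allocation explicitly
    cudaDeviceReset();   // optional: destroy the context immediately

    // A clean return from main() also triggers the runtime's
    // registered teardown, so cudaDeviceReset() is usually redundant.
    return 0;
}
```

Note that cudaDeviceReset() destroys the entire context, so any device pointers become invalid after the call; it belongs at the very end of the program, if it is used at all.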