How can I flush GPU memory using CUDA (physical reset is unavailable)


Problem description

My CUDA program crashed during execution, before its memory was flushed. As a result, device memory remains occupied.

I'm running on a GTX 580, for which nvidia-smi --gpu-reset is not supported.

Placing cudaDeviceReset() at the beginning of the program only affects the current context created by the process and doesn't flush the memory allocated before it.

I'm accessing the Fedora server with that GPU remotely, so a physical reset is quite complicated.

So, the question is: is there any way to flush the device memory in this situation?

Recommended answer

Although it should be unnecessary to do this in anything other than exceptional circumstances, the recommended way to do this on Linux hosts is to unload the nvidia driver by doing

$ rmmod nvidia 

and then reloading it with

$ modprobe nvidia

If the machine is running X11, you will need to stop it manually beforehand and restart it afterwards. The driver initialisation process should eliminate any prior state on the device.
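The sequence above can be sketched as a small script. This is only an illustration under stated assumptions, not part of the original answer: it assumes root privileges, that the device nodes live under /dev/nvidia*, and that fuser (from psmisc) is installed. The names run, reload_nvidia_driver and DRY_RUN are hypothetical helpers introduced here. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the unload/reload sequence described above.
# Assumptions (not from the original answer): root privileges, device
# nodes under /dev/nvidia*, and fuser (psmisc) available.
# DRY_RUN defaults to 1, so commands are printed but not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    printf '%s\n' "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reload_nvidia_driver() {
    # Show processes that still hold the device nodes. rmmod refuses to
    # unload a module that is in use, so these must exit first.
    # fuser exits non-zero when nothing is found, hence "|| true".
    run fuser -v /dev/nvidia* || true

    # If X11 is running, stop it here; the exact command depends on the
    # system, so it is deliberately left out of this sketch.

    run rmmod nvidia || return 1
    run modprobe nvidia || return 1
}

reload_nvidia_driver
```

Running the script as-is shows the planned commands. Setting DRY_RUN=0 and running it as root would execute them. If rmmod reports that the module is in use, the fuser output indicates which processes still need to be stopped.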

This answer was assembled from comments and posted as a community wiki to get the question off the unanswered list for the CUDA tag.
