GPU RAM occupied but no PIDs

Problem description

nvidia-smi shows the following: about 3.77 GB (3771MiB) is in use on GPU 0, but no processes are listed for GPU 0:

(base) ~/.../fast-autoaugment$ nvidia-smi
Fri Dec 20 13:48:12 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN Xp            Off  | 00000000:03:00.0 Off |                  N/A |
| 23%   34C    P8     9W / 250W |   3771MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN Xp            Off  | 00000000:84:00.0  On |                  N/A |
| 38%   62C    P8    24W / 250W |   2295MiB / 12188MiB |      8%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    1      1910      G   /usr/lib/xorg/Xorg                           105MiB |
|    1      2027      G   /usr/bin/gnome-shell                          51MiB |
|    1      3086      G   /usr/lib/xorg/Xorg                          1270MiB |
|    1      3237      G   /usr/bin/gnome-shell                         412MiB |
|    1     30593      G   /proc/self/exe                               286MiB |
|    1     31849      G   ...quest-channel-token=4371017438329004833   164MiB |
+-----------------------------------------------------------------------------+

Similarly, nvtop shows the same GPU RAM utilization, but the processes it lists have TYPE=Compute, and trying to kill the PIDs it shows produces an error:

(base) ~/.../fast-autoaugment$ kill 27761
bash: kill: (27761) - No such process

How can I reclaim GPU RAM occupied by these apparent ghost processes?

Recommended answer

Use the following command to get insight into the ghost processes occupying GPU RAM:

sudo fuser -v /dev/nvidia*

In my case, the output is:

(base) ~/.../fast-autoaugment$ sudo fuser -v /dev/nvidia*
                     USER        PID ACCESS COMMAND
/dev/nvidia0:        shitals     517 F.... nvtop
                     root       1910 F...m Xorg
                     gdm        2027 F.... gnome-shell
                     root       3086 F...m Xorg
                     shitals    3237 F.... gnome-shell
                     shitals   27808 F...m python
                     shitals   27809 F...m python
                     shitals   27813 F...m python
                     shitals   27814 F...m python
                     shitals   28091 F...m python
                     shitals   28092 F...m python
                     shitals   28096 F...m python

This shows processes that both nvidia-smi and nvtop fail to show. After I killed all of the python processes, the GPU RAM was freed.
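Killing the python processes one by one is tedious when there are many of them. The following is a minimal sketch of how the PIDs could be pulled out of the fuser listing, not something the original answer provides. The helper name gpu_pids_for is hypothetical, and it assumes the layout shown above, where COMMAND is the last column and PID is the third column from the end. fuser -v writes the table to stderr, hence the 2>&1 in the commented usage lines.

```shell
# Hypothetical helper: read `fuser -v` output on stdin and print the unique
# PIDs whose COMMAND column (last field) equals the given name.
# Assumes command names contain no spaces.
gpu_pids_for() {
    awk -v name="$1" '$NF == name && $(NF-2) ~ /^[0-9]+$/ { print $(NF-2) }' | sort -un
}

# Intended use (needs an NVIDIA device node, so it is left commented out):
#   sudo fuser -v /dev/nvidia* 2>&1 | gpu_pids_for python
# To terminate them (plain kill sends SIGTERM; xargs -r is a GNU extension):
#   sudo fuser -v /dev/nvidia* 2>&1 | gpu_pids_for python | xargs -r kill
```

Review the printed PIDs before piping them into kill, since any unrelated python job holding the device would be terminated as well.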

Another thing to try is resetting the GPU with the command:

sudo nvidia-smi --gpu-reset -i 0
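After killing the processes or resetting the GPU, it may be worth confirming that the memory was actually released. A rough sketch, assuming the CSV output of nvidia-smi's --query-gpu option (rows such as "0, 3771"); the helper name gpu_mem_used is hypothetical and not part of the original answer.

```shell
# Hypothetical helper: read "index, memory.used" CSV rows on stdin and print
# the used memory in MiB for the requested GPU index.
gpu_mem_used() {
    awk -F', *' -v idx="$1" '$1 == idx { print $2 }'
}

# Intended use (requires the NVIDIA driver, so it is left commented out):
#   nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits | gpu_mem_used 0
```

A value close to zero for GPU 0 would suggest the ghost allocations are gone.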
