Memory not getting reclaimed on restart of a process

Question

I have a Python job that runs a Caffe net for image processing on NVIDIA GPUs. The job takes images from a RabbitMQ queue, processes them, and then writes the results to another queue. When I restart this job, the processes are killed but the memory is not reclaimed.

So after a certain number of restarts the machine crashes. Once I kill the job, no Python process shows up in ps or top, but the CPU memory is still not reclaimed.

How do I debug this issue?

Edit: the memory in question is CPU memory.

Recommended answer

It is your GPU memory that is not being freed. Get the ID of the process still holding it from the output of:

$ nvidia-smi

and then kill that process:

$ kill -9 <process id>
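
The two commands above can also be scripted. The following is a minimal sketch, not part of the original answer: it assumes `nvidia-smi` is on the PATH and that its `--query-compute-apps` CSV output has the usual `pid, process_name, used_memory` layout. The helper names (`parse_compute_apps`, `list_gpu_processes`, `force_kill`) are hypothetical and chosen only for illustration.

```python
import os
import signal
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' rows into a list of dicts.

    Rows that do not start with a numeric PID are skipped. A memory
    value that is not a plain integer is stored as None.
    """
    apps = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split the PID off the front and the memory off the back, so a
        # comma inside the process name does not break the parse.
        head, _, rest = line.partition(",")
        name, _, mem = rest.rpartition(",")
        pid, name, mem = head.strip(), name.strip(), mem.strip()
        if not pid.isdigit():
            continue
        apps.append({
            "pid": int(pid),
            "name": name,
            "used_mib": int(mem) if mem.isdigit() else None,
        })
    return apps


def list_gpu_processes():
    """Ask nvidia-smi which compute processes currently hold GPU memory."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,process_name,used_memory",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)


def force_kill(pid):
    """Equivalent of `kill -9 <pid>`; the process gets no chance to clean up."""
    os.kill(pid, signal.SIGKILL)
```

On the affected machine, calling `list_gpu_processes()` should return the leftover PIDs, and each one can be reviewed before it is passed to `force_kill`. Killing only PIDs you recognize as belonging to the restarted job is the safer choice, since other users' GPU jobs appear in the same list.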
