Memory leak with PyYAML

Question

I think I have a memory leak when loading a .yml file with the PyYAML library.

I followed these steps:

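import yaml
with open(filename, 'r') as f:  # filename: path to the .yml file
    d = yaml.load(f, Loader=yaml.Loader)  # PyYAML 6.0+ requires an explicit Loader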

The memory used by the process (as reported by top or htop) grew from 60K to 160M, even though the file is smaller than 1M.

Then I ran the following command:

sys.getsizeof(d)

It returned a value lower than 400K.

I also tried running the garbage collector with gc.collect(), but nothing changed.

As you can see, there seems to be a memory leak, but I don't know what is causing it or how to free this memory.

Any ideas?

Recommended answer

Your approach doesn't demonstrate a memory leak; it only shows that PyYAML uses a lot of memory while processing a moderately sized YAML file.

If you were to run:

import yaml

X = 10  # number of times to reload the same file
for x in range(X):
    with open(filename, 'r') as f:  # filename: path to the .yml file
        d = yaml.safe_load(f)

and the memory used by the program grew with the value you set X to, then there would be reason to suspect a memory leak.
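The repeated-load check described above can be sketched with the standard library's tracemalloc module. This is only a minimal sketch: the helper name `traced_memory_per_run` and its `load_once` parameter are made up for illustration, and tracemalloc counts memory allocated through Python's allocator, which is not the same number that top or htop report and may not include allocations made inside C extensions.

```python
import gc
import tracemalloc


def traced_memory_per_run(load_once, runs=10):
    """Call load_once() `runs` times and return the number of
    Python-allocated bytes still held after each call."""
    tracemalloc.start()
    held = []
    for _ in range(runs):
        data = load_once()   # rebinding `data` drops the previous result
        gc.collect()         # collect any reference cycles before measuring
        current, _peak = tracemalloc.get_traced_memory()
        held.append(current)
    tracemalloc.stop()
    return held
```

To try it on the case in the question, pass a function that opens the file and returns yaml.safe_load(f) as `load_once`. If the returned numbers stay roughly flat from run to run, the loader is giving its memory back; a steady climb would point toward a leak.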

In the tests I ran, that is not the case. The default Loader and SafeLoader simply take about 330x the file size in memory (measured on an arbitrary 1 MB simple YAML file, i.e. one without tags), and the CLoader about 145x the file size.
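Since the C-based loader used noticeably less memory in those tests, one option is to prefer it when it is available. The sketch below assumes PyYAML is installed; `PreferredLoader` and `load_yaml_text` are illustrative names, not part of PyYAML. CSafeLoader is the safe counterpart of CLoader and only exists when PyYAML was built with the LibYAML bindings, hence the fallback.

```python
import yaml

# CSafeLoader is only importable when PyYAML was built against LibYAML,
# so fall back to the pure-Python SafeLoader otherwise.
try:
    from yaml import CSafeLoader as PreferredLoader
except ImportError:
    from yaml import SafeLoader as PreferredLoader


def load_yaml_text(text):
    """Parse a YAML document with the C-based safe loader when available."""
    return yaml.load(text, Loader=PreferredLoader)
```

The same Loader argument works when passing an open file object instead of a string.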

Loading the YAML data multiple times doesn't increase that figure, so load() gives back the memory it uses, which means there is no memory leak.

None of this means the overhead is small: it still looks like an enormous amount.

(I am using safe_load() because PyYAML's documentation indicates that load() is not safe on uncontrolled input files.)
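A side note on the sys.getsizeof(d) figure from the question: sys.getsizeof reports only the size of the top-level object itself, not of the keys, values, and nested containers it refers to, so it will usually be far below the real footprint of the loaded data. A rough recursive estimate might look like the sketch below; `deep_getsizeof` is a hypothetical helper, it only follows dicts, lists, tuples, and sets, and it will still not match what top shows for the whole process.

```python
import sys


def deep_getsizeof(obj, _seen=None):
    """Approximate size of obj plus the objects it contains
    (follows dicts, lists, tuples, sets and frozensets only)."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:   # count objects referenced more than once only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size
```

Calling deep_getsizeof(d) on the loaded data should give a number noticeably larger than sys.getsizeof(d) for any non-trivial document.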
