Python large variable RAM usage

Problem description

Say there is a dict variable that grows very large during runtime - up into millions of key:value pairs.

Does this variable get stored in RAM, effectively using up all the available memory and slowing down the rest of the system?

Asking the interpreter to display the entire dict is a bad idea, but would it be okay as long as one key is accessed at a time?

Recommended answer

Yes, the dict will be stored in the process memory. So if it gets large enough that there's not enough room in the system RAM, then you can expect to see massive slowdown as the system starts swapping memory to and from disk.
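
If you want to watch this happen rather than guess, one rough check (a minimal sketch, assuming a Unix-like system where the standard-library resource module is available) is to compare the process's peak resident memory before and after filling the dict:

import resource

def peak_rss_mb():
    # ru_maxrss is the peak resident set size; Linux reports it in
    # kilobytes and OS X in bytes, so treat this as a rough indicator
    # rather than an exact figure.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

print("before: %.1f" % peak_rss_mb())
d = dict((n, 0) for n in range(2000000))  # two million small items
print("after:  %.1f" % peak_rss_mb())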

Others have said that a few million items shouldn't pose a problem; I'm not so sure. The dict overhead itself (before counting the memory taken by the keys and values) is significant. For Python 2.6 or later, sys.getsizeof gives some useful information about how much RAM various Python structures take up. Some quick results, from Python 2.6 on a 64-bit OS X machine:

>>> from sys import getsizeof
>>> getsizeof(dict((n, 0) for n in range(5462)))/5462.
144.03368729403149
>>> getsizeof(dict((n, 0) for n in range(5461)))/5461.
36.053470060428495

So the dict overhead varies between 36 bytes per item and 144 bytes per item on this machine (the exact value depending on how full the dictionary's internal hash table is; here 5461 = 2**14//3 is one of the thresholds where the internal hash table is enlarged). And that's before adding the overhead for the dict items themselves; if they're all short strings (6 characters or less, say) then that still adds another >= 80 bytes per item (possibly less if many different keys share the same value).
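
If you want a rough total that includes the key and value objects as well as the dict's own table, you can sum getsizeof over everything the dict holds. This is only a sketch and only makes sense for simple payloads like strings and ints; it does not follow references inside container values, and it counts shared objects once per occurrence:

from sys import getsizeof

def rough_dict_footprint(d):
    # The dict's own hash table plus the key and value objects it
    # references. Does not recurse into containers, and over-counts
    # objects that are shared between several entries.
    total = getsizeof(d)
    for key, value in d.items():
        total += getsizeof(key) + getsizeof(value)
    return total

sample = dict(("key%d" % n, n) for n in range(100000))
print("%.1f MB" % (rough_dict_footprint(sample) / (1024.0 * 1024.0)))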

So it wouldn't take that many million dict items to exhaust RAM on a typical machine.
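
As a back-of-the-envelope check, using only the figures measured above (up to about 144 bytes of dict overhead per item plus at least 80 bytes for short-string items), ten million entries already land in the low gigabytes:

# Rough worst-case estimate from the numbers above: ~144 bytes of
# dict overhead per item plus >= 80 bytes for short-string items.
per_item_bytes = 144 + 80
items = 10 * 1000 * 1000
print("%.1f GB" % (per_item_bytes * items / 1e9))  # roughly 2.2 GB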
