python: class vs tuple huge memory overhead (?)


Question

I'm storing a lot of complex data in tuples/lists, but would prefer to use small wrapper classes to make the data structures easier to understand, e.g.

class Person:
    def __init__(self, first, last):
        self.first = first
        self.last = last

p = Person('foo', 'bar')
print(p.last)
...

would be preferable to

p = ['foo', 'bar']
print(p[1])
...

However, there seems to be a horrible memory overhead:

l = [Person('foo', 'bar') for i in range(10000000)]
# ipython now takes 1.7 GB RAM

del l
l = [('foo', 'bar') for i in range(10000000)]
# now just 118 MB RAM

Why? Is there any obvious alternative solution that I didn't think of?

Thanks!

(I know, in this example the 'wrapper' class looks silly. But when the data becomes more complex and nested, it is more useful.)

Recommended answer

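As others have said in their answers, you'll have to generate different objects for the comparison to make sense. In the question's tuple version, ('foo', 'bar') is a constant, so on CPython the list holds ten million references to a single shared tuple, whereas each Person('foo', 'bar') call creates a new object.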
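The sharing effect in the question's test can be checked directly by counting distinct object identities. This is a minimal sketch, assuming CPython (where a tuple literal made only of constants is stored once in the compiled code); the small `n` and the variable names are chosen here for illustration.

```python
class Person:
    def __init__(self, first, last):
        self.first = first
        self.last = last

n = 1000  # small illustrative size; the question used 10000000

tuples = [('foo', 'bar') for i in range(n)]
people = [Person('foo', 'bar') for i in range(n)]

# Count distinct objects by identity. Both lists are still alive here,
# so every id() value refers to a live, distinct object.
distinct_tuples = len({id(t) for t in tuples})
distinct_people = len({id(p) for p in people})

print('distinct tuple objects: ', distinct_tuples)
print('distinct Person objects:', distinct_people)
```

On CPython the first list is expected to contain one tuple object referenced `n` times, so it costs little more than the list's own pointer array, while the second list owns `n` separate instances.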

So, let's compare some approaches.

tuple

l = [(i, i) for i in range(10000000)]
# memory taken by Python3: 1.0 GB

class Person

class Person:
    def __init__(self, first, last):
        self.first = first
        self.last = last

l = [Person(i, i) for i in range(10000000)]
# memory: 2.0 GB

namedtuple (tuple + __slots__)

from collections import namedtuple
Person = namedtuple('Person', 'first last')

l = [Person(i, i) for i in range(10000000)]
# memory: 1.1 GB

namedtuple is basically a class that extends tuple and sets an empty __slots__, so instances carry no per-instance __dict__ and the field values live in the tuple itself. It adds field getters and some other helper methods (on Python versions before 3.7 you could see the exact generated code by calling it with verbose=True; that parameter has since been removed).
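The structure of a namedtuple class can be inspected at runtime. The sketch below only uses attributes that namedtuple documents or defines (`_fields`, `_asdict()`, `__slots__`); the sample values are illustrative.

```python
from collections import namedtuple

Person = namedtuple('Person', 'first last')
p = Person('foo', 'bar')

# Instances are real tuples, so indexing works alongside attribute access.
print(isinstance(p, tuple))
print(p.last, p[1])

# The generated class declares an empty __slots__, which suppresses
# the per-instance __dict__ that ordinary classes carry.
print(Person.__slots__)

# Helper attributes added on top of plain tuple behaviour.
print(Person._fields)
print(p._asdict())
```

Because the fields are stored in the tuple body, a namedtuple instance is immutable, unlike the wrapper class in the question.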

class Person with __slots__

class Person:
    __slots__ = ['first', 'last']
    def __init__(self, first, last):
        self.first = first
        self.last = last

l = [Person(i, i) for i in range(10000000)]
# memory: 0.9 GB

This is a trimmed-down version of the namedtuple above. A clear winner in these measurements, even better than pure tuples.
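The comparison can be reproduced at a smaller scale without watching the process's RAM usage. The following is a rough sketch using the standard tracemalloc module. The helper `traced_bytes`, the class names, and the size `n` are hypothetical names introduced here, and the numbers will differ between Python versions and from the process-level figures quoted above.

```python
import tracemalloc
from collections import namedtuple


class PlainPerson:
    def __init__(self, first, last):
        self.first = first
        self.last = last


class SlotsPerson:
    __slots__ = ('first', 'last')

    def __init__(self, first, last):
        self.first = first
        self.last = last


TuplePerson = namedtuple('TuplePerson', 'first last')


def traced_bytes(make, n):
    """Return the bytes still allocated after building a list of n objects."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    data = [make(i, i) for i in range(n)]  # distinct objects, as in the answer
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del data
    return after - before


n = 50_000  # far smaller than the 10000000 used above, to keep the run short
results = {
    'tuple': traced_bytes(lambda a, b: (a, b), n),
    'plain class': traced_bytes(PlainPerson, n),
    'namedtuple': traced_bytes(TuplePerson, n),
    'slots class': traced_bytes(SlotsPerson, n),
}

for name, size in results.items():
    print(f'{name:>12}: {size / n:.1f} bytes per element')
```

The per-element figures include the list slot and the integer objects, which are the same for every variant, so the differences between rows reflect the container objects themselves. One trade-off of the __slots__ variant is that instances cannot gain attributes beyond those listed.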
