Why does my class cost so much memory?


Problem description

from array import array

from guppy import hpy

hp = hpy()


class Demo(object):
    # __slots__ avoids a per-instance __dict__; each slot holds one pointer.
    __slots__ = ('v0', 'v1')

    def __init__(self, v0, v1):
        self.v0 = v0
        self.v1 = v1


value = 1.01
ar = array('f')
ar2 = array('f')
for i in range(5000000):
    ar.append(value + i)
    ar2.append(value + i * 0.1 + i * 0.01 + i * 0.001 + i * 0.0001 + i * 0.000001)

a = []
for i in range(5000000):
    # Each array access boxes the stored C float into a new Python float object.
    vex = Demo(ar[i], ar[2])
    a.append(vex)

print "Heap at the end of the function\n", hp.heap()

Here is the output:

Heap at the end of the function
Partition of a set of 15063247 objects. Total size = 650251664 bytes.
 Index    Count   %      Size   % Cumulative   % Kind (class / dict of class)
     0  5000000  33 320000000  49  320000000  49 __main__.Demo
     1 10000108  66 240002592  37  560002592  86 float
     2      368   0  42008896   6  602011488  93 list
     3        2   0  40000112   6  642011600  99 array.array
     4    28182   0   2214784   0  644226384  99 str
     5    12741   0   1058448   0  645284832  99 tuple
     6      189   0    669624   0  645954456  99 dict of module
     7      371   0    588104   0  646542560  99 dict (no owner)
     8      258   0    509232   0  647051792 100 dict of sip.wrappertype
     9     3176   0    406528   0  647458320 100 types.CodeType

I am wondering why the Demo class costs so much memory. Each Demo instance only keeps references to its floats; it doesn't copy the float values.

sys.getsizeof(Demo)  # 984

By my estimate, the Demo instances should cost only about 984*50W=40215176 bytes of memory, but they actually cost 320000000 bytes. That seems unbelievable. Why?

Solution

sys.getsizeof() doesn't recurse into sub-objects, and you only took the size of the class, not of an instance. Each instance takes up 64 bytes, plus 24 bytes per float object (on OS X, using Python 2.7.12):

>>> import sys
>>> d = Demo(1.0, 2.0)
>>> sys.getsizeof(d)
64
>>> sys.getsizeof(d.v0)
24
>>> sys.getsizeof(d) + sys.getsizeof(d.v0) + sys.getsizeof(d.v1)
112

Each slot only reserves memory for a pointer in the instance object; on my machine that's 8 bytes per pointer.
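The following is a minimal sketch of those two points, assuming CPython (where sys.getsizeof() is available). Demo3 and shallow_total are hypothetical names introduced only for this illustration, and the exact byte counts vary by Python version and platform, so the sketch compares sizes instead of hard-coding them.

```python
import struct
import sys


class Demo(object):
    __slots__ = ('v0', 'v1')

    def __init__(self, v0, v1):
        self.v0 = v0
        self.v1 = v1


class Demo3(object):
    # Hypothetical variant with one extra slot, used only for comparison.
    __slots__ = ('v0', 'v1', 'v2')


def shallow_total(instance):
    """Size of the instance plus the objects its slots refer to (one level deep)."""
    total = sys.getsizeof(instance)
    for name in instance.__slots__:
        total += sys.getsizeof(getattr(instance, name))
    return total


d = Demo(1.0, 2.0)
class_size = sys.getsizeof(Demo)      # the class object itself, not an instance
instance_size = sys.getsizeof(d)      # one instance, referenced floats excluded
pointer_size = struct.calcsize('P')   # size of a C pointer on this build
slot_cost = sys.getsizeof(Demo3()) - instance_size  # cost of one extra slot

print("class object:        %d bytes" % class_size)
print("one instance:        %d bytes" % instance_size)
print("instance + floats:   %d bytes" % shallow_total(d))
print("one additional slot: %d bytes (pointer size: %d)" % (slot_cost, pointer_size))
```

Multiplying the size of the class object by the number of instances therefore measures the wrong thing: the class exists once, while every instance pays for its own header and slot pointers, and every referenced float is a separate object.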

There are several differences between your Demo() instances and the array:

  • Instances carry a small amount of overhead to support reference counting and weak references, and they contain a pointer to their class. The arrays store the values directly, without any of that overhead.
  • The instance stores Python floats. These are full-fledged objects, including reference counting and weak reference support. The array stores single precision floats as C values, while the Python float object models double precision floats. So the instance uses 2 * 24 bytes (on my Mac) just for those floats, vs. just 4 bytes per single-precision 'f' value in an array.
  • To track 5 million Demo instances, you also needed to create a list object, which is sized to handle at least 5 million object references. The array stores the C single-precision floats directly.
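The storage difference between an array element and a Python float object can be checked directly. This is a small sketch assuming CPython on a typical platform where a C float is 4 bytes and a C double is 8 bytes; the variable names are illustrative.

```python
import sys
from array import array

single = array('f', [1.01])   # stored as a C single-precision float
double = array('d', [1.01])   # stored as a C double-precision float

float_object_size = sys.getsizeof(1.01)  # a full Python float object

print("array('f') element: %d bytes" % single.itemsize)
print("array('d') element: %d bytes" % double.itemsize)
print("Python float object: %d bytes" % float_object_size)

# Single precision cannot represent 1.01 as closely as a Python float
# (a C double) does, so the value read back from the 'f' array differs.
lost_precision = single[0] != 1.01
kept_precision = double[0] == 1.01
print("value read back from array('f'): %r" % single[0])
```

The compact 'f' storage comes at the price of precision, which is worth keeping in mind when comparing the two layouts purely by size.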

The hp.heap() output only counts the instance footprint, not the referenced float values on each line, but the totals match up:

  • 5 million times 64 bytes is 320,000,000 bytes of memory for the Demo instances.
  • 10 million times 24 bytes is 240,000,000 bytes of memory for the float instances, plus a further 108 floats referenced elsewhere.

Together, these two groups make up the majority of the 15 million Python objects on the heap.

  • The list object you created to hold the instances contains 5 million pointers. That is 40,000,000 bytes just to point to all the Demo instances, plus the accounting overhead for that object. There are a further 367 lists on the heap, referenced by other Python code.
  • The 2 array instances, each holding 5 million 4-byte floats, take 40,000,000 bytes, plus 56 bytes of overhead per array.
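The arithmetic behind these figures can be reproduced from the heap report. The sketch below uses only numbers quoted above (64 bytes per instance, 24 bytes per float, 8-byte pointers, 56 bytes of overhead per array); the list total is taken straight from the report because it also includes the 367 unrelated lists.

```python
# Figures taken from the hp.heap() report and the sizes measured above.
total_bytes = 650251664

demo_bytes = 5000000 * 64              # Demo instances
float_bytes = 10000108 * 24            # float objects, including the 108 extras
array_bytes = 2 * (5000000 * 4 + 56)   # two array('f') objects plus their overhead
list_bytes = 42008896                  # all 368 lists, as reported
list_pointer_bytes = 5000000 * 8       # the pointers held by the big list alone

top_four = demo_bytes + float_bytes + list_bytes + array_bytes

print("Demo instances: %d" % demo_bytes)
print("float objects:  %d" % float_bytes)
print("array objects:  %d" % array_bytes)
print("list pointers:  %d of %d list bytes" % (list_pointer_bytes, list_bytes))
print("top four kinds: %d (%.1f%% of the heap)"
      % (top_four, 100.0 * top_four / total_bytes))
```

The sum of the four largest kinds equals the cumulative figure on the array.array row of the report, so the report is internally consistent with the per-object sizes.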

So array objects are vastly more efficient for storing a large number of numeric values, because they store these as primitive C values. The disadvantage is that Python has to box each value you access, so accessing ar[10] returns a new Python float object.
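The boxing behaviour can be observed with a short sketch. It assumes CPython; object identity is an implementation detail, so the identity comparison is an illustration and not a guarantee of the language.

```python
from array import array

ar = array('f', [1.5, 2.5, 3.5])
boxed_list = [1.5, 2.5, 3.5]

# Each array access builds a fresh Python float from the stored C value.
first = ar[0]
second = ar[0]

# A list already holds references to float objects, so repeated access
# returns the same object.
from_list_a = boxed_list[0]
from_list_b = boxed_list[0]

print("array element type: %s" % type(first).__name__)
print("equal values: %s" % (first == second))
print("same object from array: %s" % (first is second))
print("same object from list:  %s" % (from_list_a is from_list_b))
```

This is why the question's loop ends up with 10 million float objects on the heap even though the arrays themselves hold only raw C values: every Demo(ar[i], ar[2]) call boxes two values, and the instance keeps them alive.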
