"sys.getsizeof(int)" returns an unreasonably large value?


Question

I want to check the size of the int data type in Python:

import sys
sys.getsizeof(int)

The result is "436", which doesn't make sense to me. What I really want to know is how many bytes (2, 4, ...?) an int takes up on my machine.

Solution

The short answer

You're getting the size of the class, not of an instance of the class. Call int to get the size of an instance:

>>> sys.getsizeof(int())
24
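To make the class-versus-instance distinction concrete, here is a minimal sketch that measures both. The exact numbers depend on the Python version and platform, so none are shown; the variable names are only illustrative.

```python
import sys

# Size of the type object itself (the class `int`), which is what the question measured.
type_size = sys.getsizeof(int)

# Size of one instance of that class (int() returns the integer 0).
instance_size = sys.getsizeof(int())

print("type object:", type_size, "bytes")
print("instance:   ", instance_size, "bytes")
```

On CPython the first number should be much larger than the second, because a type object carries its whole method table and other per-class data.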

If that size still seems a little large, remember that a Python int is very different from an int in (for example) C. In Python, an int is a fully-fledged object, which means there's extra overhead.

Every Python object contains at least a refcount and a reference to the object's type in addition to other storage; on a 64-bit machine, that takes up 16 bytes! The int internals (as determined by the standard CPython implementation) have also changed over time, so that the amount of additional storage taken depends on your version.
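As a rough check of that 16-byte figure, the sketch below estimates the header as one refcount field plus one pointer, using the struct module. It assumes a standard CPython build in which the refcount occupies a Py_ssize_t; other implementations or build options may lay objects out differently.

```python
import struct
import sys

# "n" is a C ssize_t (assumed size of the refcount), "P" is a C pointer (the type reference).
header_bytes = struct.calcsize("nP")

print("estimated object header:", header_bytes, "bytes")
print("basic size of int:      ", int.__basicsize__, "bytes")
print("size of the int 1:      ", sys.getsizeof(1), "bytes")
```

Whatever is left after subtracting the header from the reported size is what the int uses for its own bookkeeping and value.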

Some details about int objects in Python 2 and 3

Here's the situation in Python 2 (some of this is adapted from a blog post by Laurent Luce). Integer objects are represented as blocks of memory with the following structure:

typedef struct {
    PyObject_HEAD
    long ob_ival;
} PyIntObject;

PyObject_HEAD is a macro defining the storage for the refcount and the object type. It's described in some detail by the documentation, and the code can be seen in this answer.

The memory is allocated in large blocks so that there's not an allocation bottleneck for every new integer. The structure for the block looks like this:

struct _intblock {
    struct _intblock *next;
    PyIntObject objects[N_INTOBJECTS];
};
typedef struct _intblock PyIntBlock;

These are all empty at first. Then, each time a new integer is created, Python hands out the next free integer object from the current block. When a block is used up, a new block is allocated and linked to the earlier ones through next.
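The idea of block-wise allocation can be illustrated with a toy model in Python. This is not CPython's code: the class names ToyIntBlock and ToyIntPool and the capacity of 4 are hypothetical, chosen only to show how one large allocation serves several integers.

```python
class ToyIntBlock:
    """One block holding a fixed number of integer slots (toy model)."""

    def __init__(self, capacity, next_block=None):
        self.slots = [None] * capacity   # stands in for objects[N_INTOBJECTS]
        self.used = 0                    # number of slots handed out so far
        self.next = next_block           # link to the previously allocated block


class ToyIntPool:
    """Hands out slots from the newest block, adding a block when it fills up."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.head = None                 # most recently allocated block
        self.block_count = 0

    def new_int(self, value):
        if self.head is None or self.head.used == self.capacity:
            # One large allocation covers the next `capacity` integers.
            self.head = ToyIntBlock(self.capacity, self.head)
            self.block_count += 1
        block = self.head
        block.slots[block.used] = value
        block.used += 1
        return block


pool = ToyIntPool(capacity=4)
for value in range(5):
    pool.new_int(value)
print(pool.block_count)  # 2: the fifth integer did not fit in the first block
```

The point of the design is that the expensive step (getting a new block) happens once per block rather than once per integer.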

Once a value exceeds the storage capacity of an ordinary integer (a C long), Python 2 stores it as a long object instead, and the size gets larger. On my machine, in Python 2:

>>> sys.getsizeof(0)
24
>>> sys.getsizeof(1)
24
>>> sys.getsizeof(2 ** 62)
24
>>> sys.getsizeof(2 ** 63)
36

In Python 3, I think the general picture is the same, but the size of integers increases in a more piecemeal way:

>>> sys.getsizeof(0)
24
>>> sys.getsizeof(1)
28
>>> sys.getsizeof(2 ** 30 - 1)
28
>>> sys.getsizeof(2 ** 30)
32
>>> sys.getsizeof(2 ** 60 - 1)
32
>>> sys.getsizeof(2 ** 60)
36

These results are, of course, all hardware-dependent! YMMV.
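Rather than guessing where the size steps occur, you can read the digit width from sys.int_info. The following is a minimal sketch, assuming CPython 3, where sys.int_info reports the number of bits and bytes per internal digit; the loop bounds are arbitrary.

```python
import sys

bits = sys.int_info.bits_per_digit       # commonly 30, but build-dependent
digit_bytes = sys.int_info.sizeof_digit  # bytes used to store one digit

# Measure the size just below and just at each digit boundary.
for digits in (1, 2, 3):
    below = 2 ** (bits * digits) - 1     # largest value that fits in `digits` digits
    at = 2 ** (bits * digits)            # smallest value that needs one more digit
    print(digits, sys.getsizeof(below), sys.getsizeof(at))

# Growth across a single boundary should equal the size of one digit.
growth = sys.getsizeof(2 ** bits) - sys.getsizeof(2 ** bits - 1)
print("growth per digit:", growth, "bytes")
```

With 30-bit digits stored in 4 bytes, this matches the steps seen above at 2 ** 30 and 2 ** 60.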

The variability in integer size in Python 3 is a hint that they may behave more like variable-length types (like lists). And indeed, this turns out to be true. Here's the definition of the C struct for int objects in Python 3:

struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};

The comments that accompany this definition summarize Python 3's representation of integers. Zero is represented not by a stored value, but by an object with size zero (which is why sys.getsizeof(0) is 24 bytes while sys.getsizeof(1) is 28). Negative numbers are represented by objects with a negative size attribute! So weird.
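One visible consequence is that the sign of a number does not change its size, since only the magnitude is stored as digits. The sketch below assumes CPython 3; digits_needed is a hypothetical helper that estimates the digit count from the bit length, and it is not part of any library. How zero is reported may differ between Python versions, so no sizes are shown.

```python
import sys

bits = sys.int_info.bits_per_digit


def digits_needed(n):
    """Number of internal digits required for abs(n); zero needs none."""
    return -(-abs(n).bit_length() // bits)   # ceiling division


for n in (0, 1, -1, 2 ** bits, -(2 ** bits)):
    print(n, digits_needed(n), sys.getsizeof(n))
```

Each positive value and its negation should print the same size, because they need the same number of digits.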
