Understanding memory allocation for large integers in Python

Question
How does Python allocate memory for large integers?
An int type has a size of 28 bytes, and as I keep increasing the value of the int, the size increases in increments of 4 bytes.
Why 28 bytes initially for any value as low as 1?
Why increments of 4 bytes?
PS: I am running Python 3.5.2 on an x86_64 (64-bit) machine. Any pointers/resources/PEPs on how the (3.0+) interpreters handle such huge numbers are what I am looking for.
Code illustrating the sizes:
>>> a=1
>>> print(a.__sizeof__())
28
>>> a=1024
>>> print(a.__sizeof__())
28
>>> a=1024*1024*1024
>>> print(a.__sizeof__())
32
>>> a=1024*1024*1024*1024
>>> print(a.__sizeof__())
32
>>> a=1024*1024*1024*1024*1024*1024
>>> a
1152921504606846976
>>> print(a.__sizeof__())
36
Why 28 bytes initially for any value as low as 1?
I believe @bgusach answered that completely; Python uses C structs to represent objects in the Python world, and that includes ints:
struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};
PyObject_VAR_HEAD is a macro that, when expanded, adds another field to the struct (the PyVarObject header, used specifically for objects that have some notion of length), and ob_digit is an array holding the digits of the number's value. That struct is where the boiler-plate size comes from, for small and large Python numbers alike.
Why increments of 4 bytes?
Because, when a larger number is created, the size (in bytes) is a multiple of sizeof(digit); you can see that in _PyLong_New, where the memory allocation for a new longobject is performed with PyObject_MALLOC:
/* Number of bytes needed is: offsetof(PyLongObject, ob_digit) +
   sizeof(digit)*size.  Previous incarnations of this code used
   sizeof(PyVarObject) instead of the offsetof, but this risks being
   incorrect in the presence of padding between the PyVarObject header
   and the digits. */
if (size > (Py_ssize_t)MAX_LONG_DIGITS) {
    PyErr_SetString(PyExc_OverflowError,
                    "too many digits in integer");
    return NULL;
}
result = PyObject_MALLOC(offsetof(PyLongObject, ob_digit) +
                         size*sizeof(digit));
offsetof(PyLongObject, ob_digit) is the 'boiler-plate' (in bytes) of the long object that isn't related to holding its value.
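The same offsetof arithmetic can be mirrored from Python with ctypes. FakeLongObject below is a hypothetical stand-in for this layout on a 64-bit build, not CPython's actual struct, so treat the field names and types as assumptions:

```python
import ctypes

# Hypothetical mirror of PyLongObject on a 64-bit build; the real
# fields come from the PyObject_VAR_HEAD expansion.
class FakeLongObject(ctypes.Structure):
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),    # reference count
        ("ob_type",   ctypes.c_void_p),     # pointer to the type object
        ("ob_size",   ctypes.c_ssize_t),    # number of digits in use
        ("ob_digit",  ctypes.c_uint32 * 1), # first digit of the value
    ]

# Equivalent of offsetof(PyLongObject, ob_digit): the boiler-plate
# that precedes the digits.
boilerplate = FakeLongObject.ob_digit.offset
for ndigits in (1, 2, 3):
    print(ndigits, "digit(s):", boilerplate + ndigits * 4, "bytes")
```

On a 64-bit build this reproduces the 28, 32 and 36 bytes observed in the question.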
digit is defined in the header file holding the struct _longobject as a typedef for uint32_t:
typedef uint32_t digit;
and sizeof(uint32_t) is 4 bytes. That's the amount by which you'll see the size in bytes increase when the size argument to _PyLong_New increases.
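You can watch that per-digit growth directly: sys.int_info reports the digit size the running interpreter uses, and __sizeof__ grows by exactly that much each time one more digit is needed. A sketch assuming a standard CPython build with 30-bit digits:

```python
import sys

bits = sys.int_info.bits_per_digit           # commonly 30 on 64-bit builds
print("sizeof(digit):", sys.int_info.sizeof_digit)

# 2**(bits*n - 1) is the smallest power of two needing exactly n digits,
# so each step below should cost one more digit.
for ndigits in (1, 2, 3):
    n = 2 ** (bits * ndigits - 1)
    print(ndigits, "digit(s):", n.__sizeof__(), "bytes")
```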
Of course, this is just how CPython has chosen to implement it. It is an implementation detail, and as such you won't find much information in PEPs. The python-dev mailing list holds the implementation discussions, if you can find the corresponding thread :-).
Either way, you might find differing behavior in other popular implementations, so don't take this one for granted.