Understanding memory allocation for large integers in Python


Problem description


How does Python allocate memory for large integers?

An int object has a size of 28 bytes, and as I keep increasing its value, the size increases in increments of 4 bytes.

  1. Why 28 bytes initially for any value as low as 1?

  2. Why increments of 4 bytes?

PS: I am running Python 3.5.2 on an x86_64 (64-bit) machine. Any pointers/resources/PEPs on how the (3.0+) interpreters handle such huge numbers are what I am looking for.

Code illustrating the sizes:

>>> a=1
>>> print(a.__sizeof__())
28
>>> a=1024
>>> print(a.__sizeof__())
28
>>> a=1024*1024*1024
>>> print(a.__sizeof__())
32
>>> a=1024*1024*1024*1024
>>> print(a.__sizeof__())
32
>>> a=1024*1024*1024*1024*1024*1024
>>> a
1152921504606846976
>>> print(a.__sizeof__())
36

Solution

Why 28 bytes initially for any value as low as 1?

I believe @bgusach answered that completely; Python uses C structs to represent objects in the Python world, including ints:

struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};

PyObject_VAR_HEAD is a macro that, when expanded, adds another field to the struct (a PyVarObject field, used specifically for objects that have some notion of length), and ob_digit is an array holding the value of the number. That struct is where the boiler-plate size comes from, for both small and large Python numbers.
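As a rough sanity check (a sketch, not something guaranteed by the language: it assumes a 64-bit CPython build, where the expanded header consists of three 8-byte fields, the reference count, the type pointer and the ob_size length field), the 28 bytes for a small value decompose into 24 bytes of that header plus one 4-byte digit:

# Sketch: assumes a 64-bit CPython build where the PyVarObject header is
# ob_refcnt + ob_type + ob_size, 8 bytes each.
header = 3 * 8            # 24 bytes of per-object boiler-plate
one_digit = 4             # sizeof(digit), see below

print(header + one_digit)     # 28
print((1).__sizeof__())       # 28 on such a build
print((1024).__sizeof__())    # still 28: 1024 fits in a single digit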

Why increments of 4 bytes?

Because, when a larger number is created, the size (in bytes) grows in multiples of sizeof(digit); you can see that in _PyLong_New, where the memory for a new longobject is allocated with PyObject_MALLOC:

/* Number of bytes needed is: offsetof(PyLongObject, ob_digit) +
   sizeof(digit)*size.  Previous incarnations of this code used
   sizeof(PyVarObject) instead of the offsetof, but this risks being
   incorrect in the presence of padding between the PyVarObject header
   and the digits. */
if (size > (Py_ssize_t)MAX_LONG_DIGITS) {
    PyErr_SetString(PyExc_OverflowError,
                    "too many digits in integer");
    return NULL;
}
result = PyObject_MALLOC(offsetof(PyLongObject, ob_digit) +
                         size*sizeof(digit));

offsetof(PyLongObject, ob_digit) is the 'boiler-plate' (in bytes) of the long object that isn't related to holding its value.
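Plugging in numbers (assuming that offset comes out to 24 bytes on a 64-bit build, as in the sketch above): one digit gives 24 + 1*4 = 28 bytes, two digits give 24 + 2*4 = 32, and three digits give 24 + 3*4 = 36, which is exactly the sequence of sizes printed in the question.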

digit is defined in the header file holding the struct _longobject as a typedef for uint32_t:

typedef uint32_t digit;

and sizeof(uint32_t) is 4 bytes. That's the amount by which you'll see the size in bytes increase when the size argument to _PyLong_New increases.
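
If you want to inspect this at runtime, sys.int_info reports both the digit size and how many bits of the value each digit actually stores. A small sketch (the exact points where the size jumps depend on the bits-per-digit of your build, commonly 30 on 64-bit systems):

import sys

# sys.int_info describes how the running interpreter stores ints:
# sizeof_digit is the 4 bytes discussed above, bits_per_digit is how many
# bits of the value fit into each digit (commonly 30 on 64-bit builds).
print(sys.int_info)

shift = sys.int_info.bits_per_digit
for exp in (0, shift - 1, shift, 2 * shift, 3 * shift):
    print(exp, (2 ** exp).__sizeof__())
# On a 30-bit-digit build this prints sizes 28, 28, 32, 36, 40:
# another 4 bytes each time one more digit is needed.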


Of course, this is just how CPython has chosen to implement it. It is an implementation detail, and as such you won't find much information in PEPs. The python-dev mailing list would hold implementation discussions, if you can find the corresponding thread :-).

Either way, you might find differing behavior in other popular implementations, so don't take this one for granted.
