How is tuple implemented in CPython?

Question

I've been trying to learn how CPython is implemented behind the scenes. It's great that Python is high level, but I don't like treating it like a black box.

With that in mind, how are tuples implemented? I've had a look at the source (tupleobject.c), but it's going over my head.

I see that PyTuple_MAXSAVESIZE = 20 and PyTuple_MAXFREELIST = 2000. What is "saving", and what is the "free list"? (Will there be a performance difference between tuples of length 20/21 or 2000/2001? What enforces the maximum tuple length?)

Recommended answer

As a caveat, everything in this answer is based on what I've gleaned from looking over the implementation you linked.

It seems that the standard implementation of a tuple is simply as an array. However, there are a bunch of optimizations in place to speed things up.
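One rough way to see the array layout from Python itself is to measure how a tuple's size grows with its length. This is only a sketch and assumes CPython, where `sys.getsizeof` reports the object header plus one pointer-sized slot per element; other interpreters may report something different.

```python
import struct
import sys

# Size of a C pointer on this build (8 bytes on typical 64-bit platforms).
pointer_size = struct.calcsize("P")

# Sizes of tuples with 1 to 5 elements.
sizes = [sys.getsizeof(tuple(range(n))) for n in range(1, 6)]

# Growth per added element; on CPython each step should be one pointer.
steps = [b - a for a, b in zip(sizes, sizes[1:])]
print(pointer_size, steps)
```

If every step equals the pointer size, that is consistent with a tuple being a fixed header followed by a flat array of object pointers.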

First, if you try to make an empty tuple, CPython instead will hand back a canonical object representing the empty tuple. As a result, it avoids a bunch of allocations that would each just create another copy of the same empty object.
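The shared empty tuple can be observed with an identity check. This is a small sketch that relies on CPython behaviour; the language itself does not promise that empty tuples are the same object, so code should not depend on it.

```python
# Three different ways of asking for an empty tuple.
a = ()
b = tuple()
c = tuple([])

# On CPython all three names are expected to refer to one shared object.
print(a is b, b is c)
```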

Next, to avoid allocating a bunch of small objects, CPython recycles memory for many small tuples. There is a fixed constant (PyTuple_MAXSAVESIZE) such that all tuples shorter than this length are eligible to have their space reclaimed. Whenever a tuple of length less than this constant is deallocated, there is a chance that the memory associated with it will not be freed and instead will be stored in a "free list" (more on that in the next paragraph) based on its size. That way, if you ever need to allocate a tuple of size n and one has previously been allocated and is no longer in use, CPython can just recycle the old array.

The free list itself is implemented as an array of size PyTuple_MAXSAVESIZE storing pointers to unused tuples, where the nth element of the array points either to NULL (if no extra tuples of size n are available) or to a reclaimed tuple of size n. If there are multiple different tuples of size n that could be reused, they are chained together in a sort of linked list by having each tuple's zeroth entry point to the next tuple that can be reused. (Since there is only one tuple of length zero ever allocated, there is never a risk of reading a nonexistent zeroth element). In this way, the allocator can store some number of tuples of each size for reuse. To ensure that this doesn't use too much memory, there is a second constant PyTuple_MAXFREELIST that controls the maximum length of any of these linked lists within any bucket. There is then a secondary array of length PyTuple_MAXSAVESIZE that stores the length of the linked lists for tuples of each given length so that this upper limit isn't exceeded.
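The scheme described above can be modelled in a few lines of Python. This is a toy sketch, not CPython's actual C code: the class `TupleFreeList` and its `alloc`/`dealloc` methods are hypothetical names, Python lists stand in for the raw storage arrays, and only the two constants come from the question.

```python
# Toy model of the tuple free list described above; not CPython's actual code.
MAXSAVESIZE = 20    # mirrors PyTuple_MAXSAVESIZE from the question
MAXFREELIST = 2000  # mirrors PyTuple_MAXFREELIST from the question


class TupleFreeList:
    """Hypothetical model: one bucket per tuple length below max_save_size."""

    def __init__(self, max_save_size=MAXSAVESIZE, max_free_list=MAXFREELIST):
        self.max_save_size = max_save_size
        self.max_free_list = max_free_list
        # free_list[n] is the head of the chain of recycled blocks of length n.
        self.free_list = [None] * max_save_size
        # num_free[n] counts how many blocks are currently chained in bucket n.
        self.num_free = [0] * max_save_size

    def alloc(self, n):
        """Return storage of length n, reusing a recycled block if one exists."""
        if 0 < n < self.max_save_size and self.free_list[n] is not None:
            block = self.free_list[n]
            self.free_list[n] = block[0]  # slot 0 held the link to the next block
            self.num_free[n] -= 1
            block[0] = None
            return block
        return [None] * n  # nothing to reuse: fresh allocation

    def dealloc(self, block):
        """Try to keep the block for reuse; return True if it was kept."""
        n = len(block)
        if 0 < n < self.max_save_size and self.num_free[n] < self.max_free_list:
            for i in range(n):
                block[i] = None           # drop references to the old elements
            block[0] = self.free_list[n]  # chain through the zeroth slot
            self.free_list[n] = block
            self.num_free[n] += 1
            return True
        return False                      # too long, empty, or bucket is full


pool = TupleFreeList(max_free_list=2)
first = pool.alloc(3)
pool.dealloc(first)
second = pool.alloc(3)
print(second is first)  # the freed block was handed back
```

Length 0 is excluded from the buckets in this model because the empty tuple is handled separately, which is also why chaining through slot 0 is always safe.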

All in all, it's a very clever implementation!
