Why does assigning past the end of a list via a slice not raise an IndexError?


Question

I'm working on a sparse list implementation and recently implemented assignment via a slice. This led me to discover some behaviour in Python's built-in list implementation that I find surprising.

Given an empty list and an assignment via a slice:

>>> l = []
>>> l[100:] = ['foo']

I would have expected an IndexError from list here, because the way this is implemented means that an item can't be retrieved from the specified index:

>>> l[100]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

'foo' cannot even be retrieved from the specified slice:

>>> l = []
>>> l[100:] = ['foo']
>>> l[100:]
[]

l[100:] = ['foo'] appends to the list (that is, l == ['foo'] after this assignment) and appears to have behaved this way since the BDFL's initial version. I can't find this functionality documented anywhere (*), but both CPython and PyPy behave this way.

Assigning by index raises an error:

>>> l[100] = 'bar'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list assignment index out of range

So why does assigning past the end of a list via a slice not raise an IndexError (or some other error, I guess)?

To clarify following the first two comments, this question is specifically about assignment, not retrieval (cf. "Why substring slicing index out of range works in Python?").

Giving in to the temptation to guess, and assigning 'foo' to l at index 0 when I had explicitly specified index 100, doesn't follow the usual Zen of Python.

Consider the case where the assignment happens far away from the initialisation and the index is a variable. The caller can no longer retrieve their data from the specified location.

Assigning to a slice that starts before the end of a list behaves somewhat differently from the example above:

>>> l = [None, None, None, None]
>>> l[3:] = ['bar']
>>> l[3:]
['bar']


(*) This behaviour is defined in Note 4 of 5.6. Sequence Types in the official documentation (thanks elethan), but it's not explained why it would be considered desirable on assignment.

Note: I understand how retrieval works and can see how it may be desirable for assignment to be consistent with it, but I am looking for a cited reason as to why assigning to a slice would behave in this way. l[100:] returning [] immediately after l[100:] = ['foo'], but l[3:] returning ['bar'] after l[3:] = ['bar'], is astonishing if you have no knowledge of len(l), particularly if you're following Python's EAFP idiom.
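For illustration only, the behaviour the question expected could be sketched as a small wrapper. The name strict_slice_assign is hypothetical (it is not part of Python or of the original post), and the sketch assumes a non-negative integer start:

```python
def strict_slice_assign(seq, start, values):
    """Assign values to seq[start:], refusing a start beyond the end.

    Hypothetical helper; assumes start is a non-negative integer.
    """
    # start == len(seq) is a plain append, so only start > len(seq) is rejected.
    if start > len(seq):
        raise IndexError("slice assignment start out of range")
    seq[start:] = values


l = [None, None, None, None]
strict_slice_assign(l, 3, ['bar'])  # in range: same as l[3:] = ['bar']

try:
    strict_slice_assign([], 100, ['foo'])
except IndexError as exc:
    print("rejected:", exc)
```

With such a check, an EAFP-style caller gets an exception instead of data silently landing at a different index.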

Recommended answer

Let's see what is actually happening:

>>> l = []
>>> l[100:] = ['foo']
>>> l[100:]
[]
>>> l
['foo']

So the assignment was actually successful, and the item was placed into the list as its first item.

This happens because 100: in indexing position is converted to a slice object, slice(100, None, None):

>>> class Foo:
...     def __getitem__(self, i):
...         return i
... 
>>> Foo()[100:]
slice(100, None, None)

Now, the slice class has a method, indices (I am not able to find its Python documentation online, though), that, when given the length of a sequence, returns a (start, stop, stride) tuple adjusted for the length of that sequence.

>>> slice(100, None, None).indices(0)
(0, 0, 1)

Thus, when this slice is applied to a sequence of length 0, it behaves exactly like the slice slice(0, 0, 1) for slice retrievals. For example, instead of foo[100:] throwing an error when foo is an empty sequence, it behaves as if foo[0:0:1] had been requested, which results in an empty slice on retrieval.
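The clamping is easy to check for a few lengths. The helper name clamped below is only an illustrative assumption; slice.indices itself is the built-in method discussed above:

```python
def clamped(start, length):
    """Return the (start, stop, step) that slice(start, None) resolves to
    for a sequence of the given length."""
    return slice(start, None).indices(length)


# Below length 100 the start is clamped down to the length itself;
# above it, the slice covers indices 100 up to the end.
for length in (0, 3, 100, 150):
    print(length, clamped(100, length))
```

For any length of 100 or less, the start and stop come out equal, which is a zero-width slice at the end of the sequence.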

Now, the setter code should work correctly when l[100:] is used and l is a sequence that has more than 100 elements. To make it work there, the easiest approach is not to reinvent the wheel and simply to use the indices mechanism above. The downside is that it looks a bit peculiar in edge cases: assignments to slices that are "out of bounds" are placed at the end of the current sequence instead. (However, it turns out that there is little code reuse in the CPython code; list_ass_slice essentially duplicates all this index handling, even though it would also be available via the slice object C-API.)

Thus: if the start index of a slice is greater than or equal to the length of a sequence, the resulting slice behaves as if it were a zero-width slice starting at the end of the sequence. That is, if a >= len(l), l[a:] behaves like l[len(l):len(l)] on built-in types. This is true for each of assignment, retrieval and deletion.
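A short check of that equivalence for all three operations on a built-in list follows. The function name compare_ops is a hypothetical name introduced only for this sketch:

```python
def compare_ops(a, items):
    """Apply retrieval, assignment and deletion with l[a:] and with
    l[len(l):len(l)] to copies of items, returning both sets of results."""
    n = len(items)

    # Retrieval
    got_a, got_n = items[a:], items[n:n]

    # Assignment (on copies, so the input is left untouched)
    set_a, set_n = list(items), list(items)
    set_a[a:] = ['foo']
    set_n[n:n] = ['foo']

    # Deletion
    del_a, del_n = list(items), list(items)
    del del_a[a:]
    del del_n[n:n]

    return (got_a, set_a, del_a), (got_n, set_n, del_n)


via_a, via_len = compare_ops(100, [1, 2, 3])
print(via_a == via_len)
```

Retrieval yields an empty list, assignment appends, and deletion removes nothing, in both forms.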

The desirability of this is that it doesn't need any exceptions. The slice.indices method doesn't need to handle any exceptions: for a sequence l, slice.indices(len(l)) always results in a (start, end, stride) triple of indices that can be used for any of assignment, retrieval and deletion, and, for a positive stride, both start and end are guaranteed to satisfy 0 <= v <= len(l).
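As a rough sketch of how a custom sequence (such as the sparse list mentioned in the question) might lean on this, the class below routes slice access through slice.indices. ClampedList is a hypothetical name, not the asker's implementation and not CPython's, and the sketch only handles slice assignment with a step of 1:

```python
class ClampedList:
    """Hypothetical list wrapper that resolves slices via slice.indices."""

    def __init__(self, items=()):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        if isinstance(key, slice):
            start, stop, step = key.indices(len(self))
            return [self._items[i] for i in range(start, stop, step)]
        return self._items[key]  # integer index: may raise IndexError

    def __setitem__(self, key, value):
        if isinstance(key, slice):
            start, stop, step = key.indices(len(self))
            if step != 1:
                raise NotImplementedError("sketch handles step == 1 only")
            # A stop before start denotes an empty slice located at start.
            stop = max(start, stop)
            self._items = self._items[:start] + list(value) + self._items[stop:]
        else:
            self._items[key] = value  # integer index: may raise IndexError


c = ClampedList()
c[100:] = ['foo']
print(c[0], c[100:])
```

No bounds checks or exception handling are needed in the slice branches, because the indices returned for a step of 1 are already within the sequence; the cost is the append-like behaviour the question finds surprising.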
