Does argument unpacking use iteration or item-getting?


Question

I'm using Python 2.7.3.

Consider a dummy class with custom (albeit bad) iteration and item-getting behavior:

class FooList(list):
    def __iter__(self):
        return iter(self)
    def next(self):
        return 3
    def __getitem__(self, idx):
        return 3

Make an example and see the weird behavior:

>>> zz = FooList([1,2,3])

>>> [x for x in zz]
# Hangs because of the self-reference in `__iter__`.

>>> zz[0]
3

>>> zz[1]
3

But now, let's make a function and then do argument unpacking on zz:

def add3(a, b, c):
    return a + b + c

>>> add3(*zz)
6
# I expected either 9 or for the interpreter to hang like the comprehension!

So argument unpacking is somehow getting the item data from zz, but neither by iterating over the object with its implemented iterator nor by doing a poor man's iteration and calling __getitem__ once for each item the object has.

So the question is: how does the syntax add3(*zz) acquire the data members of zz if not by these methods? Am I just missing one other common pattern for getting data members from a type like this?

My goal is to see whether I could write a class that implements iteration or item-getting in such a way that it changes what the argument unpacking syntax means for that class. After trying the two examples above, I'm now wondering how argument unpacking gets at the underlying data and whether the programmer can influence that behavior. Googling this only gave back a sea of results explaining the basic usage of the *args syntax.

I don't have a use case for needing to do this and I am not claiming it is a good idea. I just want to see how to do it for the sake of curiosity.

Added

Since the built-in types are treated specially, here's an example with object where I just maintain a list object and implement my own get and set behavior to emulate list.

class FooList(object):
    def __init__(self, lst):
        self.lst = lst
    def __iter__(self): raise ValueError
    def next(self): return 3
    def __getitem__(self, idx): return self.lst.__getitem__(idx)
    def __setitem__(self, idx, itm): self.lst.__setitem__(idx, itm)

In this case,

In [234]: zz = FooList([1,2,3])

In [235]: [x for x in zz]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-235-ad3bb7659c84> in <module>()
----> 1 [x for x in zz]

<ipython-input-233-dc9284300db1> in __iter__(self)
      2     def __init__(self, lst):
      3         self.lst = lst
----> 4     def __iter__(self): raise ValueError
      5     def next(self): return 3
      6     def __getitem__(self, idx): return self.lst.__getitem__(idx)

ValueError:

In [236]: add_3(*zz)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-236-f9bbfdc2de5c> in <module>()
----> 1 add_3(*zz)

<ipython-input-233-dc9284300db1> in __iter__(self)
      2     def __init__(self, lst):
      3         self.lst = lst
----> 4     def __iter__(self): raise ValueError
      5     def next(self): return 3
      6     def __getitem__(self, idx): return self.lst.__getitem__(idx)

ValueError:

But if I instead make sure that iteration terminates and always yields 3, I get the behavior I was trying to play around with in the first case:

class FooList(object):
    def __init__(self, lst):
        self.lst = lst
        self.iter_loc = -1
    def __iter__(self): return self
    def next(self): 
        if self.iter_loc < len(self.lst)-1:
            self.iter_loc += 1
            return 3
        else:
            self.iter_loc = -1
            raise StopIteration
    def __getitem__(self, idx): return self.lst.__getitem__(idx)
    def __setitem__(self, idx, itm): self.lst.__setitem__(idx, itm)

Then I see this, which is what I originally expected:

In [247]: zz = FooList([1,2,3])

In [248]: ix = iter(zz)

In [249]: ix.next()
Out[249]: 3

In [250]: ix.next()
Out[250]: 3

In [251]: ix.next()
Out[251]: 3

In [252]: ix.next()
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-252-29d4ae900c28> in <module>()
----> 1 ix.next()

<ipython-input-246-5479fdc9217b> in next(self)
     10         else:
     11             self.iter_loc = -1
---> 12             raise StopIteration
     13     def __getitem__(self, idx): return self.lst.__getitem__(idx)
     14     def __setitem__(self, idx, itm): self.lst.__setitem__(idx, itm)

StopIteration:

In [253]: ix = iter(zz)

In [254]: ix.next()
Out[254]: 3

In [255]: ix.next()
Out[255]: 3

In [256]: ix.next()
Out[256]: 3

In [257]: ix.next()
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-257-29d4ae900c28> in <module>()
----> 1 ix.next()

<ipython-input-246-5479fdc9217b> in next(self)
     10         else:
     11             self.iter_loc = -1
---> 12             raise StopIteration
     13     def __getitem__(self, idx): return self.lst.__getitem__(idx)
     14     def __setitem__(self, idx, itm): self.lst.__setitem__(idx, itm)

StopIteration:

In [258]: add_3(*zz)
Out[258]: 9

In [259]: zz[0]
Out[259]: 1

In [260]: zz[1]
Out[260]: 2

In [261]: zz[2]
Out[261]: 3

In [262]: [x for x in zz]
Out[262]: [3, 3, 3]
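
For readers on Python 3, where the iterator method is spelled `__next__` rather than `next`, the class above can be sketched roughly as follows. The `next = __next__` alias is an assumption added here so the same class also runs on Python 2, and `add_3` is defined inline because the session above does not show its definition.

```python
class FooList(object):
    """Wraps a list; iteration always yields 3, indexing returns the real items."""

    def __init__(self, lst):
        self.lst = lst
        self.iter_loc = -1

    def __iter__(self):
        return self  # the object acts as its own iterator

    def __next__(self):
        if self.iter_loc < len(self.lst) - 1:
            self.iter_loc += 1
            return 3
        self.iter_loc = -1  # reset so the object can be iterated again
        raise StopIteration

    next = __next__  # Python 2 spelling (assumption: kept for portability)

    def __getitem__(self, idx):
        return self.lst[idx]

    def __setitem__(self, idx, itm):
        self.lst[idx] = itm


def add_3(a, b, c):
    return a + b + c


zz = FooList([1, 2, 3])
unpacked_sum = add_3(*zz)         # unpacking goes through __iter__ / __next__
indexed = [zz[0], zz[1], zz[2]]   # indexing goes through __getitem__
iterated = [x for x in zz]        # iteration again yields 3 per stored item
```

Here `unpacked_sum` is 9 and `iterated` is `[3, 3, 3]`, while `indexed` still holds the stored values `[1, 2, 3]`.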

Summary

  1. The syntax *args relies on iteration only. For built-in types this happens in a way that is not directly overridable in classes that inherit from the built-in type.

  2. These two are functionally equivalent:

    foo(*[x for x in args])

    foo(*args)

  3. These two are not equivalent, even for finite data structures:

    foo(*args)

    foo(*[args[i] for i in range(len(args))])
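
The summary points can be checked with a small sketch. `Mismatch` and `foo` are hypothetical names introduced only for illustration; the class deliberately does not inherit from a built-in container, so only the ordinary iteration protocol is involved.

```python
class Mismatch(object):
    """Hypothetical container whose iteration and indexing disagree on purpose."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        # Iteration ignores the stored values and yields 3 once per item.
        for _ in self._items:
            yield 3

    def __getitem__(self, idx):
        return self._items[idx]


def foo(*values):
    # Return the positional arguments so the three call styles can be compared.
    return values


args = Mismatch([1, 2, 3])
via_star = foo(*args)                                   # uses __iter__
via_comprehension = foo(*[x for x in args])             # also uses __iter__
via_index = foo(*[args[i] for i in range(len(args))])   # uses __getitem__
```

The first two calls both receive `(3, 3, 3)`, whereas the index-based call receives `(1, 2, 3)`.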

Solution

You have been bitten by one of Python's most irritating warts: builtin types and subclasses of them are treated magically in some places.

Since your type subclasses from list, Python magically reaches into its internals to unpack it. It doesn't use the real iterator API at all. If you insert print statements inside your next and __getitem__, you'll see that neither one is being called. This behavior cannot be overridden; instead, you would have to write your own class that reimplements the builtin types. You could try using UserList; I haven't checked whether that would work.
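
The UserList suggestion can be tried out along these lines. This is a sketch for Python 3, where the class lives in `collections` (on Python 2 it is in the `UserList` module), so it does not confirm the behavior on 2.7.3. `ThreeList` is a hypothetical name. Since `UserList` wraps a list in its `data` attribute rather than inheriting from `list`, the overridden `__iter__` is what unpacking sees.

```python
from collections import UserList  # Python 2: from UserList import UserList


class ThreeList(UserList):
    """Hypothetical list-like class whose iteration always yields 3."""

    def __iter__(self):
        for _ in self.data:  # self.data is the wrapped real list
            yield 3


def add3(a, b, c):
    return a + b + c


zz = ThreeList([1, 2, 3])
total = add3(*zz)   # unpacking calls the overridden __iter__
first = zz[0]       # indexing still returns the stored value
```

With this class, `total` is 9 and `first` is 1.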

The answer to your question is that argument unpacking uses iteration. However, iteration itself can use __getitem__ if there is no explicit __iter__ defined. You can't make a class that defines argument-unpacking behavior that is different from the normal iteration behavior.
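
As a minimal illustration of the `__getitem__` fallback, the sketch below unpacks an object that defines no `__iter__` at all. `Squares` is a hypothetical class; iteration calls `__getitem__` with 0, 1, 2, ... until `IndexError` is raised.

```python
class Squares(object):
    """Hypothetical class with __getitem__ but no __iter__."""

    def __getitem__(self, idx):
        if idx >= 3:
            raise IndexError(idx)  # signals the end of the fallback iteration
        return idx * idx


def add3(a, b, c):
    return a + b + c


total = add3(*Squares())    # arguments come from __getitem__(0), (1), (2)
as_list = list(Squares())   # ordinary iteration takes the same path
```

Both paths see the same items, `0, 1, 4`, so `total` is 5.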

The iterator protocol (basically "how __iter__ works") shouldn't be assumed to apply to types that subclass builtin types like list. If you subclass a builtin, your subclass may magically behave like the underlying builtin in certain situations, without making use of your customized magic methods (like __iter__). If you want to customize behavior fully and reliably, you can't subclass from builtin types (except, of course, object).
