Can you patch *just* a nested function with closure, or must the whole outer function be repeated?

Problem description

A 3rd party library we use contains a rather long function that uses a nested function inside it. Our use of that library triggers a bug in that function, and we very much would like to solve that bug.

Unfortunately, the library maintainers are somewhat slow with fixes, but we don't want to have to fork the library. We also cannot hold our release until they have fixed the issue.

We would prefer to use monkey-patching to fix this issue here as that is easier to track than patching the source. However, to repeat a very large function where just replacing the inner function would be enough feels overkill, and makes it harder for others to see what exactly we changed. Are we stuck with a static patch to the library egg?

The inner function relies on closing over a variable; a contrived example would be:

def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)

where we would want to replace just the implementation of innerfunction(). The actual outer function is far, far longer. We'd reuse the closed-over variable and maintain the function signature, of course.

Recommended answer

Yes, you can replace an inner function, even if it is using a closure. You'll have to jump through a few hoops though. Please take into account:


  1. You need to create the replacement function as a nested function too, to ensure that Python creates the same closure. If the original function has a closure over the names foo and bar, you need to define your replacement as a nested function with the same names closed over. More importantly, you need to use those names in the same order; closures are referenced by index.

  2. Monkey patching is always fragile and can break with the implementation changing. This is no exception. Retest your monkey patch whenever you change versions of the patched library.

To understand how this will work, I'll first explain how Python handles nested functions. Python uses code objects to produce function objects as needed. Each code object has an associated constants list, and the code objects for nested functions are stored in that list:

>>> def outerfunction(*args):
...     def innerfunction(val):
...         return someformat.format(val)
...     someformat = 'Foo: {}'
...     for arg in args:
...         yield innerfunction(arg)
... 
>>> outerfunction.__code__
<code object outerfunction at 0x105b27ab0, file "<stdin>", line 1>
>>> outerfunction.__code__.co_consts
(None, <code object innerfunction at 0x100769db0, file "<stdin>", line 2>, 'Foo: {}')

The co_consts sequence is a tuple, so we cannot just swap out the inner code object. I'll show later on how we'll produce a new function object with just that code object replaced.
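
To locate the nested code object programmatically rather than by eye, you can filter co_consts by type and name. This is a minimal sketch written for Python 3 (the transcripts in this answer come from Python 2); find_nested_code is a hypothetical helper name used only for illustration, not part of any library:

```python
import types


def find_nested_code(outer, name):
    """Return the code object of the nested function called `name`.

    Hypothetical helper: it only scans the direct constants of `outer`.
    """
    for const in outer.__code__.co_consts:
        if isinstance(const, types.CodeType) and const.co_name == name:
            return const
    raise LookupError('no nested function named %r' % name)


def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)


inner_code = find_nested_code(outerfunction, 'innerfunction')
print(inner_code.co_name)  # innerfunction
```

The position of the code object inside co_consts can differ between Python versions, which is why the sketch searches by name instead of using a fixed index.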

Next, we need to cover closures. At compile time, Python determines that a) someformat is not a local name in innerfunction and that b) it is closing over the same name in outerfunction. Python not only then generates the bytecode to produce the correct name lookups, the code objects for both the nested and the outer functions are annotated to record that someformat is to be closed over:

>>> outerfunction.__code__.co_cellvars
('someformat',)
>>> outerfunction.__code__.co_consts[1].co_freevars
('someformat',)

You want to make sure that the replacement inner code object only ever lists those same names as free variables, and does so in the same order.
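
A quick way to check this is to compare the co_freevars tuples before patching. The sketch below assumes Python 3 and reuses the contrived example from the question; flat_inner is a hypothetical counter-example showing why the replacement has to be defined as a nested function:

```python
import types


def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)


def create_inner():
    someformat = None  # placeholder; only the name matters here

    def innerfunction(val):
        return someformat.format(val + 2)

    return innerfunction


def flat_inner(val):
    # Not nested, so the compiler treats someformat as a global name.
    return someformat.format(val + 2)


original_code = next(
    const for const in outerfunction.__code__.co_consts
    if isinstance(const, types.CodeType) and const.co_name == 'innerfunction')
replacement_code = create_inner().__code__

print(original_code.co_freevars)        # ('someformat',)
print(replacement_code.co_freevars)     # ('someformat',)
print(flat_inner.__code__.co_freevars)  # ()
```

Only the nested replacement records someformat as a free variable; the flat version would look the name up in the module globals instead.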

Closures are created at run-time; the byte-code to produce them is part of the outer function:

>>> import dis
>>> dis.dis(outerfunction)
2           0 LOAD_CLOSURE             0 (someformat)
            3 BUILD_TUPLE              1
            6 LOAD_CONST               1 (<code object innerfunction at 0x1047b2a30, file "<stdin>", line 2>)
            9 MAKE_CLOSURE             0
           12 STORE_FAST               1 (innerfunction)

# ... rest of disassembly omitted ...

The LOAD_CLOSURE bytecode there creates a closure for the someformat variable; Python creates as many closures as used by the function in the order they are first used in the inner function. This is an important fact to remember for later. The function itself looks up these closures by position:

>>> dis.dis(outerfunction.__code__.co_consts[1])
  3           0 LOAD_DEREF               0 (someformat)
              3 LOAD_ATTR                0 (format)
              6 LOAD_FAST                0 (val)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE        

The LOAD_DEREF opcode picked the closure at position 0 here to gain access to the someformat closure.
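
If you want to inspect these lookups from code instead of reading a disassembly listing, dis.get_instructions() yields one record per instruction. This is a sketch for Python 3; opcode names and their numeric arguments vary between interpreter versions, so it matches on the substring 'DEREF' rather than assuming an exact opcode or index:

```python
import dis
import types


def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)


inner_code = next(
    const for const in outerfunction.__code__.co_consts
    if isinstance(const, types.CodeType) and const.co_name == 'innerfunction')

# Collect the closure lookups performed by the inner function.
deref_lookups = [
    (ins.opname, ins.arg, ins.argval)
    for ins in dis.get_instructions(inner_code)
    if 'DEREF' in ins.opname
]
for opname, arg, argval in deref_lookups:
    print(opname, arg, argval)
```

The numeric argument printed here is the positional slot the interpreter uses; the name is only attached for display.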

In theory this also means you can use entirely different names for the closures in your inner function, but for debugging purposes it makes much more sense to stick to the same names. It also makes verifying that the replacement function will slot in properly easier, as you can just compare the co_freevars tuples if you use the same names.

Now for the swapping trick. Functions are objects like any other in Python, instances of a specific type. The type isn't exposed normally, but the type() call still returns it. The same applies to code objects, and both types even have documentation:

>>> type(outerfunction)
<type 'function'>
>>> print type(outerfunction).__doc__
function(code, globals[, name[, argdefs[, closure]]])

Create a function object from a code object and a dictionary.
The optional name string overrides the name from the code object.
The optional argdefs tuple specifies the default argument values.
The optional closure tuple supplies the bindings for free variables.
>>> type(outerfunction.__code__)
<type 'code'>
>>> print type(outerfunction.__code__).__doc__
code(argcount, nlocals, stacksize, flags, codestring, constants, names,
      varnames, filename, name, firstlineno, lnotab[, freevars[, cellvars]])

Create a code object.  Not for the faint of heart.
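
As a side note, the standard library's types module exposes the same two type objects under the names FunctionType and CodeType, so you do not have to go through type(). A small Python 3 check, using a throwaway function named sample:

```python
import types


def sample():
    pass


# Both names refer to the very objects that type() returns.
print(types.FunctionType is type(sample))       # True
print(types.CodeType is type(sample.__code__))  # True
```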

We'll use these type objects to produce a new code object with updated constants, and then a new function object with updated code object:

def replace_inner_function(outer, new_inner):
    """Replace a nested function code object used by outer with new_inner

    The replacement new_inner must use the same name and must at most use the
    same closures as the original.

    """
    if hasattr(new_inner, '__code__'):
        # support both functions and code objects
        new_inner = new_inner.__code__

    # find original code object so we can validate the closures match
    ocode = outer.__code__
    function, code = type(outer), type(ocode)
    iname = new_inner.co_name
    orig_inner = next(
        const for const in ocode.co_consts
        if isinstance(const, code) and const.co_name == iname)
    # you can ignore later closures, but since they are matched by position
    # the new sequence must match the start of the old.
    assert (orig_inner.co_freevars[:len(new_inner.co_freevars)] ==
            new_inner.co_freevars), 'New closures must match originals'
    # replace the code object for the inner function
    new_consts = tuple(
        new_inner if const is orig_inner else const
        for const in outer.__code__.co_consts)

    # create a new function object with the new constants
    return function(
        code(ocode.co_argcount, ocode.co_nlocals, ocode.co_stacksize,
             ocode.co_flags, ocode.co_code, new_consts, ocode.co_names,
             ocode.co_varnames, ocode.co_filename, ocode.co_name,
             ocode.co_firstlineno, ocode.co_lnotab, ocode.co_freevars,
             ocode.co_cellvars),
        outer.__globals__, outer.__name__, outer.__defaults__,
        outer.__closure__)
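
The code() constructor call above uses the Python 2 signature, which no longer matches on Python 3. What follows is a sketch of the same idea, assuming Python 3.8 or newer, where code objects have a replace() method that copies every field except the ones you name. It is an adaptation for illustration, not a tested drop-in for every interpreter version:

```python
import types


def replace_inner_function(outer, new_inner):
    """Return a copy of outer whose nested function is swapped for new_inner."""
    if hasattr(new_inner, '__code__'):
        # accept either a function or a bare code object
        new_inner = new_inner.__code__

    ocode = outer.__code__
    orig_inner = next(
        (const for const in ocode.co_consts
         if isinstance(const, types.CodeType)
         and const.co_name == new_inner.co_name),
        None)
    if orig_inner is None:
        raise ValueError('no nested function named %r' % new_inner.co_name)

    # closures are matched by position, so the new names must match the
    # start of the original sequence
    wanted = new_inner.co_freevars
    if orig_inner.co_freevars[:len(wanted)] != wanted:
        raise ValueError('New closures must match originals')

    new_consts = tuple(
        new_inner if const is orig_inner else const
        for const in ocode.co_consts)
    # code.replace() (Python 3.8+) leaves all other fields unchanged
    new_code = ocode.replace(co_consts=new_consts)
    # note: keyword-only defaults and function attributes are not copied
    return types.FunctionType(
        new_code, outer.__globals__, outer.__name__,
        outer.__defaults__, outer.__closure__)


def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)


def create_inner():
    someformat = None  # the actual value doesn't matter

    def innerfunction(val):
        return someformat.format(val + 2)

    return innerfunction


new_outer = replace_inner_function(outerfunction, create_inner())
print(list(new_outer(6, 7, 8)))  # ['Foo: 8', 'Foo: 9', 'Foo: 10']
```

This sketch only exercises a replacement that closes over the same single name as the original; retest any other combination on the interpreter version you actually deploy.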

The above function validates that the new inner function (which can be passed in as either a code object or as a function) will indeed use the same closures as the original. It then creates new code and function objects to match the old outer function object, but with the nested function (located by name) replaced with your monkey patch.

To demonstrate that the above all works, let's replace innerfunction with one that increments each formatted value by 2:

>>> def create_inner():
...     someformat = None  # the actual value doesn't matter
...     def innerfunction(val):
...         return someformat.format(val + 2)
...     return innerfunction
... 
>>> new_inner = create_inner()

The new inner function is created as a nested function too; this is important as it ensures that Python will use the correct bytecode to look up the someformat closure. I used a return statement to extract the function object, but you could also look at create_inner.__code__.co_consts to grab the code object.

Now we can patch the original outer function, swapping out just the inner function:

>>> new_outer = replace_inner_function(outerfunction, new_inner)
>>> list(outerfunction(6, 7, 8))
['Foo: 6', 'Foo: 7', 'Foo: 8']
>>> list(new_outer(6, 7, 8))
['Foo: 8', 'Foo: 9', 'Foo: 10']

The original function echoed out the original values, but the new outer function returned values incremented by 2.

You can even create new replacement inner functions that use fewer closures:

>>> def demo_outer():
...     closure1 = 'foo'
...     closure2 = 'bar'
...     def demo_inner():
...         print closure1, closure2
...     demo_inner()
... 
>>> def create_demo_inner():
...     closure1 = None
...     def demo_inner():
...         print closure1
... 
>>> replace_inner_function(demo_outer, create_demo_inner.__code__.co_consts[1])()
foo

So, to complete the picture:


  1. Create your monkey-patch inner function as a nested function with the same closures
  2. Use replace_inner_function() to produce a new outer function
  3. Monkey patch the original outer function to use the new outer function produced in step 2.
