Prevent TextIOWrapper from closing on GC in a Py2/Py3 compatible way


Problem description

What I need to accomplish:

Given a binary file, decode it in a couple of different ways, each providing a TextIOBase API. Ideally these subsequent files can get passed on without my needing to keep track of their lifespan explicitly.

Unfortunately, wrapping a BufferedReader will result in that reader being closed when the TextIOWrapper goes out of scope.

Here is a simple demo of this:

In [1]: import io

In [2]: def mangle(x):
   ...:     io.TextIOWrapper(x) # Will get GCed causing __del__ to call close
   ...:     

In [3]: f = io.open('example', mode='rb')

In [4]: f.closed
Out[4]: False

In [5]: mangle(f)

In [6]: f.closed
Out[6]: True

I can fix this in Python 3 by overriding __del__ (this is a reasonable solution for my use case, as I have complete control over the decoding process; I just need to expose a very uniform API at the end):

In [1]: import io

In [2]: class MyTextIOWrapper(io.TextIOWrapper):
   ...:     def __del__(self):
   ...:         print("I've been GC'ed")
   ...:         

In [3]: def mangle2(x):
   ...:     MyTextIOWrapper(x)
   ...:     

In [4]: f2 = io.open('example', mode='rb')

In [5]: f2.closed
Out[5]: False

In [6]: mangle2(f2)
I've been GC'ed

In [7]: f2.closed
Out[7]: False

However, this does not work in Python 2:

In [7]: class MyTextIOWrapper(io.TextIOWrapper):
   ...:     def __del__(self):
   ...:         print("I've been GC'ed")
   ...:         

In [8]: def mangle2(x):
   ...:     MyTextIOWrapper(x)
   ...:     

In [9]: f2 = io.open('example', mode='rb')

In [10]: f2.closed
Out[10]: False

In [11]: mangle2(f2)
I've been GC'ed

In [12]: f2.closed
Out[12]: True

I've spent a bit of time staring at the Python source code, and it looks remarkably similar between 2.7 and 3.4, so I don't understand why the __del__ inherited from IOBase is not overridable in Python 2 (or even visible in dir), yet still seems to get executed. Python 3 works exactly as expected.

Is there anything I can do?

Solution

It turns out there is basically nothing that can be done about the destructor calling close in Python 2.7; this is hardcoded into the C code. Instead, we can modify close so that it won't close the buffer while __del__ is happening (__del__ is executed before _PyIOBase_finalize in the C code, which gives us a chance to change the behaviour of close). This lets close work as expected without letting the GC close the buffer.

import io


class SaneTextIOWrapper(io.TextIOWrapper):
    def __init__(self, *args, **kwargs):
        self._should_close_buffer = True
        super(SaneTextIOWrapper, self).__init__(*args, **kwargs)

    def __del__(self):
        # Accept the inevitability of the buffer being closed by the destructor
        # because of this line in Python 2.7:
        # https://github.com/python/cpython/blob/2.7/Modules/_io/iobase.c#L221
        self._should_close_buffer = False
        self.close()  # Actually close for Python 3 because it is an override.
                      # We can't call super because Python 2 doesn't actually
                      # have a `__del__` method for IOBase (hence this
                      # workaround). Close is idempotent so it won't matter
                      # that Python 2 will end up calling this twice

    def close(self):
        # We can't stop Python 2.7 from calling close in the destructor,
        # so instead we use a flag to prevent the buffer from being closed.

        # Based on:
        # https://github.com/python/cpython/blob/2.7/Lib/_pyio.py#L1586
        # https://github.com/python/cpython/blob/3.4/Lib/_pyio.py#L1615
        if self.buffer is not None and not self.closed:
            try:
                self.flush()
            finally:
                if self._should_close_buffer:
                    self.buffer.close()
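
A quick way to check the behaviour described above is sketched below. It repeats a condensed copy of the class so the snippet runs on its own; the helper names (wrap_and_drop, closed_after_gc, closed_after_explicit_close) and the use of io.BytesIO in place of a real file are assumptions made for illustration only.

```python
import gc
import io


class SaneTextIOWrapper(io.TextIOWrapper):
    # Condensed copy of the class above so that this snippet runs on its own.
    def __init__(self, *args, **kwargs):
        self._should_close_buffer = True
        super(SaneTextIOWrapper, self).__init__(*args, **kwargs)

    def __del__(self):
        # Finalization path: flush, but leave the underlying buffer open.
        self._should_close_buffer = False
        self.close()

    def close(self):
        if self.buffer is not None and not self.closed:
            try:
                self.flush()
            finally:
                if self._should_close_buffer:
                    self.buffer.close()


def wrap_and_drop(binary_file):
    # Hypothetical helper: the wrapper goes out of scope when this returns.
    SaneTextIOWrapper(binary_file, encoding="utf-8")


def closed_after_gc(binary_file):
    # Report whether the buffer was closed by the wrapper's finalization.
    wrap_and_drop(binary_file)
    gc.collect()  # only matters on interpreters without reference counting
    return binary_file.closed


def closed_after_explicit_close(binary_file):
    # Report whether an explicit close() still closes the buffer.
    wrapper = SaneTextIOWrapper(binary_file, encoding="utf-8")
    wrapper.close()
    return binary_file.closed


if __name__ == "__main__":
    print(closed_after_gc(io.BytesIO(b"example\n")))
    print(closed_after_explicit_close(io.BytesIO(b"example\n")))
```

The first check should report that the buffer is still open after the wrapper has been collected, and the second that an explicit close() still closes it, which is the split the flag is meant to provide.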

My previous solution here used _pyio.TextIOWrapper, which is slower than the above because it is written in Python, not C.

It involved simply overriding __del__ with a no-op, which also works in Py2/3.
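
The earlier version is not shown, so the following is only a minimal sketch of what it might have looked like. The class name NoCloseOnDelTextIOWrapper and the helper read_text are hypothetical, and io.BytesIO stands in for a real binary file.

```python
import _pyio
import gc
import io


class NoCloseOnDelTextIOWrapper(_pyio.TextIOWrapper):
    """Hypothetical name: pure-Python wrapper that leaves its buffer open."""

    def __del__(self):
        # Deliberately a no-op. In _pyio the close-on-finalization logic
        # lives in a regular Python __del__, so overriding it is enough.
        pass


def read_text(binary_file, encoding="utf-8"):
    # The temporary wrapper is discarded as soon as read() returns.
    return NoCloseOnDelTextIOWrapper(binary_file, encoding=encoding).read()


if __name__ == "__main__":
    raw = io.BytesIO(b"example\n")
    print(read_text(raw))
    gc.collect()  # only matters on interpreters without reference counting
    print(raw.closed)
```

The trade-off is the one mentioned above: decoding goes through the pure-Python implementation, so this is simpler but slower than subclassing io.TextIOWrapper.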
