Wrap an open stream with io.TextIOWrapper

Question

How can I wrap an open binary stream (a Python 2 file, a Python 3 io.BufferedReader, or an io.BytesIO) in an io.TextIOWrapper?

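I'm trying to write code that will work unchanged:

  • Running on Python 2.
  • Running on Python 3.
  • With binary streams generated by the standard library (i.e. I can't control what type they are).
  • With binary streams made to be test doubles (i.e. no file handle, can't be re-opened).
  • Producing an io.TextIOWrapper that wraps the specified stream.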

The io.TextIOWrapper is needed because other parts of the standard library expect its API. Other file-like types exist, but they don't provide the right API.

For example, wrapping the binary stream presented as the subprocess.Popen.stdout attribute:

import subprocess
import io

gnupg_subprocess = subprocess.Popen(
        ["gpg", "--version"], stdout=subprocess.PIPE)
gnupg_stdout = io.TextIOWrapper(gnupg_subprocess.stdout, encoding="utf-8")

In unit tests, the stream is replaced with an io.BytesIO instance to control its content without touching any subprocess or the filesystem.

gnupg_subprocess.stdout = io.BytesIO("Lorem ipsum".encode("utf-8"))
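
As a minimal sketch of the io.* case, the wrapper can be exercised directly against such a test double. The helper name wrap_stdout is hypothetical and introduced here only for illustration; it is not part of the question's code.

```python
import io


def wrap_stdout(binary_stream, encoding="utf-8"):
    """Return a text wrapper around an already-open binary stream.

    Hypothetical helper: it only covers io.* streams, which is what
    Python 3's standard library and io.BytesIO doubles provide.
    """
    return io.TextIOWrapper(binary_stream, encoding=encoding)


# Test double standing in for gnupg_subprocess.stdout.
fake_stdout = io.BytesIO(u"Lorem ipsum\n".encode("utf-8"))
gnupg_stdout = wrap_stdout(fake_stdout)

first_line = gnupg_stdout.readline()
print(repr(first_line))
print(gnupg_stdout.encoding)
```

The wrapper decodes on read and exposes the encoding attribute, which is the part of the API the rest of the question depends on.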

That works fine on the streams created by Python 3's standard library. The same code, though, fails on the streams Python 2 generates:

[Python 2]
>>> type(gnupg_subprocess.stdout)
<type 'file'>
>>> gnupg_stdout = io.TextIOWrapper(gnupg_subprocess.stdout, encoding="utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'file' object has no attribute 'readable'
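
The failure can be reproduced without Python 2, assuming a stand-in object that, like the legacy file type, reads bytes but lacks the io.IOBase query methods. LegacyFileLike below is a hypothetical class written for this sketch, not a standard library type.

```python
import io


class LegacyFileLike(object):
    """Hypothetical stand-in for a Python 2 file: it can read bytes but
    has none of the readable()/writable()/seekable() methods."""

    def __init__(self, data):
        self._data = data

    def read(self, size=-1):
        if size is None or size < 0:
            size = len(self._data)
        chunk, self._data = self._data[:size], self._data[size:]
        return chunk


stream = LegacyFileLike(b"Lorem ipsum")

# Which io.IOBase query methods is the legacy-style object missing?
missing = [
    name for name in ("readable", "writable", "seekable")
    if not hasattr(stream, name)
]

# TextIOWrapper calls those methods on its buffer while initialising.
try:
    io.TextIOWrapper(stream, encoding="utf-8")
except AttributeError as exc:
    wrap_error = exc
else:
    wrap_error = None

print(missing)
print(wrap_error)
```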

Not a solution: Special treatment for file

An obvious response is to add a branch in the code that tests whether the stream is actually a Python 2 file object, and handles it differently from io.* objects.

That's not an option for well-tested code, because it creates a branch that the unit tests can't exercise: to run as fast as possible, they must not create any real filesystem objects.

The unit tests will be providing test doubles, not real file objects. So creating a branch that those test doubles never exercise defeats the purpose of the test suite.
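
To make the objection concrete, here is a rough sketch of such a branch, under the assumption that a Python 2 file is detected by not being an io.IOBase instance. The helper name wrap_text and the branch labels are hypothetical.

```python
import io


def wrap_text(stream, encoding="utf-8"):
    """Hypothetical helper containing the type-based branch."""
    if isinstance(stream, io.IOBase):
        # io.* streams, including io.BytesIO test doubles, land here.
        return "io-branch", io.TextIOWrapper(stream, encoding=encoding)
    # Only a real Python 2 file (which is not an io.IOBase) reaches this
    # line, so a test suite built on io.BytesIO never executes it.
    return "file-branch", io.open(stream.fileno(), mode="r", encoding=encoding)


branch, text_stream = wrap_text(io.BytesIO(b"Lorem ipsum"))
content = text_stream.read()
print(branch)
print(content)
```

With an io.BytesIO double the second return statement is dead code as far as the tests are concerned, which is exactly the coverage gap described here.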

Not a solution: io.open

Some respondents suggest re-opening the underlying file handle (e.g. with io.open):

gnupg_stdout = io.open(
        gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8")

That works on both Python 3 and Python 2:

[Python 3]
>>> type(gnupg_subprocess.stdout)
<class '_io.BufferedReader'>
>>> gnupg_stdout = io.open(gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8")
>>> type(gnupg_stdout)
<class '_io.TextIOWrapper'>

[Python 2]
>>> type(gnupg_subprocess.stdout)
<type 'file'>
>>> gnupg_stdout = io.open(gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8")
>>> type(gnupg_stdout)
<type '_io.TextIOWrapper'>
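
The same behaviour can be checked without gpg. The sketch below assumes an os.pipe() descriptor as a stand-in for the subprocess's stdout; it illustrates io.open accepting a file descriptor and is not the question's own code.

```python
import io
import os

# A real OS-level pipe stands in for the subprocess's stdout here.
read_fd, write_fd = os.pipe()
os.write(write_fd, u"Lorem ipsum\n".encode("utf-8"))
os.close(write_fd)  # Signal end-of-file to the reader.

# io.open accepts an integer file descriptor in place of a path.
text_stream = io.open(read_fd, mode="r", encoding="utf-8")
is_text_wrapper = isinstance(text_stream, io.TextIOWrapper)
content = text_stream.read()
text_stream.close()  # Also closes read_fd, since closefd defaults to True.

print(is_text_wrapper)
print(repr(content))
```

One thing to keep in mind with this approach: the original stream object and the new wrapper then both refer to the same descriptor, so it is worth deciding which of them is responsible for closing it.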

But of course it relies on re-opening a real file from its file handle. So it fails in unit tests, where the test double is an io.BytesIO instance:

>>> gnupg_subprocess.stdout = io.BytesIO("Lorem ipsum".encode("utf-8"))
>>> type(gnupg_subprocess.stdout)
<type '_io.BytesIO'>
>>> gnupg_stdout = io.open(gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
io.UnsupportedOperation: fileno
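
A small sketch of the same check in script form, catching the exception instead of letting the traceback propagate; the variable names are illustrative only.

```python
import io

fake_stdout = io.BytesIO(u"Lorem ipsum".encode("utf-8"))

# An in-memory stream has no OS-level descriptor to hand out.
try:
    fake_stdout.fileno()
except io.UnsupportedOperation as exc:
    fileno_error = exc
else:
    fileno_error = None

print(type(fileno_error).__name__)
```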

Not a solution: codecs.getreader

The standard library also has the codecs module, which provides wrapper features:

import codecs

gnupg_stdout = codecs.getreader("utf-8")(gnupg_subprocess.stdout)

That's good because it doesn't attempt to re-open the stream. But it fails to provide the io.TextIOWrapper API. Specifically, it doesn't inherit from io.IOBase and doesn't have the encoding attribute:

>>> type(gnupg_subprocess.stdout)
<type 'file'>
>>> gnupg_stdout = codecs.getreader("utf-8")(gnupg_subprocess.stdout)
>>> type(gnupg_stdout)
<type 'instance'>
>>> isinstance(gnupg_stdout, io.IOBase)
False
>>> gnupg_stdout.encoding
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/codecs.py", line 643, in __getattr__
    return getattr(self.stream, name)
AttributeError: '_io.BytesIO' object has no attribute 'encoding'
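
The same two gaps can be checked in script form against an io.BytesIO double. This is a minimal sketch; the variable names are illustrative.

```python
import codecs
import io

fake_stdout = io.BytesIO(u"Lorem ipsum".encode("utf-8"))
reader = codecs.getreader("utf-8")(fake_stdout)

# Decoding itself works fine.
decoded = reader.read()

# But the reader is not part of the io class hierarchy.
is_iobase = isinstance(reader, io.IOBase)

# Attributes that StreamReader does not define are looked up on the
# wrapped stream, and io.BytesIO has no 'encoding' attribute.
has_encoding = hasattr(reader, "encoding")

print(repr(decoded))
print(is_iobase)
print(has_encoding)
```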

So codecs doesn't provide objects that can substitute for io.TextIOWrapper.

So how can I write code that wraps an io.TextIOWrapper around an already-open byte stream and works on both Python 2 and Python 3, with both the test doubles and the real objects?

Recommended answer

Based on multiple suggestions in various forums, and on experimenting with the standard library to meet the criteria, my current conclusion is that this can't be done with the library and types as we currently have them.
