How to encode a text stream into a byte stream in Python 3?

Views: 274 · Posted: 2020/10/1 0:31:09 · Tags: python python-3.x io character-encoding stream

Question

Decoding a byte stream into a text stream is easy:

import io
f = io.TextIOWrapper(io.BytesIO(b'Test\nTest\n'), 'utf-8')
f.readline()

In this example, io.BytesIO(b'Test\nTest\n') is a byte stream and f is a text stream.

I want to do exactly the opposite. Given a text stream or file-like object, I would like to encode it into a byte stream or file-like object without processing the entire stream.

This is what I've tried so far:

import io, codecs

f = codecs.getreader('utf-8')(io.StringIO('Test\nTest\n'))
f.readline()
# TypeError: can't concat str to bytes

f = codecs.EncodedFile(io.StringIO('Test\nTest\n'), 'utf-8')
f.readline()
# TypeError: can't concat str to bytes

f = codecs.StreamRecoder(io.StringIO('Test\nTest\n'), None, None,
                         codecs.getreader('utf-8'), codecs.getwriter('utf-8'))
f.readline()
# TypeError: can't concat str to bytes

f = codecs.encode(io.StringIO('Test\nTest\n'), 'utf-8')
# TypeError: utf_8_encode() argument 1 must be str, not _io.StringIO

f = io.TextIOWrapper(io.StringIO('Test\nTest\n'), 'utf-8')
f.readline()
# TypeError: underlying read() should have returned a bytes-like object, not 'str'

f = codecs.iterencode(io.StringIO('Test\nTest\n'), 'utf-8')
next(f)
# This works, but it's an iterator instead of a file-like object or stream.

f = io.BytesIO(io.StringIO('Test\nTest\n').getvalue().encode('utf-8'))
f.readline()
# This works, but I'm reading the whole stream before converting it.

I'm using Python 3.7.

Recommended answer

You can write this yourself pretty easily; you just need to decide how you want to do the buffering.

For example:

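import io

class BytesIOWrapper(io.RawIOBase):
    def __init__(self, file, encoding='utf-8', errors='strict'):
        self.file, self.encoding, self.errors = file, encoding, errors
        self.buf = b''  # encoded bytes not yet handed to the caller
    def readinto(self, buf):
        if not self.buf:
            # Pull up to 4096 characters from the text stream and encode them.
            self.buf = self.file.read(4096).encode(self.encoding, self.errors)
            if not self.buf:
                return 0  # underlying text stream is exhausted
        # Copy as much as fits into the caller's buffer; keep the rest.
        length = min(len(buf), len(self.buf))
        buf[:length] = self.buf[:length]
        self.buf = self.buf[length:]
        return length
    def readable(self):
        return True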

I think this is exactly what you asked for:

>>> f = BytesIOWrapper(io.StringIO("Test\nTest\n"))
>>> f.readline()
b'Test\n'
>>> f.readline()
b'Test\n'
>>> f.readline()
b''
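As a rough check of the "without processing the entire stream" requirement, the sketch below restates the class (with readable taking self) so that it runs on its own, then reads one line from a 50,000-character StringIO and looks at how far the text stream has advanced. The variable names are illustrative only, and wrapping the raw object in io.BufferedReader is an optional extra, not something the answer requires.

```python
import io

class BytesIOWrapper(io.RawIOBase):
    # Restated from the answer (readable takes self) so this snippet runs alone.
    def __init__(self, file, encoding='utf-8', errors='strict'):
        self.file, self.encoding, self.errors = file, encoding, errors
        self.buf = b''
    def readinto(self, buf):
        if not self.buf:
            self.buf = self.file.read(4096).encode(self.encoding, self.errors)
            if not self.buf:
                return 0
        length = min(len(buf), len(self.buf))
        buf[:length] = self.buf[:length]
        self.buf = self.buf[length:]
        return length
    def readable(self):
        return True

# Reading one line pulls only one 4096-character chunk from the text stream.
text = io.StringIO('Test\n' * 10000)      # 50,000 characters in total
raw = BytesIOWrapper(text)
first_line = raw.readline()
chars_consumed = text.tell()              # StringIO.tell() counts characters

# Optional: io.BufferedReader adds peek(), read1() and faster readline().
buffered = io.BufferedReader(BytesIOWrapper(io.StringIO('héllo\nwörld\n')))
accented = buffered.readline()
print(first_line, chars_consumed, accented)
```

Only the first 4096 characters are consumed for the first line; the rest of the text stream stays unread until more bytes are requested.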

If you want to get cleverer, you could wrap codecs.iterencode rather than buffering 4K at a time. Or, since we're using a buffer anyway, you might create a BufferedIOBase instead of a RawIOBase. Also, a class named BytesIOWrapper probably ought to handle write, but that's the easy part. The hard part would be implementing seek/tell, since you can't seek arbitrarily within a TextIOBase. Seeking to the start or end is pretty easy; seeking to known previous positions, on the other hand, is hard (unless you rely on TextIOBase.tell returning a byte position, which it's not guaranteed to do: TextIOWrapper does, but StringIO doesn't).
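A minimal sketch of the codecs.iterencode variant might look like the following. The class name IterEncodedRaw and its attribute names are hypothetical, chosen only for illustration; it is read-only and makes no attempt at write, seek or tell. It is wrapped in io.BufferedReader here as one way to get buffered behaviour without writing a BufferedIOBase by hand.

```python
import codecs
import io

class IterEncodedRaw(io.RawIOBase):
    """Hypothetical name: a read-only raw stream fed by codecs.iterencode."""
    def __init__(self, text_file, encoding='utf-8', errors='strict'):
        # Iterating a text stream yields lines; iterencode encodes them lazily.
        self._chunks = codecs.iterencode(text_file, encoding, errors)
        self._leftover = b''
    def readable(self):
        return True
    def readinto(self, buf):
        if not self._leftover:
            # iterencode never yields empty output, so b'' can only mean EOF.
            self._leftover = next(self._chunks, b'')
        n = min(len(buf), len(self._leftover))
        buf[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n  # 0 signals end of stream

stream = io.BufferedReader(IterEncodedRaw(io.StringIO('Test\nTëst\n')))
lines = [stream.readline(), stream.readline(), stream.readline()]
print(lines)
```

One trade-off to be aware of: the chunk size here is whatever one line of the text stream happens to be, so a very long line is encoded in a single piece rather than in fixed 4K blocks.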

Anyway, I think this is the simplest demonstration of how to write even the most complicated kind of io class.
