Difference between `open` and `io.BytesIO` in binary streams


Question

I'm learning about working with streams in Python, and I noticed that the io docs say the following:


The easiest way to create a binary stream is with open() with 'b' in the mode string:

f = open("myfile.jpg", "rb")

In-memory binary streams are also available as BytesIO objects:

f = io.BytesIO(b"some initial binary data: \x00\x01")

What is the difference between f as defined by open and f as defined by BytesIO? In other words, what makes an "in-memory binary stream", and how is that different from what open does?

Recommended answer

For simplicity's sake, let's consider writing instead of reading for now.

So when you use open(), say like this:

with open("test.dat", "wb") as f:
    f.write(b"Hello World")
    f.write(b"Hello World")
    f.write(b"Hello World")

After executing that, a file called test.dat will be created, containing Hello World three times in a row (33 bytes). The data won't be kept in memory after it's written to the file (unless it is still referenced by a name).
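To make the "data ends up on disk" point concrete, here is a minimal sketch that repeats the three writes and then reads the file back. The temporary directory is an assumption added only so the example does not leave files behind; it is not part of the original answer.

```python
import os
import tempfile

# Assumption: a throwaway directory, so the example cleans up after itself.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.dat")

    with open(path, "wb") as f:
        f.write(b"Hello World")
        f.write(b"Hello World")
        f.write(b"Hello World")

    # The bytes now live on disk; getting them back requires opening the file again.
    with open(path, "rb") as f:
        data = f.read()

    size_on_disk = os.path.getsize(path)

print(data)
print(size_on_disk)
```

Each write appends to the file, so the three calls produce one contiguous run of bytes rather than three separate records.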

Now consider io.BytesIO() instead:

import io

with io.BytesIO() as f:
    f.write(b"Hello World")
    f.write(b"Hello World")
    f.write(b"Hello World")

Here, instead of being written to a file, the contents are written to an in-memory buffer. In other words, a chunk of RAM. Essentially, writing the following would be equivalent:

buffer = b""
buffer += b"Hello World"
buffer += b"Hello World"
buffer += b"Hello World"

To match the example with the with statement, there would also be a del buffer at the end.
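The equivalence can be checked directly. The sketch below performs the same three writes both ways and compares the results; the variable names are illustrative. It also shows that once the with block exits, the stream is closed and its contents can no longer be retrieved, which is why getvalue() is called inside the block.

```python
import io

# Plain concatenation, as in the snippet above.
concatenated = b""
for _ in range(3):
    concatenated += b"Hello World"

# The same writes against an in-memory stream.
with io.BytesIO() as f:
    for _ in range(3):
        f.write(b"Hello World")
    # getvalue() has to be called before the stream is closed.
    in_memory = f.getvalue()

same_bytes = in_memory == concatenated

# Leaving the with block closed the stream, so its buffer is gone.
try:
    f.getvalue()
    closed_error = None
except ValueError as exc:
    closed_error = exc

print(same_bytes)
print(f.closed)
print(type(closed_error).__name__)
```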

The key difference here is optimization and performance. Because bytes objects are immutable, each += builds a new object and copies everything accumulated so far, whereas io.BytesIO appends to a buffer it manages internally. That makes it faster than concatenating all the b"Hello World" pieces one by one.

To illustrate, here's a small benchmark:


  • Concat: 1.3529 seconds
  • BytesIO: 0.0090 seconds

import io
import time

# Repeated concatenation: every += builds a new bytes object.
begin = time.perf_counter()
buffer = b""
for i in range(50000):
    buffer += b"Hello World"
end = time.perf_counter()
seconds = end - begin
print("Concat:", seconds)

# The same writes against an in-memory stream.
begin = time.perf_counter()
buffer = io.BytesIO()
for i in range(50000):
    buffer.write(b"Hello World")
end = time.perf_counter()
seconds = end - begin
print("BytesIO:", seconds)

Besides the performance gain over concatenation, BytesIO can be used in place of a file object. So if you have a function that expects a file object to write to, you can give it an in-memory buffer instead.
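As a rough sketch of that substitution: write_report below is a hypothetical function (not from any library) that only relies on its argument having a binary write() method, so it accepts a BytesIO and a real file alike. The temporary directory is likewise an assumption made to keep the example self-contained.

```python
import io
import os
import tempfile


def write_report(stream):
    """Hypothetical helper: writes to anything with a binary write() method."""
    stream.write(b"header\n")
    stream.write(b"body\n")


# Target 1: an in-memory buffer.
buffer = io.BytesIO()
write_report(buffer)
in_memory_result = buffer.getvalue()

# Target 2: a real file on disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "report.dat")
    with open(path, "wb") as f:
        write_report(f)
    with open(path, "rb") as f:
        on_disk_result = f.read()

print(in_memory_result == on_disk_result)
```

This is also a common reason to reach for BytesIO in tests: the function under test can be exercised without touching the filesystem.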

When it comes to open("myfile.jpg", "rb"), that opens myfile.jpg and returns a file object; the contents are read from disk when you call read() on it. BytesIO, again, is just a buffer in memory containing some data.
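On the reading side the two objects behave the same way once you have them; only the source of the bytes differs. The sketch below writes a small stand-in file (sample.bin is a made-up name used instead of myfile.jpg) and reads the same payload through both kinds of stream.

```python
import io
import os
import tempfile

payload = b"some initial binary data: \x00\x01"

with tempfile.TemporaryDirectory() as tmp_dir:
    # Assumption: a stand-in file created just for this example.
    path = os.path.join(tmp_dir, "sample.bin")
    with open(path, "wb") as f:
        f.write(payload)

    # open() returns a stream object; bytes come from disk as read() is called.
    with open(path, "rb") as disk_stream:
        disk_type = type(disk_stream).__name__
        first_chunk = disk_stream.read(4)
        rest = disk_stream.read()

# BytesIO serves the same interface from bytes that are already in memory.
memory_stream = io.BytesIO(payload)
mem_first = memory_stream.read(4)
memory_stream.seek(0)  # rewind, exactly as with a file
mem_all = memory_stream.read()

print(disk_type)
print(first_chunk, mem_first)
```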

Since BytesIO is just a buffer, if you want to write its contents to a file later, you have to do so explicitly:

buffer = io.BytesIO()
# ...
with open("test.dat", "wb") as f:
    f.write(buffer.getvalue())
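One possible variation, offered as a sketch rather than as part of the original answer: getvalue() returns a new bytes object holding a copy of the contents, while getbuffer() exposes the same contents as a memoryview without copying. For large buffers that may be worth considering. The path below is a placeholder inside a temporary directory.

```python
import io
import os
import tempfile

buffer = io.BytesIO()
buffer.write(b"Hello World" * 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.dat")  # placeholder path

    with open(path, "wb") as f:
        # The view must be released before the BytesIO is resized or closed,
        # which the with statement takes care of.
        with buffer.getbuffer() as view:
            f.write(view)

    with open(path, "rb") as f:
        written = f.read()

print(written == buffer.getvalue())
```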

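Also, since you didn't mention a version, I'm using Python 3. As for the examples, the only other difference is that I'm using the with statement instead of calling f.close().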
