Python的file.flush()到底在做什么? [英] What exactly is Python's file.flush() doing?

查看:279
本文介绍了Python的file.flush()到底在做什么?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在Python 文件对象的文档中找到了这个:

flush()不一定会将文件的数据写入磁盘.使用flush()和os.fsync()来确保此行为.

flush() does not necessarily write the file’s data to disk. Use flush() followed by os.fsync() to ensure this behavior.

所以我的问题是:Python的flush到底在做什么?我以为这会强制将数据写入磁盘,但现在我知道并没有.为什么?

So my question is: what exactly is Python's flush doing? I thought that it forces to write data to the disk, but now I see that it doesn't. Why?

推荐答案

通常涉及两个级别的缓冲:

There's typically two levels of buffering involved:

  1. 内部缓冲区
  2. 操作系统缓冲区

内部缓冲区是由您要针对其进行编程的运行时/库/语言创建的缓冲区,旨在通过避免每次写入都调用系统来加快处理速度.相反,当您写入文件对象时,您将写入其缓冲区,并且只要缓冲区填满,就会使用系统调用将数据写入实际文件.

The internal buffers are buffers created by the runtime/library/language that you're programming against and is meant to speed things up by avoiding system calls for every write. Instead, when you write to a file object, you write into its buffer, and whenever the buffer fills up, the data is written to the actual file using system calls.

但是,由于操作系统缓冲区的原因,这可能并不意味着将数据写入磁盘.这可能只是意味着将数据从运行时维护的缓冲区复制到操作系统维护的缓冲区.

However, due to the operating system buffers, this might not mean that the data is written to disk. It may just mean that the data is copied from the buffers maintained by your runtime into the buffers maintained by the operating system.

如果您写了一些东西,并且它最终出现在缓冲区中(仅),并且切断了计算机的电源,则当计算机关闭时,该数据将不在磁盘上.

If you write something, and it ends up in the buffer (only), and the power is cut to your machine, that data is not on disk when the machine turns off.

因此,为了帮助您在各自的对象上使用flushfsync方法.

So, in order to help with that you have the flush and fsync methods, on their respective objects.

第一个,flush,只会将程序缓冲区中残留的所有数据写到实际文件中.通常,这意味着数据将从程序缓冲区复制到操作系统缓冲区.

The first, flush, will simply write out any data that lingers in a program buffer to the actual file. Typically this means that the data will be copied from the program buffer to the operating system buffer.

具体来说,这意味着如果另一个进程打开了要读取的相同文件,它将能够访问刚刚刷新到该文件的数据.但是,这不一定意味着它已永久"存储在磁盘上.

Specifically what this means is that if another process has that same file open for reading, it will be able to access the data you just flushed to the file. However, it does not necessarily mean it has been "permanently" stored on disk.

为此,您需要调用os.fsync方法,该方法确保所有操作系统缓冲区都与它们所使用的存储设备同步,换句话说,该方法会将数据从操作系统缓冲区复制到磁盘上.

To do that, you need to call the os.fsync method which ensures all operating system buffers are synchronized with the storage devices they're for, in other words, that method will copy data from the operating system buffers to the disk.

通常不需要打扰任何一种方法,但是如果您处于对实际存储在磁盘上的偏执狂的场景中,则应该按照指示进行两次调用.

Typically you don't need to bother with either method, but if you're in a scenario where paranoia about what actually ends up on disk is a good thing, you should make both calls as instructed.

2018年附录

请注意,具有缓存机制的磁盘现在比2013年更加普遍,因此现在涉及的缓存和缓冲区级别更高.我假设这些缓冲区也将由sync/flush调用处理,但我真的不知道.

Note that disks with cache mechanisms is now much more common than back in 2013, so now there are even more levels of caching and buffers involved. I assume these buffers will be handled by the sync/flush calls as well, but I don't really know.

这篇关于Python的file.flush()到底在做什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆