Implementing atomic file writes in a nontransactional filesystem


Problem description


    Many common filesystems do not offer atomic operations, yet writing files in an atomic manner is very important in certain scenarios. I tried to come up with a solution for this problem.

    I made the following assumptions:

    • The filesystem in use supports atomic operations at inode level (for instance, NTFS). This means that move and delete are atomic.
    • Only the program itself accesses the files.
    • There is only 1 instance of the program at a time and it acts in a single-threaded manner.
    • For simplicity, the whole file content is written each time (i.e. truncate-write).

    This leaves the following problem: While writing a file, the program could be interrupted and the file would be left with only a part of the content to write.

    I propose the following process:

    1. Write new content to a temporary file New
    2. Move the original file Original to a temporary location Backup
    3. Move New to Original
    4. Delete Backup

    New and Backup files are distinguishable from Original files (for instance, they could be prefixed differently, or could be in a separate directory on the same volume). At the same time, their name should map directly to the corresponding Original (for instance by simply using the same file name).
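The four steps above can be sketched as follows. This is only an illustration under the stated assumptions (single instance, single thread). The `.new` and `.bak` suffixes and the function name `write_with_backup` are hypothetical, and the `os.fsync` call is an addition that the post does not discuss.

```python
import os
from pathlib import Path


def write_with_backup(original: Path, data: bytes) -> None:
    """Replace the content of `original` using the New/Backup scheme."""
    # Hypothetical naming: same file name, different suffix, same directory.
    new = original.with_name(original.name + ".new")
    backup = original.with_name(original.name + ".bak")

    # Step 1: write the full content to New.
    with open(new, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # assumption: push the data to disk before moving

    # Step 2: move Original to Backup (nothing to move on the first write).
    if original.exists():
        os.rename(original, backup)

    # Step 3: move New to Original.
    os.rename(new, original)

    # Step 4: delete Backup.
    if backup.exists():
        os.remove(backup)
```

Calling `write_with_backup(Path("settings.json"), b"{}")` would leave only `settings.json` behind once all four steps have completed.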

    This, however, does not make the operation atomic yet. The process could be interrupted during or after step 1, 2, 3 or 4:

    1. Leaves a potentially incomplete New.
    2. The move is atomic, but Original is now missing. Both New and Backup exist and are complete.
    3. The move is atomic, but an unused Backup remains. Original now holds the new content.
    4. Deletion is atomic.

    Using assumptions 2 and 3 from earlier, the program has to be restarted after a crash. During the startup process, it should perform these recovery checks:

    • If New exists but Backup does not, we crashed during or after step 1. Delete New since it could be incomplete.
    • If both New and Backup exist, we crashed after step 2. Continue with step 3.
    • If Backup exists but New does not, we crashed after step 3. Continue with step 4.

    The recovery process itself uses only atomic operations, so if it is interrupted, it simply continues where it left off on the next start.
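The three recovery checks can be sketched as a startup routine. This is a minimal illustration rather than a definitive implementation: it assumes the hypothetical `.new` and `.bak` suffixes for New and Backup, and the name `recover` is made up for this example.

```python
import os
from pathlib import Path


def recover(original: Path) -> None:
    """Run the startup checks for one Original file."""
    # Hypothetical naming scheme; must match whatever the writer uses.
    new = original.with_name(original.name + ".new")
    backup = original.with_name(original.name + ".bak")

    if new.exists() and not backup.exists():
        # Crashed during or after step 1: New may be incomplete, drop it.
        os.remove(new)
    elif new.exists() and backup.exists():
        # Crashed after step 2: Original is missing, so finish step 3,
        # then step 4.
        os.rename(new, original)
        os.remove(backup)
    elif backup.exists():
        # Crashed after step 3: only step 4 is left.
        os.remove(backup)
```

If the routine is itself interrupted between the rename and the removal, the next run lands in the third branch and finishes the cleanup.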

    I believe this idea ensures atomic writes for a single program. These issues still exist:

    • With multiple instances of the same program, the recovery process of one instance can interfere with file writes that are in progress in another instance.
    • Outside programs that only read but never write will usually get the correct result, but if there is a write operation on the requested entry at the same time, they may incorrectly find no entry.

    Those issues (which are excluded by the assumptions earlier) could be solved via usage policy (for instance, check for other instances, and deny directory access to other users).
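One possible way to "check for other instances" is a lock file created with an exclusive-create flag. The sketch below is an assumption about how such a policy might look, not something the post prescribes. The file name `.instance.lock` and both function names are hypothetical.

```python
import os
from pathlib import Path

LOCK_NAME = ".instance.lock"  # hypothetical lock file name


def acquire_instance_lock(directory: Path) -> bool:
    """Return True if this process obtained the lock, False if it is held."""
    lock = directory / LOCK_NAME
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record the owner for later inspection
    return True


def release_instance_lock(directory: Path) -> None:
    """Remove the lock file on clean shutdown."""
    (directory / LOCK_NAME).unlink()
```

A crash leaves the lock file behind, so a real program would also need some way to detect and clear a stale lock before running the recovery checks.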

    Finally, my question: Does this make sense, or is there a flaw in the process? Are there any issues that prevent this approach from being used in practice?

    Solution

    Your steps can be simplified further:

    1. Write new content to a temporary file New
    2. Delete Original file
    3. Move New to Original

    On startup:

    1. If Original does not exist but New does, the process was interrupted before step 3. Move New to Original.
    2. If Original and New both exist, the process was interrupted before step 2. Delete New.

    I have used this in managing configuration files and have never encountered a problem from this process.
