Atomicity of `write(2)` to a local filesystem


Question

Apparently POSIX states that

Either a file descriptor or a stream is called a "handle" on the open file description to which it refers; an open file description may have several handles. […] All activity by the application affecting the file offset on the first handle shall be suspended until it again becomes the active file handle. […] The handles need not be in the same process for these rules to apply. -- POSIX.1-2008

and

If two threads each call [the write() function], each call shall either see all of the specified effects of the other call, or none of them. -- POSIX.1-2008

My understanding of this is that when the first process issues a write(handle, data1, size1) and the second process issues write(handle, data2, size2), the writes can occur in any order but the data1 and data2 must be both pristine and contiguous.

But running the following code gives me unexpected results.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/wait.h>

/* Print the failing step together with errno, then abort. */
static void die(const char *s)
{
  perror(s);
  abort();
}

int main(void)
{
  unsigned char buffer[3];
  const char *filename = "/tmp/atomic-write.log";
  int fd, i, j;
  pid_t pid;

  unlink(filename);
  /* XXX Adding O_APPEND to the flags cures it. Why? */
  fd = open(filename, O_CREAT|O_WRONLY/*|O_APPEND*/, 0644);
  if (fd < 0)
    die("open failed");
  /* All 10 children inherit fd, so they share one open file
     description and therefore one file offset. */
  for (i = 0; i < 10; i++) {
    pid = fork();
    if (pid < 0)
      die("fork failed");
    else if (! pid) {
      /* Each record is '-', a letter identifying the child, and a newline. */
      j = 3 + i % (sizeof(buffer) - 2);
      memset(buffer, i % 26 + 'A', sizeof(buffer));
      buffer[0] = '-';
      buffer[j - 1] = '\n';
      for (i = 0; i < 1000; i++)
        if (write(fd, buffer, j) != j)
          die("write failed");
      exit(0);
    }
  }
  /* Reap every child before exiting. */
  while (wait(NULL) != -1)
    /* NOOP */;
  exit(0);
}

I tried running this on Linux and Mac OS X 10.7.4. Running grep -a '^[^-]\|^..*-' /tmp/atomic-write.log shows that some writes are not contiguous or overlap (Linux) or are plainly corrupted (Mac OS X).
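For readers without a grep that supports this alternation syntax, the same check can be sketched in C. This is only an illustration of what the pattern above looks for; the function name count_bad_records is hypothetical and not part of the original program. It counts lines whose first character is not '-' or that contain a '-' after the first character.

```c
#include <stdio.h>

/* Count lines that the grep pattern would flag: a line whose first
   character is not '-', or a line with a '-' anywhere after the first
   character. Empty lines are not flagged. Returns -1 if the file
   cannot be opened. (Hypothetical helper, for illustration only.) */
long count_bad_records(const char *path)
{
  FILE *f = fopen(path, "rb");
  long bad = 0;
  long pos = 0;      /* index of the next character within the current line */
  int flagged = 0;   /* has the current line already been marked bad? */
  int c;

  if (!f)
    return -1;
  while ((c = fgetc(f)) != EOF) {
    if (c == '\n') {
      bad += flagged;
      pos = 0;
      flagged = 0;
      continue;
    }
    if ((pos == 0 && c != '-') || (pos > 0 && c == '-'))
      flagged = 1;
    pos++;
  }
  bad += flagged;    /* a final line without a trailing newline */
  fclose(f);
  return bad;
}
```

Calling count_bad_records("/tmp/atomic-write.log") after a run of the test program should return 0 if every record landed intact, and a positive count otherwise.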

Adding the flag O_APPEND in the open(2) call fixes this problem. Nice, but I do not understand why. POSIX says

O_APPEND If set, the file offset shall be set to the end of the file prior to each write.

but this is not the problem here. My sample program never calls lseek(2); the processes share the same file description and thus the same file offset.
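The sharing of the file offset across fork() can be observed directly. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; the function name offset_seen_by_parent and the scratch file name are hypothetical. A child writes 5 bytes through the inherited descriptor, and the parent, which never writes or seeks, then reads back its own offset.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's file offset after a forked child wrote 5 bytes
   through the inherited descriptor, or -1 on error.
   (Hypothetical helper, for illustration only.) */
off_t offset_seen_by_parent(void)
{
  char path[] = "/tmp/shared-offset-XXXXXX";  /* hypothetical scratch file */
  int fd = mkstemp(path);
  pid_t pid;
  off_t off;

  if (fd < 0)
    return -1;
  unlink(path);                  /* the file stays usable through fd */
  pid = fork();
  if (pid < 0) {
    close(fd);
    return -1;
  }
  if (pid == 0) {
    /* Child: write through the inherited descriptor, then leave. */
    _exit(write(fd, "hello", 5) == 5 ? 0 : 1);
  }
  if (waitpid(pid, NULL, 0) < 0) {
    close(fd);
    return -1;
  }
  off = lseek(fd, 0, SEEK_CUR);  /* the parent itself never wrote or seeked */
  close(fd);
  return off;
}
```

The function returns 5: the child's write advanced the offset that the parent sees, because both descriptors refer to one open file description.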

I have already read similar questions on Stack Overflow, but they still do not fully answer my question.

"Atomic write on file from two process" does not specifically address the case where the processes share the same file description (as opposed to the same file).

"How does one programmatically determine if "write" system call is atomic on a particular file?" says that

The write call as defined in POSIX has no atomicity guarantee at all.

But as cited above, it does have some. What is more, O_APPEND seems to trigger this atomicity guarantee, although it seems to me that the guarantee should be present even without O_APPEND.

Can you explain this behaviour further?

Answer

man 2 write on my system sums it up nicely:

Note that not all file systems are POSIX conforming.

Here is a quote from a recent discussion on the ext4 mailing list:

Currently concurrent reads/writes are atomic only wrt individual pages, however are not on the system call. This may cause read() to return data mixed from several different writes, which I do not think it is good approach. We might argue that application doing this is broken, but actually this is something we can easily do on filesystem level without significant performance issues, so we can be consistent. Also POSIX mentions this as well and XFS filesystem already has this feature.

This is a clear indication that ext4 -- to name just one modern filesystem -- doesn't conform to POSIX.1-2008 in this respect.
