Why does reading and writing to the same file in a pipeline produce unreliable results?

Question

I have a bunch of files that contain many blank lines, and I want to remove any repeated blank lines to make the files easier to read. I wrote the following script:

#!/bin/bash
for file in * ; do cat "$file" | sed 's/^ +//' | cat -s > "$file" ; done

However, this had very unreliable results, with most files becoming completely empty and only a few files having the intended results. What's more, the files that did work seemed to change randomly every time I retried, as different files would get correctly edited in every run. What's going on?

Note: This is more of a theoretical question, because I realize I could use a workaround like:

#!/bin/bash
for file in * ; do 
    cat "$file" | sed 's/^ +//' | cat -s > "$file"-tmp
    rm "$file"
    mv "$file"-tmp "$file"
done

But that seems unnecessarily convoluted. So why is the "direct" method so unreliable?

Recommended answer

The unpredictability happens because there's a race condition between two stages in the pipeline, cat "$file" and cat -s > "$file".

The first tries to open the file and read from it, while the other tries to empty the file.

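  • If the file is emptied before it is read, you get an empty file.
  • If the file is read before it is emptied, you get some data (but the file is emptied shortly afterwards, so the result is truncated unless the file is very short).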
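
Both outcomes come from the same cause: the shell opens the target of > "$file" with truncation while setting up the last stage, independently of when the first cat gets to read. One way to avoid the race is to make sure the reader and the writer never share a file. The following is a minimal sketch of that idea, not the asker's or answerer's code: the function name squeeze_in_place is hypothetical, sed -E is used on the assumption that the original expression was meant to strip leading spaces, and mktemp and cat -s are assumed to be available.

```shell
#!/bin/bash
# Hypothetical helper: rewrite one file without ever reading from and
# truncating the same file at the same time.
squeeze_in_place() {
    local file=$1 tmp
    # Create the temporary file next to the original so that mv is a rename
    # within one filesystem.
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    # The pipeline reads "$file" and writes "$tmp": two different files,
    # so truncating the output cannot destroy the input.
    # Assumption: -E makes "+" mean "one or more", stripping leading spaces.
    if sed -E 's/^ +//' "$file" | cat -s > "$tmp"; then
        mv -- "$tmp" "$file"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

It could be called as for file in * ; do squeeze_in_place "$file" ; done. Note that mktemp creates the temporary file with restrictive permissions, so the rewritten file may end up with a different mode than the original.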

If you have GNU sed, you can simply do sed -i 'expression' *
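
As a concrete illustration of the sed -i suggestion, here is a hedged sketch that assumes GNU sed (BSD sed expects an argument after -i). GNU sed implements -i by writing to a temporary file and renaming it over the original, so it is the same temporary-file technique done for you. The function name squeeze_with_sed is hypothetical, and the expression only covers the cat -s part of the original pipeline (squeezing runs of empty lines), not the leading-space substitution.

```shell
#!/bin/bash
# Sketch assuming GNU sed; squeeze_with_sed is a hypothetical name.
squeeze_with_sed() {
    # -i      : edit each file through a temporary copy that replaces it
    # /^$/N   : on an empty line, append the next line to the pattern space
    # /^\n$/D : if both lines are empty, drop the first one and re-check
    sed -i -e '/^$/N' -e '/^\n$/D' -- "$@"
}
```

A call such as squeeze_with_sed * would process every file in the current directory, each one independently of the others.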
