Bash Command Substitution Giving Weird Inconsistent Output

Question

For reasons not relevant to this question, I am running a Java server from a bash script, not directly but via command substitution in a separate sub-shell, and in the background. The intent is for the subcommand to return the process id of the Java server as its standard output. The fragment in question is as follows:

launch_daemon()
{
  /bin/bash <<EOF
     $JAVA_HOME/bin/java $JAVA_OPTS -jar $JAR_FILE daemon $PWD/config/cl.yml <&- &
     pid=\$!
     echo \${pid} > $PID_FILE
     echo \${pid}   
EOF
}

daemon_pid=$(launch_daemon)

echo ${daemon_pid} > check.out

The Java daemon in question prints to standard error and quits if there is a problem in initialization; otherwise it closes standard output and standard error and continues on its way. Later in the script (not shown) I do a check to make sure the server process is running. Now on to the problem.

Whenever I check the $PID_FILE above, it contains the correct process id on one line.

But when I check the file check.out, it sometimes contains the correct id, and other times it contains the process id repeated twice on the same line, separated by a space character, as in:

34056 34056

I use the variable $daemon_pid later in the script to check whether the server is running, so if it contains the pid repeated twice, this totally throws off the test and it incorrectly concludes that the server is not running. Fiddling with the script on my server box running CentOS Linux (putting in more echo statements, etc.) seems to flip the behavior back to the correct one, with $daemon_pid containing the process id just once. But if I assume that has fixed it, check the script in to my source code repo, and do a build and deploy again, I start seeing the same bad behavior.

For now I have fixed this by assuming that $daemon_pid could be bad and passing it through awk as follows:

mypid=$(echo ${daemon_pid} | awk '{ gsub(" +.*",""); print $0 }')

Then $mypid always contains the correct process id and things are fine, but needless to say I'd like to understand why it behaves the way it does. And before you ask, I have looked and looked but the Java server in question does NOT print its process id to its standard out before closing standard out.
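
The same workaround can be sketched without awk, using shell parameter expansion. One detail worth noting: because check.out is written with an unquoted `echo ${daemon_pid}`, a value that actually holds two newline-separated lines would also show up as two words separated by a space, so this sketch strips at the first whitespace character of any kind. The sample value below is made up for illustration.

```bash
# Sample value standing in for a doubled result (two lines, as two echo calls would produce).
daemon_pid=$(printf '34056\n34056')

# Remove the longest suffix that starts at the first whitespace character,
# leaving only the first word.
mypid=${daemon_pid%%[[:space:]]*}

echo "$mypid"
```

This only hides the symptom, exactly as the awk version does; it does not explain why the value is doubled.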

Would really appreciate expert input.

Recommended answer

Following the hint by @WilliamPursell, I tracked this down in the bash source code. I honestly don't know whether it is a bug or not; all I can say is that it seems like an unfortunate interaction with a questionable use case.

TL;DR: You can fix the problem by removing <&- from the script.

Closing stdin is at best questionable, not just for the reason mentioned by @JonathanLeffler ("Programs are entitled to have a standard input that's open.") but more importantly because stdin is being used by the bash process itself and closing it in the background causes a race condition.

In order to see what's going on, consider the following rather odd script, which might be called Duff's Bash Device, except that I'm not sure even Duff would approve. (As presented, it's not that useful. But someone somewhere has used it in some hack. Or, if not, they will now that they've seen it.)

/bin/bash <<EOF
if (($1<8)); then head -n-$1 > /dev/null; fi
echo eight
echo seven
echo six
echo five
echo four
echo three
echo two
echo one
EOF

For this to work, bash and head both have to be prepared to share stdin, including sharing the file position. That means that bash needs to make sure that it flushes its read buffer (or not buffer), and head needs to make sure that it seeks back to the end of the part of the input which it uses.

(The hack only works because bash handles here-documents by copying them into a temporary file. If it used a pipe, it wouldn't be possible for head to seek backwards.)
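
The shared file position can be seen in a smaller, deterministic form. The following is only a sketch: it uses the `read` builtin and `cat` rather than `head`, and a temporary file created with `mktemp` stands in for the file that backs the here-document. The shell consumes one line from the file, and the child process then continues from wherever the shell left the offset.

```bash
# Temporary regular file standing in for the here-document's backing file.
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

{
  read -r first   # the shell itself consumes exactly the first line
  rest=$(cat)     # a child process continues from the shared file offset
} < "$tmp"

rm -f "$tmp"

printf 'first=%s\n' "$first"
printf 'rest=%s\n' "$rest"
```

Because both readers share one open file description, `cat` sees only the lines that the shell has not consumed. The same sharing is what makes a seek performed by one process visible to the other.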

Now, what would have happened if head had run in the background? The answer is, "just about anything is possible", because bash and head are racing to read from the same file descriptor. Running head in the background would be a really bad idea, even worse than the original hack which is at least predictable.

Now, let's go back to the actual program at hand, simplified to its essentials:

/bin/bash <<EOF
cmd <&- &
echo \$!
EOF

Line 2 of this program (cmd <&- &) forks off a separate process (to run in the background). In that process, it closes stdin and then invokes cmd.

Meanwhile, the foreground process continues reading commands from stdin (its stdin fd hasn't been closed, so that's fine), which causes it to execute the echo command.

Now here's the rub: bash knows that it needs to share stdin, so it can't just close stdin. It needs to make sure that stdin's file position is pointing to the right place, even though it may have actually read ahead a buffer's worth of input. So just before it closes stdin, it seeks backwards to the end of the current command line. [1]

If that seek happens before the foreground bash executes echo, then there is no problem. And if it happens after the foreground bash is done with the here-document, also no problem. But what if it happens while the echo is working? In that case, after the echo is done, bash will reread the echo command because stdin has been rewound, and the echo will be executed again.

And that's precisely what is happening in the OP. Sometimes, the background seek completes at just the wrong time, and causes echo \${pid} to be executed twice. In fact, it also causes echo \${pid} > $PID_FILE to execute twice, but that line is idempotent; had it been echo \${pid} >> $PID_FILE, the double execution would have been visible.

So the solution is simple: remove <&- from the server start-up line, and optionally replace it with </dev/null if you want to make sure the server can't read from stdin.
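
As a rough sketch of that change, here is the launch function from the question with `<&-` replaced by `</dev/null`. Several things here are assumptions made so the example can run on its own: `sleep 2` is a stand-in for the Java command line, the PID file is a temporary file, and `bash` is looked up on the PATH. The stand-in also redirects its output to /dev/null, since it does not close standard output by itself the way the real daemon is described as doing, and the command substitution would otherwise wait for it.

```bash
# Hypothetical PID file location for this sketch.
PID_FILE=$(mktemp)

launch_daemon()
{
  bash <<EOF
     # 'sleep 2' is a placeholder for the real server command.
     # stdin comes from /dev/null instead of being closed with <&-.
     sleep 2 </dev/null >/dev/null 2>&1 &
     pid=\$!
     echo \${pid} > "$PID_FILE"
     echo \${pid}
EOF
}

daemon_pid=$(launch_daemon)
echo "${daemon_pid}"
```

With stdin redirected rather than closed, the background process no longer performs the seek on the shared here-document descriptor that is described above.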

Notes:

Note 1: For those more familiar with bash source code and its expected behaviour than I am, I believe that the seek and close take place at the end of case r_close_this: in function do_redirection_internal in redir.c, at approximately line 1093:

check_bash_input (redirector);
close_buffered_fd (redirector);

The first call does the lseek and the second one does the close. I saw the behaviour using strace -f and then searched the code for a plausible looking lseek, but I didn't go to the trouble of verifying in a debugger.
