什么时候命令替换会产生比相同命令更多的子shell? [英] When does command substitution spawn more subshells than the same commands in isolation?

查看:19
本文介绍了什么时候命令替换会产生比相同命令更多的子shell?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

昨天有人建议我在 bash 中使用命令替换会导致生成一个不必要的子 shell.该建议专门针对此用例:

Yesterday it was suggested to me that using command substitution in bash causes an unnecessary subshell to be spawned. The advice was specific to this use case:

# Extra subshell spawned
foo=$(command; echo $?)

# No extra subshell
command
foo=$?

尽我所能,这对于这个用例来说似乎是正确的.然而,试图验证这一点的快速搜索会导致大量令人困惑和矛盾的建议.似乎流行的智慧说所有命令替换的使用都会产生一个子shell.例如:

As best I can figure this appears to be correct for this use case. However, a quick search trying to verify this leads to reams of confusing and contradictory advice. It seems popular wisdom says ALL usage of command substitution will spawn a subshell. For example:

命令替换扩展到命令的输出.这些命令在子 shell 中执行,它们的标准输出数据是替换语法扩展到的.(来源)

The command substitution expands to the output of commands. These commands are executed in a subshell, and their stdout data is what the substitution syntax expands to. (source)

除非您继续挖掘,否则这似乎很简单,在这种情况下,您将开始找到对并非如此的建议的参考.

This seems simple enough unless you keep digging, on which case you'll start finding references to suggestions that this is not the case.

命令替换不一定会调用子shell,并且在大多数情况下不会.它唯一保证的是乱序求值:它简单地首先求值替换中的表达式,然后使用替换的结果求值周围的语句.(来源)

Command substitution does not necessarily invoke a subshell, and in most cases won't. The only thing it guarantees is out-of-order evaluation: it simply evaluates the expressions inside the substitution first, then evaluates the surrounding statement using the results of the substitution. (source)

这看起来很有道理,但这是真的吗?这个与子shell相关问题的答案让我知道man bash需要注意:

This seems reasonable, but is it true? This answer to a subshell related question tipped me off that man bash has this to note:

管道中的每个命令都作为一个单独的进程(即在子shell中)执行.

Each command in a pipeline is executed as a separate process (i.e., in a subshell).

这让我想到了主要问题.究竟是什么会导致命令替换产生一个无论如何都不会产生以单独执行相同命令的子shell?

This brings me to the main question. What, exactly, will cause command substitution to spawn a subshell that would not have been spawned anyway to execute the same commands in isolation?

请考虑以下情况并解释哪些会导致额外子 shell 的开销:

Please consider the following cases and explain which ones incur the overhead of an extra subshell:

# Case #1
command1
var=$(command1)

# Case #2
command1 | command2
var=$(command1 | command2)

# Case #3
command1 | command 2 ; var=$?
var=$(command1 | command2 ; echo $?)

这些对中的每一个是否都会产生相同数量的子shell来执行?POSIX 与 bash 实现有区别吗?在其他情况下,使用命令替换会产生一个子shell,而单独运行相同的命令集不会吗?

Do each of these pairs incur the same number of subshells to execute? Is there a difference in POSIX vs. bash implementations? Are there other cases where using command substitution would spawn a subshell where running the same set of commands in isolation would not?

推荐答案

更新和注意事项:

这个答案有一个麻烦的过去,因为我自信地声称事实并非如此.我相信它的当前形式有价值,但请帮助我消除其他不准确之处(或说服我应该完全删除它).

This answer has a troubled past in that I confidently claimed things that turned out not to be true. I believe it has value in its current form, but please help me eliminate other inaccuracies (or convince me that it should be deleted altogether).

在@kojiro 指出我的测试方法有缺陷(我最初使用 ps 来寻找子进程之后,我已经大幅修改了这个答案,而且大部分都被删掉了),但这太慢了,无法始终检测到它们);下面介绍一种新的测试方法.

I've substantially revised - and mostly gutted - this answer after @kojiro pointed out that my testing methods were flawed (I originally used ps to look for child processes, but that's too slow to always detect them); a new testing method is described below.

我最初声称并非所有 bash 子 shell 都在它们自己的子进程中运行,但事实证明并非如此.

I originally claimed that not all bash subshells run in their own child process, but that turns out not to be true.

正如@kojiro 在他的回答中所说的那样,一些 shell - 除了 bash - 有时会避免为子 shell 创建子进程,所以,通常在壳,不应假设子壳暗示子进程.

As @kojiro states in his answer, some shells - other than bash - DO sometimes avoid creation of child processes for subshells, so, generally speaking in the world of shells, one should not assume that a subshell implies a child process.

至于 bash 中的 OP 案例(假设 command{n} 实例是简单命令):

As for the OP's cases in bash (assumes that command{n} instances are simple commands):

# Case #1
command1         # NO subshell
var=$(command1)  # 1 subshell (command substitution)

# Case #2
command1 | command2         # 2 subshells (1 for each pipeline segment)
var=$(command1 | command2)  # 3 subshells: + 1 for command subst.

# Case #3
command1 | command2 ; var=$?         # 2 subshells (due to the pipeline)
var=$(command1 | command2 ; echo $?) # 3 subshells: + 1 for command subst.;
                                     #   note that the extra command doesn't add 
                                     #   one

看起来像使用命令替换 ($(...)) 总是在 bash 中添加一个额外的子外壳 - 就像在 (...).

It looks like using command substitution ($(...)) always adds an extra subshell in bash - as does enclosing any command in (...).

我相信,但不确定这些结果是否正确;这是我的测试方式(OS X 10.9.1 上的 bash 3.2.51) - 请告诉我这种方法是否有缺陷:

I believe, but am not certain these results are correct; here's how I tested (bash 3.2.51 on OS X 10.9.1) - please tell me if this approach is flawed:

  • 确保只有 2 个交互式 bash shell 正在运行:一个用于运行命令,另一个用于监控.
  • 在第二个 shell 中,我使用 sudo dtruss -t fork -f -p {pidOfShell1}(-f 还需要可传递地"跟踪 fork() 调用,即包括那些由子 shell 本身创建的调用).
  • 在测试命令中仅使用内置的 :(无操作)(以避免使用额外的 fork() 调用外部可执行文件来混淆图片);具体:

  • Made sure only 2 interactive bash shells were running: one to run the commands, the other to monitor.
  • In the 2nd shell I monitored the fork() calls in the 1st with sudo dtruss -t fork -f -p {pidOfShell1} (the -f is necessary to also trace fork() calls "transitively", i.e. to include those created by subshells themselves).
  • Used only the builtin : (no-op) in the test commands (to avoid muddling the picture with additional fork() calls for external executables); specifically:

  • :
  • $(:)
  • <代码>: |:
  • $(: | :)
  • <代码>: |:;:
  • $(: | :; :)

只计算那些包含非零 PID 的 dtruss 输出行(因为每个子进程也报告创建它的 fork() 调用,但是PID 为 0).

Only counted those dtruss output lines that contained a non-zero PID (as each child process also reports the fork() call that created it, but with PID 0).

以下是我从我的原始帖子中仍然认为是正确的:当 bash 创建子外壳时.

Below is what I still believe to be correct from my original post: when bash creates subshells.

bash 在以下情况下创建子 shell:

  • 用于括号括起来的表达式 ( (...) )
    • 除了直接在[[ ... ]]中,括号仅用于逻辑分组.
    • for an expression surrounded by parentheses ( (...) )
      • except directly inside [[ ... ]], where parentheses are only used for logical grouping.
      • 用于管道的每个段(|),包括第一个
        • 请注意,每个涉及的子shell在内容方面都是原始shell的克隆(从进程上看,子shell可以从其他子shell(在执行命令之前)).
          因此,在早期管道段中修改子外壳不会影响后面的.
          (按照设计,管道中的命令同时启动 - 排序仅通过它们连接的 stdin/stdout 管道发生.)
        • bash 4.2+ 有 shell 选项 lastpipe(默认关闭),这会导致 last 管道段不在子 shell 中运行.
        • for every segment of a pipeline (|), including the first one
          • Note that every subshell involved is a clone of the original shell in terms of content (process-wise, subshells can be forked from other subshells (before commands are executed)).
            Thus, modifications of subshells in earlier pipeline segments do not affect later ones.
            (By design, commands in a pipeline are launched simultaneously - sequencing only happens through their connected stdin/stdout pipes.)
          • bash 4.2+ has shell option lastpipe (OFF by default), which causes the last pipeline segment NOT to run in a subshell.
          • 用于命令替换 ($(...))

          用于进程替换 (<(...))

          • typically creates 2 subshells; in the case of a simple command, @konsolebox came up with a technique to only create 1: prepend the simple command with exec (<(exec ...)).
          • 后台执行(&)

          组合这些结构将产生不止一个子shell.

          Combining these constructs will result in more than one subshell.

          这篇关于什么时候命令替换会产生比相同命令更多的子shell?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆