Breaking down shell scripts; What happens under the hood?


Question

So, I was given this one-line script:

echo test | cat | grep test

Could you please explain to me how exactly it works in terms of the following system calls: pipe(), fork(), exec() and dup2()?

I am looking for a general overview here, mainly the sequence of operations. What I know so far is that the shell will fork using fork(), and that the script's code will replace the shell's code in the child by using exec(). But what about pipe() and dup2()? Where do they fit in?

Thanks.

Recommended answer

First consider a simpler example, such as:

echo test | cat

What we want is to execute echo in a separate process, arranging for its standard output to be diverted into the standard input of the process executing cat. Ideally this diversion, once set up, would require no further intervention by the shell: the shell would just calmly wait for both processes to exit.

The mechanism that achieves this is called a "pipe". It is an interprocess communication device implemented in the kernel and exported to user space. Once created by a Unix program, a pipe appears as a pair of file descriptors with the peculiar property that whatever you write into one of them can be read from the other. This is not very useful within a single process, but keep in mind that file descriptors, including but not limited to pipes, are inherited across fork() and even across exec(). This makes the pipe an easy-to-set-up and reasonably efficient IPC mechanism.
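A minimal sketch of both properties, written in Python only because its os module exposes pipe(), fork(), read() and write() as thin wrappers over the system calls of the same names. The function names pipe_roundtrip and pipe_across_fork are made up for this illustration, and a Unix-like system is assumed.

```python
import os

def pipe_roundtrip(data: bytes) -> bytes:
    """Write data into a pipe's write end and read it back from the read end."""
    r, w = os.pipe()              # r: read end, w: write end
    os.write(w, data)             # assumes data is small enough for the pipe buffer
    os.close(w)                   # no writers left, so the reader can see EOF
    received = os.read(r, len(data) or 1)
    os.close(r)
    return received

def pipe_across_fork(data: bytes) -> bytes:
    """The child inherits both descriptors; it writes, the parent reads."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                  # child process
        try:
            os.close(r)           # the child only writes
            os.write(w, data)
        finally:
            os._exit(0)           # never fall back into the caller's code
    os.close(w)                   # the parent only reads
    chunks = []
    while chunk := os.read(r, 4096):   # b"" signals EOF
        chunks.append(chunk)
    os.close(r)
    os.waitpid(pid, 0)            # reap the child
    return b"".join(chunks)
```

The first function shows the "write into one, read from the other" property inside a single process; the second shows the same pipe surviving a fork(), which is what makes it useful for IPC.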

The shell creates the pipe and now owns a pair of file descriptors belonging to it, one for reading and one for writing. These file descriptors are inherited by both forked subprocesses. If only echo were writing to the pipe's write-end descriptor instead of its actual standard output, and cat were reading from the pipe's read-end descriptor instead of its standard input, everything would work. But they aren't, and this is where dup2 comes into play.

dup2(oldfd, newfd) duplicates the file descriptor oldfd as newfd, automatically closing newfd beforehand if it was open. For example, dup2(15, 1) will close file descriptor 1 (by convention used for the standard output) and reopen it as a copy of file descriptor 15, meaning that writing to the standard output will in fact be equivalent to writing to file descriptor 15. The same applies to reading: dup2(8, 0) will make reading from file descriptor 0 (the standard input) equivalent to reading from file descriptor 8. If we then close the original file descriptor, the open file (or pipe) will have been effectively moved from the original descriptor to the new one, much like sci-fi teleports that work by first duplicating a piece of matter at a remote location and then disintegrating the original.
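The "teleport" can be sketched in a single process, assuming the POSIX argument order dup2(oldfd, newfd), which Python's os.dup2 follows. The helper name capture_fd1 is hypothetical; it temporarily points file descriptor 1 at a pipe and then puts the real standard output back.

```python
import os

def capture_fd1(message: bytes) -> bytes:
    """Redirect fd 1 into a pipe, write to fd 1, and return what the pipe received."""
    r, w = os.pipe()
    saved = os.dup(1)         # keep a copy of the real standard output
    os.dup2(w, 1)             # fd 1 is closed, then reopened as a copy of w
    os.close(w)               # the write end now lives only at fd 1
    os.write(1, message)      # lands in the pipe, not on the terminal
    os.dup2(saved, 1)         # restore fd 1; this closes the pipe's last write end
    os.close(saved)
    data = os.read(r, 4096)   # assumes the message fits in one read
    os.close(r)
    return data
```

A shell does the same dup2-then-close pair, only it does so in a forked child just before exec and never needs to restore the original descriptor.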

If you're still following the theory, the order of operations performed by the shell should now be clear:

  1. The shell creates a pipe and then forks two processes, both of which inherit the pipe's file descriptors, r and w.

  2. In the subprocess about to execute echo, the shell calls dup2(w, 1); close(w) before exec in order to redirect the standard output to the write end of the pipe. This child has no use for r and should close it as well.

  3. In the subprocess about to execute cat, the shell calls dup2(r, 0); close(r) before exec in order to redirect the standard input to the read end of the pipe. This child must also close its inherited copy of w, or cat would itself be holding the write end open.
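The steps above can be sketched as follows. This is an illustration under assumptions, not how any particular shell is implemented: it assumes echo and cat exist as external programs on PATH (many shells run echo as a builtin), and the function name echo_test_pipe_cat is invented for the example. Note the argument order dup2(oldfd, newfd).

```python
import os
import sys

def echo_test_pipe_cat():
    """Run the equivalent of `echo test | cat`; return both exit codes."""
    sys.stdout.flush()                 # do not let children inherit buffered output
    r, w = os.pipe()                   # step 1: create the pipe

    echo_pid = os.fork()
    if echo_pid == 0:                  # step 2: the child that becomes echo
        try:
            os.dup2(w, 1)              # standard output -> write end
            os.close(w)
            os.close(r)                # echo never reads from the pipe
            os.execvp("echo", ["echo", "test"])
        finally:
            os._exit(127)              # reached only if exec failed

    cat_pid = os.fork()
    if cat_pid == 0:                   # step 3: the child that becomes cat
        try:
            os.dup2(r, 0)              # standard input <- read end
            os.close(r)
            os.close(w)                # otherwise cat would keep a write end open
            os.execvp("cat", ["cat"])
        finally:
            os._exit(127)

    # The parent closes both ends so that cat can see EOF, then waits.
    os.close(r)
    os.close(w)
    codes = []
    for pid in (echo_pid, cat_pid):
        _, status = os.waitpid(pid, 0)
        codes.append(os.WEXITSTATUS(status))
    return codes
```

cat's output goes to whatever standard output the caller already had, because only its standard input was redirected.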

After forking, the main shell process must itself close both ends of the pipe. One reason is to free the resources associated with the pipe once the subprocesses exit. The other is to allow cat to actually terminate: a pipe's reader receives EOF only after all copies of the write end of the pipe are closed. In the steps above, we did close the child's redundant copy of the write end, w (file descriptor 15 in the earlier example), right after duplicating it to 1. But the same descriptor also exists in the parent, since the children inherited it from there under that number, and the parent's copy can only be closed by the parent. If the parent fails to do so, cat's standard input never reports EOF and the cat process hangs as a consequence.
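The EOF rule can be observed without forking at all. In the sketch below, os.dup stands in for the extra copy of the write end that the parent shell would be holding; the read end is made non-blocking purely so that the demonstration reports "would block" instead of hanging the way cat would. The function name is hypothetical.

```python
import os

def eof_needs_all_writers_closed():
    """Show that a pipe reader sees EOF only once every write-end copy is closed."""
    r, w = os.pipe()
    w_copy = os.dup(w)            # stands in for the parent's inherited copy
    os.set_blocking(r, False)     # report "no data yet" instead of hanging
    os.close(w)                   # one copy closed, one still open
    try:
        os.read(r, 1)
        before = "eof"
    except BlockingIOError:
        before = "would block"    # a blocking reader such as cat would hang here
    os.close(w_copy)              # the last write end is now closed
    after = "eof" if os.read(r, 1) == b"" else "data"
    os.close(r)
    return before, after
```

While any copy of the write end remains open, the kernel cannot rule out more data arriving, so the reader is kept waiting.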

This mechanism is easily generalized to three or more processes connected by pipes. In the case of three processes, the pipes need to be arranged so that echo's output is written to cat's input, and cat's output is written to grep's input. This requires two calls to pipe(), three calls to fork(), four calls to dup2(), each followed by a close() (one for echo, one for grep and two for cat), three calls to exec(), and four additional calls to close() in the parent shell (two for each pipe). As before, each child should also close any inherited pipe ends it does not use.
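A rough generalization to any number of stages might look like the sketch below. Two things in it are assumptions that go beyond the description above: the helper name run_pipeline is invented, and an extra "capture" pipe is attached to the last stage so the function can return the final output instead of printing it (a real shell would leave the last stage's standard output alone). It also creates each pipe just before forking the stage that writes to it, so the exact number of close() calls differs from the tally above.

```python
import os
import sys

def run_pipeline(commands):
    """Run argv lists connected by pipes; return (final output, exit codes)."""
    sys.stdout.flush()
    capture_r, capture_w = os.pipe()    # assumption: extra pipe to collect output
    pids = []
    prev_r = None                       # read end feeding the next stage
    for i, argv in enumerate(commands):
        last = i == len(commands) - 1
        next_r, next_w = (None, capture_w) if last else os.pipe()
        pid = os.fork()
        if pid == 0:                    # child: wire up fds, then exec
            try:
                if prev_r is not None:
                    os.dup2(prev_r, 0)  # stdin <- previous stage
                os.dup2(next_w, 1)      # stdout -> next stage (or capture pipe)
                # Close every inherited pipe end; fds 0 and 1 keep the pipes alive.
                for fd in {prev_r, next_r, next_w, capture_r, capture_w} - {None}:
                    os.close(fd)
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)           # reached only if exec failed
        pids.append(pid)
        # Parent: drop the ends it no longer needs before the next fork.
        if prev_r is not None:
            os.close(prev_r)
        if not last:
            os.close(next_w)
        prev_r = next_r
    os.close(capture_w)                 # so the read loop below can reach EOF
    chunks = []
    while chunk := os.read(capture_r, 4096):
        chunks.append(chunk)
    os.close(capture_r)
    codes = [os.WEXITSTATUS(os.waitpid(pid, 0)[1]) for pid in pids]
    return b"".join(chunks), codes
```

For example, run_pipeline([["echo", "test"], ["cat"], ["grep", "test"]]) corresponds to the one-line script from the question, assuming echo, cat and grep are available on PATH.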
