A semantics for Bash scripts?


Question


More than any other language I know, I've "learned" Bash by Googling every time I need some little thing. Consequently, I can patchwork together little scripts that appear to work. However, I don't really know what's going on, and I was hoping for a more formal introduction to Bash as a programming language. For example: What is the evaluation order? what are the scoping rules? What is the typing discipline, e.g. is everything a string? What is the state of the program -- is it a key-value assignment of strings to variable names; is there more than that, e.g. the stack? Is there a heap? And so on.


I thought to consult the GNU Bash manual for this kind of insight, but it doesn't seem to be what I want; it's more of a laundry list of syntactic sugar rather than an explanation of the core semantic model. The million-and-one "bash tutorials" online are only worse. Perhaps I should first study sh, and understand Bash as a syntactic sugar on top of this? I don't know if this is an accurate model, though.

Any suggestions?

Edit: I've been asked to provide examples of what I'm ideally looking for. A rather extreme example of what I would consider a "formal semantics" is this paper on "the essence of JavaScript". Perhaps a slightly less formal example is the Haskell 2010 report.

Recommended answer


A shell is an interface to the operating system. It is usually a more-or-less robust programming language in its own right, but with features designed to make it easy to interact specifically with the operating system and filesystem. The semantics of the POSIX shell (hereafter just "the shell") are a bit of a mutt, combining some features of LISP (s-expressions have a lot in common with shell word splitting) and C (much of the shell's arithmetic syntax and semantics comes from C).


The other root of the shell's syntax comes from its upbringing as a mishmash of individual UNIX utilities. Most of what are often builtins in the shell can actually be implemented as external commands. It throws many shell neophytes for a loop when they realize that /bin/[ exists on many systems.

$ if '/bin/[' -f '/bin/['; then echo t; fi # Tested as-is on OS X, without the `]`
t

Wat?


This makes a lot more sense if you look at how a shell is implemented. Here's an implementation I did as an exercise. It's in Python, but I hope that's not a hangup for anyone. It's not terribly robust, but it is instructive:

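#!/usr/bin/env python
'''Hacky barebones shell.'''

from __future__ import print_function
import os, sys

# Python 2/3 compatibility: use raw_input where it exists
try:
  input = raw_input
except NameError:
  pass

def main():
  while True:
    try:
      cmd = input('prompt> ')
    except EOFError:
      # Ctrl-D ends the session
      print()
      break
    # "Expansion": naive whitespace splitting, no quoting or globbing
    args = cmd.split()
    if not args:
      continue
    cpid = os.fork()
    if cpid == 0:
      # We're in a child process: replace it with the command.
      # execl does no PATH search, so args[0] must be a path such as /bin/ls
      try:
        os.execl(args[0], *args)
      except OSError as e:
        print('%s: %s' % (args[0], e.strerror), file=sys.stderr)
        os._exit(127)
    else:
      # We're in the parent: wait for the child to finish
      os.waitpid(cpid, 0)

if __name__ == '__main__':
  main()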


I hope the above makes it clear that the execution model of a shell is pretty much:

1. Expand words.
2. Assume the first word is a command.
3. Execute that command with the following words as arguments.


Expansion, command resolution, execution. All of the shell's semantics are bound up in one of these three things, although they're far richer than the implementation I wrote above.
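As a small illustration of those three steps, here is a sketch that prints the words a command actually receives once expansion is done. The names `show_words`, `cmd` and `greeting` are made up for this example; nothing here is specific to how bash implements the steps internally.

```bash
#!/usr/bin/env bash
# show_words is a hypothetical helper: it prints each word it receives in brackets.
show_words() {
  local w
  for w in "$@"; do
    printf '[%s]' "$w"
  done
  printf '\n'
}

greeting='hello   world'
cmd=show_words

# 1. Expansion: $cmd and $greeting are replaced, then unquoted results are split.
# 2. Command resolution: the first word, show_words, is looked up (here, a function).
# 3. Execution: the remaining words become its arguments.
$cmd $greeting      # prints [hello][world]
$cmd "$greeting"    # prints [hello   world]
```

Note that the command name itself came out of an expansion: by the time the shell resolves a command, it only sees words.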


Not all commands fork. In fact, there are a handful of commands that don't make a ton of sense implemented as externals (such that they would have to fork), but even those are often available as externals for strict POSIX compliance.
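One way to see why some commands cannot sensibly be external: a child process cannot change its parent's working directory, so `cd` has to run inside the shell itself. A rough sketch, using only the standard `type` builtin (the variable names are illustrative):

```bash
#!/usr/bin/env bash
# Ask the shell how it would resolve each name.
kind_cd=$(type -t cd)      # builtin
kind_test=$(type -t '[')   # builtin in bash, even if /bin/[ also exists
kind_if=$(type -t if)      # keyword: part of the grammar, not a command
echo "cd: $kind_cd, [: $kind_test, if: $kind_if"

start=$PWD
( cd / )                   # the subshell changes directory, then exits
after_subshell=$PWD        # unchanged: the parent shell never moved
cd / && moved=$PWD         # the builtin changes the current shell itself
cd "$start" || exit
```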


Bash builds upon this base by adding new features and keywords to enhance the POSIX shell. It is nearly compatible with sh, and bash is so ubiquitous that some script authors go years without realizing that a script may not actually work on a POSIXly strict system. (I also wonder how people can care so much about the semantics and style of one programming language, and so little for the semantics and style of the shell, but I digress.)


Evaluation order is a bit of a trick question: Bash interprets expressions in its primary syntax from left to right, but in its arithmetic syntax it follows C precedence. Expressions differ from expansions, though. From the EXPANSION section of the bash manual:


The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and pathname expansion.
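The ordering has visible consequences. The sketch below shows two of them; the variable names are made up for the example, and `set --` is used only as a convenient way to count the resulting words.

```bash
#!/usr/bin/env bash
n=3
# Brace expansion runs first, while $n is still unexpanded, so {1..$n} is not
# a valid sequence and is left alone; only afterwards is $n replaced by 3.
literal=$(echo {1..$n})     # {1..3}
expanded=$(echo {1..3})     # 1 2 3

# Parameter expansion happens before word splitting, so an unquoted value
# containing spaces turns into several words.
v='a b'
set -- $v
split_count=$#              # 2
set -- "$v"
quoted_count=$#             # 1
echo "$literal | $expanded | $split_count | $quoted_count"
```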


If you understand word splitting, pathname expansion and parameter expansion, you are well on your way to understanding most of what bash does. Note that pathname expansion coming after word splitting is critical, because it ensures that a file with whitespace in its name can still be matched by a glob as a single word. This is why good use of glob expansions is generally better than parsing the output of commands.
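To make the glob-versus-parsing point concrete, here is a sketch that builds two lists of the same files, once with a glob and once by splitting the output of `ls`. It assumes `mktemp -d` is available, and the file names are invented for the example.

```bash
#!/usr/bin/env bash
# Work in a throwaway directory (mktemp -d is assumed to be available).
tmp=$(mktemp -d) || exit
touch "$tmp/plain.txt" "$tmp/two words.txt"

# Each glob match stays one word, spaces and all.
files=( "$tmp"/*.txt )
glob_count=${#files[@]}          # 2

# Splitting the output of ls on whitespace breaks "two words.txt" apart.
parsed=( $(ls "$tmp") )
parsed_count=${#parsed[@]}       # 3

echo "glob: $glob_count, parsed ls: $parsed_count"
rm -rf "$tmp"
```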


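Much like old ECMAScript, the shell treats names as global unless you explicitly declare them within a function (with local or declare). Even then the scoping is dynamic: a local name is visible to the functions called from that function.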

$ foo() { echo $x; }
$ bar() { local x; echo $x; }
$ foo

$ bar

$ x=123
$ foo
123
$ bar

$ …
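A quick sketch of what "dynamic" means in practice: a function sees whichever binding of a name is live when it is called, including a caller's local. The function names are made up for the example.

```bash
#!/usr/bin/env bash
x=global
show_x() { echo "$x"; }

caller_with_local() {
  local x=local    # visible here and in anything called from here
  show_x
}

from_function=$(caller_with_local)   # local
from_top=$(show_x)                   # global
echo "$from_function / $from_top"
```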


Environment and process "scope"

Subshells inherit the variables of their parent shells, but other kinds of processes don't inherit unexported names.

$ x=123
$ ( echo $x )
123
$ bash -c 'echo $x'

$ export x
$ bash -c 'echo $x'
123
$ y=123 bash -c 'echo $y' # another way to transiently export a name
123


You can combine these scoping rules:

$ foo() {
>   local -x bar=123 # Export bar, but only in this scope
>   bash -c 'echo $bar'
> }
$ foo
123
$ echo $bar

$
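Inheritance only goes one way: a subshell gets a copy of the parent's variables, and changes made in the copy never come back. A sketch, assuming bash's default settings (in particular, the `lastpipe` option is off):

```bash
#!/usr/bin/env bash
x=1
( x=2 )                 # the assignment happens in a subshell's copy
after_subshell=$x       # 1

unset v
echo hello | read -r v  # each pipeline stage runs in its own subshell by default
after_pipe=${v-unset}   # unset

read -r v <<< hello     # no pipeline, so read runs in the current shell
after_herestring=$v     # hello
echo "$after_subshell $after_pipe $after_herestring"
```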


Typing discipline

Um, types. Yeah. Bash really doesn't have types, and everything expands to a string (or perhaps a word would be more appropriate.) But let's examine the different types of expansions.


Pretty much anything can be treated as a string. Barewords in bash are strings whose meaning depends entirely on the expansion applied to them.


It may be worthwhile to demonstrate that a bare word really is just a word, and that quotes change nothing about that.

$ echo foo
foo
$ 'echo' foo
foo
$ "echo" foo
foo

Substring expansion

$ fail='echoes'
$ set -x # So we can see what's going on
$ "${fail:0:-2}" Hello World
+ echo Hello World
Hello World


For more on expansions, read the Parameter Expansion section of the manual. It's quite powerful.
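A few of the more commonly used forms, as a sketch. The path is an arbitrary example string; no file needs to exist, since these expansions only manipulate text.

```bash
#!/usr/bin/env bash
path=/var/log/messages.tar.gz

base=${path##*/}        # messages.tar.gz  (strip longest prefix matching */)
dir=${path%/*}          # /var/log         (strip shortest suffix matching /*)
stem=${base%%.*}        # messages         (strip longest suffix matching .*)
ext=${base#*.}          # tar.gz           (strip shortest prefix matching *.)
len=${#base}            # 15               (length in characters)
unset maybe
fallback=${maybe:-none} # none             (default when unset or empty)
swapped=${path/log/tmp} # /var/tmp/messages.tar.gz (replace first match)
echo "$dir | $base | $stem | $ext | $len | $fallback | $swapped"
```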


You can imbue names with the integer attribute to tell the shell to treat the right-hand side of assignments as arithmetic. The expression is evaluated as integer math when the value is assigned, and the parameter still expands to … a string.

$ foo=10+10
$ echo $foo
10+10
$ declare -i foo
$ foo=$foo # Must re-evaluate the assignment
$ echo $foo
20
$ echo "${foo:0:1}" # Still just a string
2
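The difference between a name with the integer attribute and a plain one shows up most clearly with `+=`. A sketch with illustrative variable names:

```bash
#!/usr/bin/env bash
declare -i n
n=2+3*4        # evaluated on assignment, with C precedence: 14
n+=1           # += on an integer name adds: 15

s=2+3*4        # no attribute: stored as the string 2+3*4
s+=1           # += on a plain name appends: 2+3*41

via_arith=$(( 2 + 3 * 4 ))     # arithmetic expansion needs no attribute: 14
via_parens=$(( (2 + 3) * 4 ))  # 20
echo "$n $s $via_arith $via_parens"
```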


Arrays

Arguments and Positional Parameters

Before talking about arrays it might be worth discussing positional parameters. The arguments to a shell script can be accessed using numbered parameters, $1, $2, $3, etc. You can access all these parameters at once using "$@", which expansion has many things in common with arrays. You can set and change the positional parameters using the set or shift builtins, or simply by invoking the shell or a shell function with these parameters:

$ bash -c 'for ((i=1;i<=$#;i++)); do
>   printf "\$%d => %s\n" "$i" "${@:i:1}"
> done' -- foo bar baz
$1 => foo
$2 => bar
$3 => baz
$ showpp() {
>   local i
>   for ((i=1;i<=$#;i++)); do
>     printf '$%d => %s\n' "$i" "${@:i:1}"
>   done
> }
$ showpp foo bar baz
$1 => foo
$2 => bar
$3 => baz
$ showshift() {
>   shift 3
>   showpp "$@"
> }
$ showshift foo bar baz biz quux xyzzy
$1 => biz
$2 => quux
$3 => xyzzy
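How the positional parameters are expanded matters as much as how they are set. The sketch below counts the words produced by the three usual spellings; `count` is a hypothetical helper that just reports how many arguments it received.

```bash
#!/usr/bin/env bash
count() { echo "$#"; }

set -- 'a b' c               # replace the positional parameters
quoted_at=$(count "$@")      # 2: one word per parameter
quoted_star=$(count "$*")    # 1: all parameters joined into one word
unquoted=$(count $@)         # 3: each parameter is split again

joined="$*"                  # a b c (joined with the first character of IFS)
shift                        # drop $1
remaining=$1                 # c
echo "$quoted_at $quoted_star $unquoted [$joined] $remaining"
```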


The bash manual also sometimes refers to $0 as a positional parameter. I find this confusing, because it doesn't include it in the argument count $#, but it is a numbered parameter, so meh. $0 is the name of the shell or the current shell script.


The syntax of arrays is modeled after positional parameters, so it's mostly healthy to think of arrays as a named kind of "external positional parameters", if you like. Arrays can be declared using the following approaches:

$ foo=( element0 element1 element2 )
$ bar[3]=element3
$ baz=( [12]=element12 [0]=element0 )

您可以通过索引访问数组元素:

You can access array elements by index:

$ echo "${foo[1]}"
element1

You can slice arrays:

$ printf '"%s"\n' "${foo[@]:1}"
"element1"
"element2"


If you treat an array as a normal parameter, you'll get the zeroth index.

$ echo "$baz"
element0
$ echo "$bar" # Even if the zeroth index isn't set

$ …


If you use quotes or backslashes to prevent wordsplitting, the array will maintain the specified wordsplitting:

$ foo=( 'elementa b c' 'd e f' )
$ echo "${#foo[@]}"
2
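The same care is needed when expanding the array again. A sketch, reusing the array from the transcript above with a hypothetical `count` helper:

```bash
#!/usr/bin/env bash
count() { echo "$#"; }
foo=( 'elementa b c' 'd e f' )

quoted=$(count "${foo[@]}")    # 2: one word per element
unquoted=$(count ${foo[@]})    # 6: every element is split on whitespace
joined=$(count "${foo[*]}")    # 1: all elements joined into a single word
echo "$quoted $unquoted $joined"
```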


The main differences between arrays and positional parameters are:



  1. Positional parameters are not sparse. If $12 is set, you can be sure $11 is set, too. (It could be set to the empty string, but $# will not be smaller than 12.) If "${arr[12]}" is set, there's no guarantee that "${arr[11]}" is set, and the length of the array could be as small as 1.
  2. The zeroth element of an array is unambiguously the zeroth element of that array. In positional parameters, the zeroth element is not the first argument, but the name of the shell or shell script.
  3. To shift an array, you have to slice and reassign it, like arr=( "${arr[@]:1}" ). You could also do unset arr[0], but that would make the first element at index 1.
  4. Arrays can be shared implicitly between shell functions as globals, but you have to explicitly pass positional parameters to a shell function for it to see those.
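The first and third points can be checked directly. In this sketch `${!arr[*]}` lists the indices that are actually set; the variable names are illustrative.

```bash
#!/usr/bin/env bash
arr=( [12]=twelve )
sparse_len=${#arr[@]}          # 1: only one element exists
sparse_idx="${!arr[*]}"        # 12

arr=( a b c )
arr=( "${arr[@]:1}" )          # "shift": b c, reindexed from 0
sliced_idx="${!arr[*]}"        # 0 1

unset 'arr[0]'                 # removes b but does not reindex
unset_idx="${!arr[*]}"         # 1
unset_first=${arr[0]-unset}    # unset: there is no zeroth element any more
echo "$sparse_len [$sparse_idx] [$sliced_idx] [$unset_idx] $unset_first"
```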


It's often convenient to use pathname expansions to create arrays of filenames:

$ dirs=( */ )
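One caveat worth knowing when building arrays this way: a pattern that matches nothing is left in place as literal text unless the `nullglob` option is set. A sketch in a scratch directory (it assumes `mktemp -d` is available; the directory names are made up):

```bash
#!/usr/bin/env bash
tmp=$(mktemp -d) || exit
mkdir "$tmp/one" "$tmp/two dirs"

dirs=( "$tmp"/*/ )
dir_count=${#dirs[@]}      # 2; "two dirs/" is a single element

none=( "$tmp"/*.nomatch )
literal_count=${#none[@]}  # 1: the unmatched pattern is kept as literal text

shopt -s nullglob          # make unmatched patterns expand to nothing
none=( "$tmp"/*.nomatch )
null_count=${#none[@]}     # 0
shopt -u nullglob

echo "$dir_count $literal_count $null_count"
rm -rf "$tmp"
```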


Commands

Commands are key, but the manual covers them in more depth than I can. Read the SHELL GRAMMAR section. The different kinds of commands are:



  1. Simple Commands (e.g. $ startx)
  2. Pipelines (e.g. $ yes | make config) (lol)
  3. Lists (e.g. $ grep -qF foo file && sed 's/foo/bar/' file > newfile)
  4. Compound Commands (e.g. $ ( cd -P /var/www/webroot && echo "webroot is $PWD" ))
  5. Coprocesses (Complex, no example)
  6. Functions (A named compound command that can be treated as a simple command)
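Most of these kinds can be exercised in a few lines. The sketch below is only meant to show how they compose; `shout` is a made-up function, and coprocesses are left out.

```bash
#!/usr/bin/env bash
# Function: a named compound command, called like a simple command.
shout() { tr '[:lower:]' '[:upper:]'; }

# Pipeline: stdout of each command feeds stdin of the next.
piped=$(echo hello | shout)                 # HELLO

# A pipeline's exit status is that of its last command; PIPESTATUS keeps them all.
false | true
statuses="${PIPESTATUS[*]}"                 # 1 0

# List: && runs the right side only on success, || only on failure.
listed=$(false && echo yes || echo no)      # no

# Compound command: ( ... ) groups a list and runs it in a subshell.
grouped=$( ( cd / && echo "in $PWD" ) )     # in /
echo "$piped | $statuses | $listed | $grouped"
```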


Execution Model

The execution model of course involves both a heap and a stack. This is endemic to all UNIX programs. Bash also has a call stack for shell functions, visible via nested use of the caller builtin.
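The function call stack can be inspected from inside a script. A minimal sketch using the `FUNCNAME` array and the `caller` builtin; the function names are invented, and the exact text printed by `caller` depends on how the script is run.

```bash
#!/usr/bin/env bash
inner() {
  # FUNCNAME[0] is the current function, FUNCNAME[1] its caller, and so on.
  trace="${FUNCNAME[0]} <- ${FUNCNAME[1]}"
  # caller N prints the line number, function name and source file of frame N.
  caller 0 || true
}
outer() { inner; }

outer
echo "$trace"     # inner <- outer
```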

References:



  1. The SHELL GRAMMAR section of the bash manual
  2. The XCU Shell Command Language documentation
  3. The Bash Guide on Greycat's wiki.
  4. Advanced Programming in the UNIX Environment


Please make comments if you want me to expand further in a specific direction.
