In a bash script, what would $'\0' evaluate to and why?

Problem description

In various bash scripts I have come across the following: $'\0'

An example with some context:

while read -r -d $'\0' line; do
    echo "${line}"
done <<< "${some_variable}"

What does $'\0' return as its value? Or, stated slightly differently, what does $'\0' evaluate to and why?

It is possible that this has been answered elsewhere. I did search prior to posting but the limited number of characters or meaningful words in dollar-quote-slash-zero-quote makes it very hard to get results from stackoverflow search or google. So, if there are other duplicate questions, please allow some grace and link them from this question.

Solution

To complement rici's helpful answer:

Note that this answer is about bash. ksh and zsh also support $'...' strings, but their behavior differs:
* zsh does create and preserve NUL (null bytes) with $'\0'.
* ksh, by contrast, has the same limitations as bash, and additionally interprets the first NUL in a command substitution's output as the string terminator (cuts off at the first NUL, whereas bash strips such NULs).

$'\0' is an ANSI C-quoted string that technically creates a NUL (0x0 byte), but effectively results in the empty (null) string (same as ''), because any NUL is interpreted as the (C-style) string terminator by Bash in the context of arguments and here-docs/here-strings.
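The claim that $'\0' collapses to the empty string can be checked directly. The following is a minimal sketch, assuming bash; the variable names (nul, len, same, truncated) are made up for illustration:

```bash
# Store the result of $'\0' and measure its length.
nul=$'\0'
len=${#nul}

# Compare $'\0' against the empty string.
if [[ $'\0' == '' ]]; then same=yes; else same=no; fi

# Everything from the NUL onward is dropped when the string is built.
truncated=$'a\0b'

printf 'length: %s, same as empty: %s, truncated: %s\n' "$len" "$same" "$truncated"
```

In bash this should report a length of 0, a match with the empty string, and a truncated value of just a.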

As such, it is somewhat misleading to use $'\0' because it suggests that you can create a NUL this way, when you actually cannot:

  • You cannot create NULs as part of a command argument or here-doc / here-string, and you cannot store NULs in a variable:

    • echo $'a\0b' | cat -v # -> 'a' - string terminated after 'a'
    • cat -v <<<$'a\0b' # -> 'a' - ditto
  • In the context of command substitutions, by contrast, NULs are stripped:

    • echo "$(printf 'a\0b')" | cat -v # -> 'ab' - NUL is stripped
  • However, you can pass NUL bytes via files and pipes:

    • printf 'a\0b' | cat -v # -> 'a^@b' - NUL is preserved, via stdout and pipe
    • Note that it is printf itself that generates the NUL here: the shell passes the single-quoted argument through literally, and printf then interprets the \0 escape sequence and writes the resulting byte to stdout. By contrast, if you used printf $'a\0b', bash would again interpret the NUL as the string terminator up front and pass only 'a' to printf.
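The three cases in the list above can be compared by counting bytes rather than eyeballing cat -v output. This is a rough sketch, assuming bash and a standard wc; the variable names are illustrative only. Newer Bash releases also print a warning to stderr when they drop a NUL from a command substitution, which the sketch silences.

```bash
# NUL generated by printf and sent through a pipe: all three bytes survive.
piped=$(( $(printf 'a\0b' | wc -c) ))

# NUL inside an argument: bash cuts the string at the NUL before printf sees it.
as_arg=$(( $(printf '%s' $'a\0b' | wc -c) ))

# NUL inside a command substitution: bash strips it and keeps the rest.
{ captured=$(printf 'a\0b'); } 2>/dev/null

printf 'piped bytes: %s, argument bytes: %s, captured: %s\n' "$piped" "$as_arg" "$captured"
```

The arithmetic expansion around wc -c is only there to normalize any padding that some wc implementations add to their output.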

Now consider the sample code, whose intent is to read the entire input at once, across lines (I've therefore renamed line to content):

while read -r -d $'\0' content; do  # same as: `while read -r -d '' ...`
    echo "${content}"
done <<< "${some_variable}"

This will never enter the while loop body, because stdin input is provided by a here-string, which, as explained, cannot contain NULs.
Note that read actually does look for NULs with -d $'\0', even though $'\0' is effectively ''. In other words: read by convention interprets the empty (null) string to mean NUL as -d's option-argument, because NUL itself cannot be specified for technical reasons.

In the absence of an actual NUL in the input, read's exit code indicates failure, so the loop is never entered.
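A small sketch, assuming bash, makes the failing exit code and the skipped loop body visible. The sample value of some_variable and the counters status and iterations are assumptions added for illustration:

```bash
some_variable=$'line 1\nline 2'   # hypothetical sample input

# read hits end-of-input before finding a NUL, so it reports failure.
if read -r -d '' content <<< "$some_variable"; then
    status=0
else
    status=$?
fi

# For the same reason the original loop body runs zero times.
iterations=0
while read -r -d '' content; do
    iterations=$((iterations + 1))
done <<< "$some_variable"

printf 'read exit status: %s, loop iterations: %s\n' "$status" "$iterations"
```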

However, even when the delimiter is never found, read still stores what it consumed in the variable, so to make this code work with a here-string or here-doc, it must be modified as follows:

while read -r -d $'\0' content || [[ -n $content ]]; do
    echo "${content}"
done <<< "${some_variable}"
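To see that the modified loop runs exactly once and receives the full input, here is a minimal sketch, assuming bash; the sample input and the iterations and captured variables are hypothetical additions:

```bash
some_variable=$'line 1\nline 2'   # hypothetical sample input

iterations=0
captured=''
# read fails (no NUL found) but still fills content, so the -n test keeps
# the loop alive for one pass; the next read yields an empty content.
while read -r -d '' content || [[ -n $content ]]; do
    iterations=$((iterations + 1))
    captured=$content
done <<< "$some_variable"

printf 'iterations: %s\ncaptured:\n%s\n' "$iterations" "$captured"
```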

However, as @rici notes in a comment, with a single (multi-line) input string, there is no need to use while at all:

read -r -d $'\0' content <<< "${some_variable}"

This reads the entire content of $some_variable, while trimming leading and trailing whitespace (which is what read does with $IFS at its default value, $' \t\n').
@rici also points out that if such trimming weren't desired, a simple content=$some_variable would do.
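The difference between the trimming read and a plain assignment can be sketched as follows, assuming bash. The padded sample input and the names trimmed, untrimmed and copied are illustrative; the || true guards are there because read reports failure when no NUL is found, as discussed earlier.

```bash
some_variable=$'  first line\nsecond line  '   # hypothetical padded input

# Default IFS: leading and trailing whitespace is trimmed, including the
# newline that the here-string appends.
read -r -d '' trimmed <<< "$some_variable" || true

# Empty IFS: nothing is trimmed, so the appended newline is kept as well.
IFS= read -r -d '' untrimmed <<< "$some_variable" || true

# A plain assignment copies the value unchanged.
copied=$some_variable

printf '[%s]\n' "$trimmed" "$untrimmed" "$copied"
```

Note that the IFS= variant is not quite a verbatim copy either: it picks up the trailing newline added by the here-string, which is one more reason the simple assignment is preferable when no trimming is wanted.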

Contrast this with input that actually contains NULs, in which case while is needed to process each NUL-separated token, but without the || [[ -n $<var> ]] clause (find -print0 outputs filenames, each followed by a NUL):

while IFS= read -r -d $'\0' file; do
    echo "${file}"
done < <(find . -print0)

Note the use of IFS= read ... to suppress trimming of leading and trailing whitespace, which is undesired in this case, because input filenames must be preserved as-is.
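A self-contained way to try the NUL-delimited loop is to build a throwaway directory with awkward filenames and collect them into an array. This is a sketch under stated assumptions: bash, a find that supports -print0, and a filesystem that permits spaces and newlines in names; the directory layout and the files array are made up for illustration.

```bash
tmpdir=$(mktemp -d)

# Names with padding and an embedded newline exercise both IFS= and -d ''.
touch "$tmpdir/plain" "$tmpdir/  padded name " "$tmpdir/"$'two\nlines'

files=()
while IFS= read -r -d '' file; do
    files+=("$file")
done < <(find "$tmpdir" -type f -print0)

count=${#files[@]}
printf 'found %s files\n' "$count"

rm -rf "$tmpdir"
```

Collecting into an array instead of echoing keeps each filename intact as a single element, which is usually what a script needs before acting on the files.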
