Running sed on string, benefit of using "echo" + "pipe" over "<<<"
Question
Commonly I see people manipulating strings using sed as follows:
echo "./asdf" | sed -n -e "s%./%%p"
I recently learned I can also do:
sed -n -e "s%./%%p" <<< "./asdf"
Is there a reason to avoid the latter? For instance, is it bash-specific behaviour?
Answer
How should I trim ./ from the beginning of a path (or perform other simple string manipulations)?
Bash's built-in syntax for this is called parameter expansion. ${s#./} will expand $s with any leading ./ trimmed inside the shell itself, with no subprocess or other overhead. BashFAQ #100 covers many additional string manipulation operations.
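As a minimal sketch of the expansion described above (the variable names and sample values are purely illustrative, not from the question):

```shell
# Illustrative values only; "path" and "other" are hypothetical names.
path='./asdf'

# ${path#./} removes the shortest match of the pattern "./" from the start.
trimmed=${path#./}
printf '%s\n' "$trimmed"     # prints: asdf

# A value that does not start with "./" is left unchanged.
other='asdf/./x'
printf '%s\n' "${other#./}"  # prints: asdf/./x
```

Note that this is not a drop-in match for the sed command in the question: there, the unescaped . matches any character, and -n with the p flag prints the line only when a substitution happened.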
Portability
As you've noted, <<< is not available in POSIX sh; this is a ksh extension also available in bash and zsh.
That said, if you need portability, the multiline equivalent is not far away:
... <<EOF
$s
EOF
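Filled in with the sed command from the question, the here-document form might look like the sketch below. The variable name result and the sample value are assumptions made for illustration.

```shell
# Portable stand-in for: sed -n -e 's%./%%p' <<<"$s"
s='./asdf'

# The here-document feeds "$s" plus a trailing newline to sed's stdin.
result=$(sed -n -e 's%./%%p' <<EOF
$s
EOF
)
printf '%s\n' "$result"   # prints: asdf
```

Because the delimiter is unquoted, $s is expanded inside the here-document, but the expanded value itself is not expanded again.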
Disk usage
As currently implemented by bash (and as an implementation detail subject to change), <<< creates a temporary file, populates it, and redirects from it. If your TMPDIR is not on an in-memory filesystem, this may be slower, or may generate disk churn.
Process overhead
A pipeline, as in echo foo | ..., creates a subshell -- it forks off a completely new process, responsible for running echo and then exiting. When you're running result=$(echo "$s" | ...), that pipeline is itself in a subshell of your parent shell, and that shell has its output read by the parent.
Modern unixlikes go to significant effort to make fork()ing off a subprocess as low-overhead as possible, but even then it can add up when done in a loop -- and on platforms such as Cygwin it can be even more significant.
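To make the loop case concrete, here is a rough sketch contrasting the two approaches. The sample paths and the variable names via_sed and via_expansion are hypothetical. Both loops produce the same result, but only the first one starts child processes on every iteration.

```shell
# Pipeline plus command substitution: several forks per item.
via_sed=
for p in ./a ./b ./c; do
  via_sed="${via_sed:+$via_sed }$(printf '%s\n' "$p" | sed -e 's%^\./%%')"
done

# Parameter expansion: handled inside the shell, no child processes.
via_expansion=
for p in ./a ./b ./c; do
  via_expansion="${via_expansion:+$via_expansion }${p#./}"
done

printf '%s\n' "$via_sed"        # prints: a b c
printf '%s\n' "$via_expansion"  # prints: a b c
```

Wrapping each loop in the shell's time keyword, with a much larger input list, is one way to see how much the difference matters on a given platform.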
echo bugs
Last but not least -- <<<"$s" will represent any contents of the variable s precisely, with the exception that it can add a trailing newline. By contrast, echo has a great deal of leeway in its specified behavior: it can honor backslash expansions or not, depending on compliance with the optional XSI extensions to the standard (and on the presence or lack of the widespread but entirely noncompliant -e extension, and/or runtime flags that disable it); the ability to avoid adding a trailing newline with -n is not guaranteed by the standard; &c. Even if you're using a pipeline, it's better to use printf:
# emit *exactly* the contents of "$s", with no newline added
printf '%s' "$s" | ...
# emit the contents of "$s", with an added trailing newline
printf '%s\n' "$s" | ...
# emit the contents of "$s", with '\t', '\n', '\b' &c replaced, and no added newline
printf '%b' "$s" | ...
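To illustrate why this matters, the sketch below uses two awkward values, chosen here as assumptions for demonstration: one that looks like an echo option, and one containing a backslash sequence. What echo prints for either depends on the shell and its settings, whereas the printf results are predictable.

```shell
# A value that looks like an option: many echo implementations would
# treat it as the -n flag and print nothing at all.
s='-n'
from_printf=$(printf '%s\n' "$s")
printf '%s\n' "$from_printf"    # prints: -n

# A value with a backslash sequence: echo may or may not expand it.
t='a\tb'
literal=$(printf '%s' "$t")     # backslash sequence kept as-is
expanded=$(printf '%b' "$t")    # \t replaced with a tab character
printf '%s\n' "$literal"        # prints: a\tb
```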