Perl-like shorthand character class not working inside bracket expression
Problem description
\s does not seem to work with
sed 's/[\s]\+//' tempfile
while the same command works with a literal space:
sed 's/[ ]\+//' tempfile
I am trying to remove the white space that appears at the beginning of each line as a result of this command:
nl -s ') ' file > tempfile
For example, file:
A Storm of Swords, George R. R. Martin, 1216
The Two Towers, J. R. R. Tolkien, 352
The Alchemist, Paulo Coelho, 197
The Fellowship of the Ring, J. R. R. Tolkien, 432
The Pilgrimage, Paulo Coelho, 288
A Game of Thrones, George R. R. Martin, 864
tempfile:
     1) Storm of Sword, George R. R. Martin, 1216
     2) The Two Tower, J. R. R. Tolkien, 352
     3) The Alchemit, Paulo Coelho, 197
     4) The Fellowhip of the Ring, J. R. R. Tolkien, 432
     5) The Pilgrimage, Paulo Coelho, 288
     6) A Game of Throne, George R. R. Martin, 864
i.e. there are spaces before the numbers.
Please explain why the white space appears and why \s does not work.
Recommended answer
The reason is simple: a POSIX regex engine does not parse Perl-like shorthand character classes as such inside bracket expressions.
See this reference:
One key syntactic difference is that the backslash is NOT a metacharacter in a POSIX bracket expression. So in POSIX, the regular expression [\d] matches a \ or a d.
So, [\s] in a POSIX regex matches one of two symbols: either \ or s.
Consider the following demo:
echo 'ab\sc' | sed 's/[\s]\+//'
The output is abc. The \s substring is removed.
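To make the matched characters visible, here is a minimal sketch that replaces every match with X instead of deleting it. The sample string and the variable names are illustrative assumptions, not part of the original answer.

```shell
# Inside the brackets, \ and s are two literal characters, so both are replaced.
bracket_out=$(printf '%s\n' 'a\b s c' | sed 's/[\s]/X/g')
printf '%s\n' "$bracket_out"   # prints: aXb X c

# For contrast, the POSIX class matches the two spaces and leaves \ and s alone.
class_out=$(printf '%s\n' 'a\b s c' | sed 's/[[:space:]]/X/g')
printf '%s\n' "$class_out"     # prints: a\bXsXc
```

The spaces survive the first command untouched, which is why the leading white space in tempfile is never removed.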
Consider using POSIX character classes instead of Perl-like shorthands:
echo 'ab\s c' | sed 's/[[:space:]]\+//'
See this online demo (the output is ab\sc). POSIX character classes have the form [:<NAME_OF_CLASS>:], and they can only be used inside bracket expressions. See more examples of POSIX character classes here.
NOTE: if you want to make sure that only the spaces at the start of the line are removed, add ^ at the start of the pattern:
sed 's/^[[:space:]]\+//'
       ^
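As for why the spaces appear in the first place: nl right-justifies each line number in a field that is 6 characters wide by default, so a one-digit number gets 5 spaces of padding. The sketch below shows both stripping the padding afterwards and avoiding it with nl's -w option. The scratch file and variable names are hypothetical, and the pattern uses * rather than the answer's \+ only because \+ is a GNU extension in basic regular expressions.

```shell
# Hypothetical scratch file holding two of the sample lines.
tmp_file=$(mktemp)
printf '%s\n' 'A Storm of Swords, George R. R. Martin, 1216' \
              'The Two Towers, J. R. R. Tolkien, 352' > "$tmp_file"

# nl pads the number to its default width; the anchored substitution strips the padding.
numbered=$(nl -s ') ' "$tmp_file" | sed 's/^[[:space:]]*//')
printf '%s\n' "$numbered"

# Alternative: request a number width of 1 so that nl adds no padding at all.
unpadded=$(nl -w 1 -s ') ' "$tmp_file")
printf '%s\n' "$unpadded"

rm -f "$tmp_file"
```

Both commands should print the same two lines, each starting directly with its number. Note that -w 1 only avoids padding up to the widest number in the file, so for files with 10 or more lines the sed approach may be the simpler choice.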
More patterns:
\w = [[:alnum:]_]
\W = [^[:alnum:]_]
\d = [[:digit:]] (or [0-9])
\D = [^[:digit:]] (or [^0-9])
\h = [[:blank:]]
\S = [^[:space:]]
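A few of these equivalents can be tried out with the short sketch below. The input strings and variable names are illustrative assumptions, and the results assume plain ASCII input.

```shell
# \W equivalent: delete everything that is not a letter, digit, or underscore.
word_only=$(printf '%s\n' 'foo-bar_42!' | sed 's/[^[:alnum:]_]//g')
printf '%s\n' "$word_only"   # prints: foobar_42

# \d equivalent: delete every digit.
no_digits=$(printf '%s\n' 'a1b22c' | sed 's/[[:digit:]]//g')
printf '%s\n' "$no_digits"   # prints: abc

# \S equivalent: replace every non-whitespace character with X.
masked=$(printf '%s\n' 'ab c' | sed 's/[^[:space:]]/X/g')
printf '%s\n' "$masked"      # prints: XX X
```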