Perl-like shorthand character class not working inside bracket expression

Question

It seems that \s does not work in

sed 's/[\s]\+//' tempfile

while a literal space does work:

sed 's/[ ]\+//' tempfile

I am trying to remove the white space that appears at the beginning of each line as a result of this command:

nl -s ') ' file > tempfile  

For example, file:

A Storm of Swords, George R. R. Martin, 1216
The Two Towers, J. R. R. Tolkien, 352
The Alchemist, Paulo Coelho, 197
The Fellowship of the Ring, J. R. R. Tolkien, 432
The Pilgrimage, Paulo Coelho, 288
A Game of Thrones, George R. R. Martin, 864

tempfile:

 1) Storm of Sword, George R. R. Martin, 1216
 2) The Two Tower, J. R. R. Tolkien, 352
 3) The Alchemit, Paulo Coelho, 197
 4) The Fellowhip of the Ring, J. R. R. Tolkien, 432
 5) The Pilgrimage, Paulo Coelho, 288
 6) A Game of Throne, George R. R. Martin, 864

That is, there are spaces before the numbers.

Please explain why the white space appears and why \s does not work.

Answer

The reason is simple: a POSIX regex engine does not treat Perl-like shorthand character classes as such inside bracket expressions.

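See this reference:

"One key syntactic difference is that the backslash is NOT a metacharacter in a POSIX bracket expression. So in POSIX, the regular expression [\d] matches a \ or a d."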

So, [\s] in a POSIX regex matches one of two characters: either \ or s.

Consider the following demo:

echo 'ab\sc' | sed 's/[\s]\+//'

The output is abc: the \s substring is removed.
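The same effect can be sketched on a line shaped like the question's tempfile. This is only an illustration, assuming a POSIX-style sed; the variable names and the hand-typed leading spaces are assumptions made for the example. Because [\s] can match a literal s, the first lowercase s on the line is deleted while the leading spaces stay, which is consistent with the missing letters in the tempfile listing above.

```shell
# A line shaped like nl output (leading spaces typed by hand for illustration).
line='     1) A Storm of Swords, George R. R. Martin, 1216'

# [\s] is a bracket expression holding two literal characters: '\' and 's'.
# \{1,\} is the portable BRE spelling of "one or more"; \+ is a GNU extension.
result=$(printf '%s\n' "$line" | sed 's/[\s]\{1,\}//')

# The leading spaces survive; the final 's' of "Swords" is gone.
printf '%s\n' "$result"
```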

Consider using POSIX character classes instead of Perl-like shorthands:

echo 'ab\s c' | sed 's/[[:space:]]\+//'

The output is ab\sc. POSIX character classes are written as [:<NAME_OF_CLASS>:], and they can only be used inside bracket expressions.
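As a rough side-by-side sketch (variable names are illustrative, and a POSIX-style sed is assumed), the two bracket expressions can be run against the same input to see which characters each one actually consumes:

```shell
# Input contains a literal backslash, an 's', and a space.
input='ab\s c'

# [\s] matches the literal characters '\' and 's', so "\s" is removed.
with_shorthand=$(printf '%s\n' "$input" | sed 's/[\s]\{1,\}//')

# [[:space:]] matches whitespace, so only the space is removed.
with_posix=$(printf '%s\n' "$input" | sed 's/[[:space:]]\{1,\}//')

printf '%s\n' "$with_shorthand"   # ab c
printf '%s\n' "$with_posix"       # ab\sc
```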

NOTE: if you want to make sure that only the spaces at the start of the line are removed, add ^ at the start of the pattern:

sed 's/^[[:space:]]\+//'
       ^ 
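On the other half of the question, why the spaces appear at all: by default nl right-aligns the line number in a 6-column field, so single-digit numbers are padded with spaces. The sketch below is an illustration only; it feeds two lines from the question's file to nl on standard input rather than from a file, and the variable names are made up for the example. It shows both the anchored cleanup and the alternative of asking nl for a narrower field with -w.

```shell
# Default nl output: numbers right-aligned in a 6-column field, hence the padding.
padded=$(printf '%s\n' 'A Storm of Swords, George R. R. Martin, 1216' \
                       'The Two Towers, J. R. R. Tolkien, 352' | nl -s ') ')

# Option 1: strip leading whitespace afterwards with an anchored POSIX class.
stripped=$(printf '%s\n' "$padded" | sed 's/^[[:space:]]*//')

# Option 2: request a 1-column number field so no padding is produced.
narrow=$(printf '%s\n' 'A Storm of Swords, George R. R. Martin, 1216' \
                       'The Two Towers, J. R. R. Tolkien, 352' | nl -w 1 -s ') ')

printf '%s\n' "$stripped"
```

For files with ten or more lines, -w 1 still prints the full number; it only stops nl from padding shorter numbers.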

More patterns:

  • \w = [[:alnum:]_]
  • \W = [^[:alnum:]_]
  • \d = [[:digit:]] (or [0-9])
  • \D = [^[:digit:]] (or [^0-9])
  • \h = [[:blank:]]
  • \S = [^[:space:]]
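Two of these equivalents can be tried out with a short sketch. The sample string and variable names are invented for illustration, and ASCII input with a POSIX-style sed is assumed.

```shell
sample='id_42 = 3 apples'

# Equivalent of \d+ : replace every run of digits with '#'.
digits_masked=$(printf '%s\n' "$sample" | sed 's/[[:digit:]]\{1,\}/#/g')

# Equivalent of \W : delete every character that is not a letter, digit, or '_'.
nonword_removed=$(printf '%s\n' "$sample" | sed 's/[^[:alnum:]_]//g')

printf '%s\n' "$digits_masked"     # id_# = # apples
printf '%s\n' "$nonword_removed"   # id_423apples
```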
