Is there an easy way to pass a "raw" string to grep?
Question
grep can't be fed "raw" regular expressions on the command line: by default, operators such as ( ) and | are treated as literal characters unless they are backslash-escaped. For example:
$ grep '(hello|bye)' # WON'T MATCH 'hello'
$ grep '\(hello\|bye\)' # GOOD, BUT QUICKLY BECOMES UNREADABLE
I was using printf to auto-escape strings:
$ printf '%q' '(some|group)\n'
\(some\|group\)\\n
This produces a bash-escaped version of the string, which can then be passed to a grep call using backticks:
$ grep `printf '%q' '(a|b|c)'`
However, printf '%q' is clearly not meant for this: some characters that grep needs escaped are left alone, and others are escaped unnecessarily. For example:
$ printf '%q' '(^#)'
\(\^#\)
The ^ character should not be escaped when passed to grep.
Is there a CLI tool that takes a raw string and returns an escaped version that can be used directly as a grep pattern? If not, how can I achieve this in pure bash?
Recommended answer
If you are trying to get grep to use Extended Regular Expression syntax, the way to do that is grep -E (aka egrep). You should also know about grep -F (aka fgrep), which treats the pattern as a fixed string, and, in newer versions of GNU grep, grep -P.
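As a minimal sketch of the difference, the snippet below runs the question's pattern through grep -E and grep -F. The sample input and the variable names are made up for illustration.

```shell
# Illustrative input: three words plus one line containing the pattern text itself.
sample=$(printf '%s\n' 'hello' 'bye' 'other' '(hello|bye)')

# -E: ( ) and | act as operators, so every line containing hello or bye matches.
ere_matches=$(printf '%s\n' "$sample" | grep -E '(hello|bye)')

# -F: the pattern is a fixed string, so only the line containing
# the literal text (hello|bye) matches.
fixed_matches=$(printf '%s\n' "$sample" | grep -F '(hello|bye)')

printf 'grep -E matched:\n%s\n' "$ere_matches"
printf 'grep -F matched:\n%s\n' "$fixed_matches"
```

If the goal is to search for a string exactly as typed, grep -F avoids escaping altogether.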
Background: The original grep had a fairly small set of regex operators; it was Ken Thompson's original regular expression implementation. A new version with an extended repertoire was developed later and, for compatibility reasons, got a different name. With GNU grep, there is only one binary, which understands the traditional, basic RE (BRE) syntax if invoked as grep, and ERE if invoked as egrep. Some constructs from egrep are available in grep by using a backslash escape to introduce special meaning.
Subsequently, the Perl programming language extended the formalism even further; this regex dialect seems to be what most newcomers erroneously expect grep to support as well. With grep -P, it does, but -P is not available on every platform.
So, in grep, the following characters have a special meaning: ^$[]*.\
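Given that list, one way to turn a literal string into a basic-RE pattern is to put a backslash in front of each of those characters. The sketch below does this with sed; bre_escape is a hypothetical helper name, not a standard tool, and it assumes a single-line string. When the whole pattern is literal, grep -F is simpler.

```shell
# Hypothetical helper: backslash-escape the BRE specials ^ $ [ ] * . \
# so that grep treats the result as a literal string.
bre_escape() {
    # Inside the bracket expression, ] comes first so that it is taken literally.
    printf '%s\n' "$1" | sed 's/[][\.*^$]/\\&/g'
}

needle='a.b*c'
pattern=$(bre_escape "$needle")
printf 'escaped pattern: %s\n' "$pattern"

# Only the line that contains the needle verbatim should be printed.
printf '%s\n' 'a.b*c' 'aXbbc' | grep -- "$pattern"
```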
In egrep, the following characters also have a special meaning: ()|+?{}. (The braces for repetition were not in the original egrep.) The grouping parentheses also enable backreferences with \1, \2, etc.
In many versions of grep, you can get the egrep behavior by putting a backslash before the egrep specials. There are also special sequences like \< and \>.
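A small sketch of that equivalence follows, with made-up sample input. The \( \) and \{ \} forms are part of POSIX basic REs; \| is assumed here to be a GNU grep extension and may not work elsewhere.

```shell
sample=$(printf '%s\n' 'ab' 'abb' 'abab')

# Basic RE: grouping and repetition need backslashes.
bre_out=$(printf '%s\n' "$sample" | grep '\(ab\)\{2\}')

# Extended RE: the same operators, written without backslashes.
ere_out=$(printf '%s\n' "$sample" | grep -E '(ab){2}')

printf 'BRE matched: %s\n' "$bre_out"
printf 'ERE matched: %s\n' "$ere_out"

# Alternation in a basic RE (GNU extension; other greps may print nothing here).
printf '%s\n' 'hello' 'bye' 'other' | grep 'hello\|bye'
```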
In Perl, a huge number of additional escapes like \w, \s and \d were introduced. In Perl 5, the regex facility was substantially extended, with non-greedy matching (*?, +?, etc.), non-grouping parentheses (?:...), lookaheads, lookbehinds, etc.
... Having said that, if you really do want to convert egrep regular expressions to grep regular expressions without invoking any external process, try ${regex/pattern/substitution} for each of the egrep special characters; but recognize that this does not handle character classes, negated character classes, or backslash escapes correctly.
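A rough sketch of that substitution approach in pure bash is shown below. The function name ere_to_bre is hypothetical, and the code uses the ${var//pattern/replacement} form so that every occurrence is replaced. It shares the limitations just mentioned, and it also assumes a grep that accepts \|, \+ and \? in basic REs (GNU grep does; others may not).

```bash
#!/usr/bin/env bash

# Hypothetical helper: put a backslash before each egrep special so the
# result can be used as a basic RE. Specials inside bracket expressions and
# already-escaped characters are NOT handled correctly.
ere_to_bre() {
    local re=$1 ch rep
    for ch in '(' ')' '|' '+' '?' '{' '}'; do
        rep="\\$ch"                  # e.g. \( for (
        re=${re//"$ch"/"$rep"}       # quoted pattern: match the character literally
    done
    printf '%s\n' "$re"
}

pattern=$(ere_to_bre '(hello|bye)')
printf 'converted pattern: %s\n' "$pattern"
printf '%s\n' 'hello' 'bye' 'other' | grep -- "$pattern"
```

In practice, grep -E '(hello|bye)' gives the same result without any conversion.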