Grep's "Invalid range end": bug or feature?
Question
I have these three files:
$ cat pattern-ok
['\-]
$ cat pattern-buggy
[\-']
$ cat text
abc'def-ghi
Now, is the following a bug, or a regexp feature I don't know about?
$ cat text | grep -f pattern-ok
abc'def-ghi
$ cat text | grep -f pattern-buggy
grep: Invalid range end
I am using:
$ grep --version | head -n 1
grep (GNU grep) 2.20
Recommended answer
This happens because you are placing the hyphen between other characters, so grep reads it as a range, and that range happens to be invalid.
You are basically doing:
grep "[\-']" file
grep interprets this as a range of characters to match, as in grep "[a-z]" file. But the range from \ to ' is invalid, because \ sorts after ' in ASCII, hence the error.
You may be asking yourself why the other one works. It works because what you are doing is:
grep "['\-]" file
In this case you are looking for any of the characters ', \ or - in the file. The hyphen comes last, so it is taken literally.
Here is another example, where I want to find the characters a, - or 3 in a given string:
$ echo "23-2" | grep -o '[a-3]'
grep: Invalid range end
$ echo "23-2" | grep -o '[a3-]'
3
-
So the underlying problem is that you are using an expression of the form some character + - + another character within a [] block, and grep tries to read it as the range of characters between some character and another character.
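To see why a range such as a-3 is rejected, it can help to compare the character codes of its two ends. The following is only a sketch: it assumes the C locale (where ranges follow byte values) and a POSIX shell whose printf accepts the leading-quote form for character codes. The variable names are illustrative.

```shell
# In printf, a numeric argument that begins with a quote
# yields the code of the character that follows the quote.
start=$(printf '%d' "'a")   # code of the range start
end=$(printf '%d' "'3")     # code of the range end
echo "start=$start end=$end"

# A range is only usable when its start does not sort after its end.
if [ "$start" -gt "$end" ]; then
    verdict="invalid"
else
    verdict="valid"
fi
echo "[a-3]: $verdict range"
```

Because a has a higher code than 3, the range runs backwards, which matches the error shown in the example above.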
If you want to match the character - among others, just place it at the edge of the expression: as the first or last item.
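As a quick check of this advice, the sketch below runs the same input through one bracket expression with the hyphen first and one with the hyphen last. It assumes a grep that supports the -o option, as the one used above does. The variable names are illustrative.

```shell
# Hyphen as the first item: taken literally.
first=$(echo "23-2" | grep -o '[-a3]')
# Hyphen as the last item: also taken literally.
last=$(echo "23-2" | grep -o '[a3-]')

printf 'hyphen first:\n%s\n' "$first"
printf 'hyphen last:\n%s\n' "$last"
```

Both forms should print 3 and - on separate lines, since neither contains a range.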
From man grep:
Character Classes and Bracket Expressions
A bracket expression is a list of characters enclosed by [ and ]. It matches any single character in that list; if the first character of the list is the caret ^ then it matches any character not in the list. For example, the regular expression [0123456789] matches any single digit.
Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value C.
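As an aside to the quoted paragraph, the locale point can be tried out directly. This is a minimal sketch that forces the C locale with LC_ALL=C, as the man page suggests; the sample input and variable names are made up for illustration.

```shell
# Four sample lines: a, B, c, e.
# With LC_ALL=C the range a-d follows byte order, so B and e fall outside it.
range=$(printf 'a\nB\nc\ne\n' | LC_ALL=C grep '[a-d]')
list=$(printf 'a\nB\nc\ne\n' | LC_ALL=C grep '[abcd]')

printf 'range [a-d]:\n%s\n' "$range"
printf 'list [abcd]:\n%s\n' "$list"
```

In the C locale both commands should select the same lines, a and c. In other locales the range form may select more, which is the behaviour the man page warns about.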
Finally, certain named classes of characters are predefined within bracket expressions, as follows. Their names are self explanatory, and they are [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:]. For example, [[:alnum:]] means the character class of numbers and letters in the current locale. In the C locale and ASCII character set encoding, this is the same as [0-9A-Za-z]. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket expression.) Most meta-characters lose their special meaning inside bracket expressions. To include a literal ] place it first in the list. Similarly, to include a literal ^ place it anywhere but first. Finally, to include a literal - place it last.
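The placement rules for ], ^ and - can be combined in a single bracket expression. The sketch below is one way to try that, alongside a named class; the sample string and variable names are made up for illustration, and the C locale is assumed.

```shell
input='x]y^z-7'

# ] placed first, ^ placed anywhere but first, - placed last:
# all three are matched as literal characters.
literals=$(printf '%s\n' "$input" | LC_ALL=C grep -o '[]^-]')

# Named class: the outer brackets delimit the expression,
# the inner [:alnum:] is the class name itself.
alnum=$(printf '%s\n' "$input" | LC_ALL=C grep -o '[[:alnum:]]')

printf 'literals:\n%s\n' "$literals"
printf 'alnum:\n%s\n' "$alnum"
```

The first command should print ], ^ and - on separate lines, and the second x, y, z and 7.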