Grep's "Invalid range end": bug or feature?

Question

I have these three files:

$ cat pattern-ok
['\-]
$ cat pattern-buggy
[\-']
$ cat text
abc'def-ghi

And now, is the following a bug or a regexp feature I don't know about?

$ cat text | grep -f pattern-ok 
abc'def-ghi
$ cat text | grep -f pattern-buggy
grep: Invalid range end

I am using:

$ grep --version | head -n 1
grep (GNU grep) 2.20

Recommended answer

This happens because the hyphen sits between two other characters inside the bracket expression, so grep reads it as a range, and that range happens to be invalid.

You are basically doing:

grep "[\-']" file

grep interprets this as a range of characters to match, as in grep "[a-z]" file. Inside a bracket expression the backslash is an ordinary character rather than an escape, so \-' is read as the range from \ to '. That range is invalid because \ sorts after ', hence the error.
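A quick way to see why the range is rejected is to compare the code points of its two endpoints. The sketch below assumes a POSIX shell with an ASCII-compatible character set, where a `printf '%d'` argument that starts with a quote yields the numeric value of the character after it. The variable names are illustrative only.

```shell
# Numeric value of each range endpoint.
start=$(printf '%d' "'\\")   # backslash, the range start in [\-']
end=$(printf '%d' "''")      # single quote, the range end in [\-']

echo "start=$start end=$end"

# A range only makes sense when its start does not sort after its end.
if [ "$start" -gt "$end" ]; then
    echo "invalid: range start sorts after range end"
fi
```

In ASCII this prints `start=92 end=39`, so the start of the range lies above its end.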

Why does the other one work, you may ask? Because what you are doing there is:

grep "['\-]" file

In this case the hyphen comes last, so it is taken literally and you are looking for any of the characters ', \ or - in the file.

Here is another example, where I want to find the characters a, - or 3 in a given string:

$ echo "23-2" | grep -o '[a-3]'
grep: Invalid range end
$ echo "23-2" | grep -o '[a3-]'
3
-
$ echo "23-2" | grep -o '[-a3]'
3
-

So the underlying problem is that an expression of the form some character + - + another character inside a [] block is read as the range of characters between some character and another character.

If you want to match the character - among others, put it at an edge of the expression: as the first or last item.
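The three placements can be compared side by side. This is a minimal sketch that assumes a grep supporting the `-o` option (GNU grep does) and forces the C locale so that range handling is predictable; the variable names are made up for the illustration.

```shell
text="abc'def-ghi"

# Hyphen first: matched literally.
hyphen_first=$(printf '%s\n' "$text" | LC_ALL=C grep -o "[-']" | tr -d '\n')

# Hyphen last: also matched literally.
hyphen_last=$(printf '%s\n' "$text" | LC_ALL=C grep -o "['-]" | tr -d '\n')

# Hyphen between two characters: parsed as a range, which is invalid here.
# An exit status greater than 1 means grep reported an error.
status=0
printf '%s\n' "$text" | LC_ALL=C grep -o "[\\-']" >/dev/null 2>&1 || status=$?

echo "first=$hyphen_first last=$hyphen_last status=$status"
```

Both edge placements find the quote and the hyphen, while the middle placement makes grep exit with an error status instead of printing matches.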

From man grep:

Character Classes and Bracket Expressions

A bracket expression is a list of characters enclosed by [ and ]. It matches any single character in that list; if the first character of the list is the caret ^ then it matches any character not in the list. For example, the regular expression [0123456789] matches any single digit.

Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value C.

Finally, certain named classes of characters are predefined within bracket expressions, as follows. Their names are self explanatory, and they are [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:]. For example, [[:alnum:]] means the character class of numbers and letters in the current locale. In the C locale and ASCII character set encoding, this is the same as [0-9A-Za-z]. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket expression.) Most meta-characters lose their special meaning inside bracket expressions. To include a literal ] place it first in the list. Similarly, to include a literal ^ place it anywhere but first. Finally, to include a literal - place it last.
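The rules quoted above can be tried out directly. The following sketch assumes a grep that supports `-o` and sets LC_ALL=C as the man page suggests; the sample strings and variable names are invented for the illustration.

```shell
# In the C locale, [a-d] covers only the lowercase letters a, b, c and d.
range_hits=$(printf '%s\n' a B c D e | LC_ALL=C grep '[a-d]' | tr -d '\n')

# Literal ] placed first, literal ^ placed anywhere but first, literal - placed last.
literal_hits=$(printf '%s\n' 'a]b^c-d' | LC_ALL=C grep -o '[]^-]' | tr -d '\n')

# A named class goes inside an outer pair of brackets.
digit_hits=$(printf '%s\n' 'x1y22z' | LC_ALL=C grep -o '[[:digit:]]' | tr -d '\n')

echo "range=$range_hits literal=$literal_hits digits=$digit_hits"
```

This prints `range=ac literal=]^- digits=122`: the uppercase lines are skipped in the C locale, and all three special characters are matched literally because of where they sit in the list.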
