Grep ambiguity with nested square brackets


Question

sample.txt contains

abcde
abde

Can anybody explain the output of the following commands?

  1. grep '[[ab]]' sample.txt - no output
  2. grep '[ab[]]' sample.txt - no output
  3. grep '[ab[]' sample.txt - output is abcde, abde
  4. grep '[ab]]' sample.txt - no output

And what do [(ab)] and [^(ab)] mean? Are they the same as [ab] and [^ab]?

Solution

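The first thing to understand is that inside a character class, the regex meta-characters lose their special meaning and are matched literally. For example, a * matches a literal * and does not mean zero or more repetitions. Similarly, ( and ) match a literal ( and ), and do not create a capture group. The exceptions are ] (which closes the class), ^ as the first character (which negates the class), and - between two characters (which forms a range).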
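To make the "matched literally" point concrete, here is a minimal sketch. The three input lines are hypothetical (they are not from the question), and the input is piped in with printf rather than read from a file.

```shell
# Hypothetical input lines, chosen only for illustration.
input=$(printf '%s\n' 'a*b' 'f(x)' 'plain')

# Inside [...], * is an ordinary character: only the line with a literal * is selected.
star_lines=$(printf '%s\n' "$input" | grep '[*]')

# Likewise ( and ) are ordinary characters inside [...], not a group.
paren_lines=$(printf '%s\n' "$input" | grep '[()]')

echo "$star_lines"    # a*b
echo "$paren_lines"   # f(x)
```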

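Now, when a ] appears in a character class anywhere other than as its first character, it closes the character class, and the characters after it are not part of that class. A [ inside the class does not open a nested class; here it is just a literal [. With that in mind, let's look at what is happening above: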


In 1, 2, and 4, your character class ends at the first ]. So the final ] is not part of the character class and has to be matched separately, as a literal character. Your patterns are therefore equivalent to the following:

'[[ab]]' is the same as '([|a|b)(])'  // The class is [[ab]; the last `]` has to match.
'[ab[]]' is the same as '(a|b|[)(])'  // The class is [ab[]; again, the last `]` has to match.
'[ab]]'  is the same as '(a|b)(])'    // The class is [ab]; same, the last `]` has to match.
    ^
    ^---- Character class closes here.

Now, since neither of the two strings contains a ], no match is found.
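The behaviour of patterns 1, 2, and 4 can be checked with a short sketch. It pipes the two lines of sample.txt into grep instead of reading the file, and the extra line 'ab]cd' is a hypothetical input added only to show a case where the trailing ] is present.

```shell
# The two lines of sample.txt from the question.
sample=$(printf '%s\n' 'abcde' 'abde')

# grep -c prints the number of matching lines. grep exits with status 1
# when nothing matches, so "|| true" keeps the script going in that case.
n1=$(printf '%s\n' "$sample" | grep -c '[[ab]]' || true)
n2=$(printf '%s\n' "$sample" | grep -c '[ab[]]' || true)
n4=$(printf '%s\n' "$sample" | grep -c '[ab]]' || true)
echo "$n1 $n2 $n4"   # 0 0 0

# Hypothetical line (not in sample.txt) containing "b]": now pattern 4 matches.
hit=$(printf '%s\n' 'ab]cd' | grep -c '[ab]]' || true)
echo "$hit"          # 1
```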

In the 3rd pattern, by contrast, the only ] is the last character, so it is the one that closes the character class, and everything before it is inside the class.

'[ab[]' means: match a string that contains 'a', or 'b', or '['

This is perfectly valid and matches both strings.
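As a quick check of the 3rd pattern, the sketch below feeds the two lines of sample.txt through grep. The second input, with the lines 'x[y' and 'xyz', is hypothetical and only shows that a literal [ is enough for a match.

```shell
# Both lines of sample.txt contain an a or a b, so both are selected.
matches=$(printf '%s\n' 'abcde' 'abde' | grep '[ab[]')
echo "$matches"

# Hypothetical lines: only the one containing a literal [ is selected.
bracket_only=$(printf '%s\n' 'x[y' 'xyz' | grep '[ab[]')
echo "$bracket_only"   # x[y
```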


And what do [(ab)] and [^(ab)] mean?

[(ab)] matches any one of the characters (, a, b, or ). Remember, inside a character class, ( and ) have no special meaning, so you can't create groups inside a character class.

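[^(ab)] is the exact opposite of [(ab)]: it matches any single character that is not (, a, b, or ). With grep, a line is selected as soon as it contains at least one such character, so a line can match even if it also contains an a or a b.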
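A small sketch of how the negated class behaves with grep follows. The lines '(ab)' and 'ba' are hypothetical inputs, and the -o option (print only the matched parts) is an extension found in GNU and BSD grep rather than part of POSIX, so treat that part as an assumption about your grep.

```shell
# 'abcde' contains c, d, e (none of them in the class), so it is selected.
# '(ab)' and 'ba' consist only of excluded characters, so they are not.
kept=$(printf '%s\n' 'abcde' '(ab)' 'ba' | grep '[^(ab)]')
echo "$kept"     # abcde

# With -o (GNU/BSD extension), each matched character is printed on its own line.
others=$(printf '%s\n' 'abcde' | grep -o '[^(ab)]')
echo "$others"
```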


Are they the same as [ab] and [^ab]?

No. Those two do not include ( and ), so they differ slightly: [ab] does not match a parenthesis, while [^ab] does.
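The difference shows up on a line that contains only parentheses. The input '()' below is hypothetical, used only for this comparison.

```shell
# Hypothetical one-line input made only of parentheses.
line='()'

# grep -c prints the count of matching lines; "|| true" absorbs grep's
# exit status 1 when the count is 0.
with_parens=$(printf '%s\n' "$line" | grep -c '[(ab)]' || true)
without_parens=$(printf '%s\n' "$line" | grep -c '[ab]' || true)
neg_with_parens=$(printf '%s\n' "$line" | grep -c '[^(ab)]' || true)
neg_without_parens=$(printf '%s\n' "$line" | grep -c '[^ab]' || true)

echo "$with_parens $without_parens $neg_with_parens $neg_without_parens"   # 1 0 0 1
```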
