Non-greedy matching with grep

Question

As far as I know, non-greedy matching is not part of Basic Regular Expressions (BRE) or Extended Regular Expressions (ERE). However, the behaviour of different versions of grep (BSD and GNU) seems to suggest otherwise.

For example, say I have the following string:

string="hello_my_dear_polo"

Using GNU grep:

Here are a few attempts to extract hello from the string.

BRE attempt (fails):

$ grep -o "hel.*\?o" <<< "$string"
hello_my_dear_polo

The output is the entire string, which suggests that the non-greedy quantifier does not work in BRE. Note that I have escaped only the ?, since * does not lose its meaning in BRE and need not be escaped.

ERE attempt (fails):

$ grep -oE "hel.*?o" <<< "$string"
hello_my_dear_polo

Enabling the -E option yields the same output, suggesting that non-greedy matching is not part of ERE either. No escaping was needed here since we are using ERE.
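For context, the documented meaning of an unescaped ? in ERE is "zero or one of the preceding item", not "make the preceding quantifier lazy". The sketch below illustrates only that documented meaning, using a made-up input of three spellings; it is not taken from the original post. As far as I know, POSIX leaves the result of stacking ? directly after * undefined, which would explain why implementations disagree on "hel.*?o".

```shell
# In ERE, ? makes the preceding item optional (zero or one occurrence).
# The three input words are illustrative only.
matches=$(printf 'color\ncolour\ncolouur\n' | grep -oE 'colou?r')

# Prints the lines that matched: "color" and "colour" ("colouur" has two u's).
printf '%s\n' "$matches"
```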

PCRE attempt (succeeds):

$ grep -oP "hel.*?o" <<< "$string"
hello

Enabling PCRE with the -P option gives the desired output of hello, which suggests that the non-greedy quantifier is part of PCRE. No escaping was needed here since we are using PCRE.
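To see the greedy and lazy forms side by side, a small sketch along these lines may help. It assumes a grep that accepts -P, so it probes for that first; the variable names (have_pcre, greedy, lazy) are placeholders of mine, not part of the original post.

```shell
string="hello_my_dear_polo"

# Probe for -P support first; BSD grep and some GNU builds do not have it.
if printf 'x\n' | grep -qP 'x' 2>/dev/null; then
    have_pcre=yes
    # Greedy: .* grabs as much as possible, so the match ends at the last "o".
    greedy=$(printf '%s\n' "$string" | grep -oP 'hel.*o')
    # Lazy: .*? grabs as little as possible, so the match ends at the first "o".
    lazy=$(printf '%s\n' "$string" | grep -oP 'hel.*?o')
    printf 'greedy: %s\nlazy:   %s\n' "$greedy" "$lazy"
else
    have_pcre=no
    echo "this grep has no -P support"
fi
```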

Using BSD grep:

Here are a few attempts to extract hello from the string.

BRE attempt (fails):

$ grep -o "hel.*\?o" <<< "$string"

Using BRE, I get no output from BSD grep.

ERE attempt (succeeds):

$ grep -oE "hel.*?o" <<< "$string"
hello

After enabling the -E option, I was surprised to be able to extract the desired output. My question is about the output of this attempt.

PCRE attempt (fails):

$ grep -oP "hel.*?o" <<< "$string"
usage: grep [-abcDEFGHhIiJLlmnOoPqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
    [-e pattern] [-f file] [--binary-files=value] [--color=when]
    [--context[=num]] [--directories=action] [--label] [--line-buffered]
    [--null] [pattern] [file ...]

The -P option gave me a usage error, which was expected since BSD grep does not support PCRE.

So my question is: why does using ERE with a non-greedy quantifier yield the correct output on BSD grep, but not on GNU grep?

Is this a bug, an undocumented feature of BSD egrep, or my misunderstanding of the output?

Recommended answer

The double quantifier is simply a syntax error, and could result in either an error message or undefined behavior. It would arguably be better if you got an error message.

Perl's extensions to regex post-date POSIX by a large margin; at the time these tools were written, it was extremely unlikely that anyone would try to use this wacky syntax for anything. Non-greedy matching was only introduced in Perl 5, in the mid-1990s.
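If the goal is simply to extract hello without depending on a lazy quantifier, one common workaround (not part of the original answer) is a negated bracket expression. The sketch below assumes the terminator is a single character, here "o", and that grep supports -o, which both GNU and BSD grep do.

```shell
string="hello_my_dear_polo"

# [^o]* cannot run past the first "o", so the match ends at "hello"
# even though * itself is still greedy.
shortest=$(printf '%s\n' "$string" | grep -o 'hel[^o]*o')

printf '%s\n' "$shortest"
```

This relies only on plain BRE syntax, so it does not need -E or -P. It does not generalise to multi-character terminators, where a PCRE-capable tool is the simpler choice.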
