Greedy behaviour of grep


Question

I thought that in regular expressions, the "greediness" applies to quantifiers rather than matches as a whole. However, I observe that

grep -E --color=auto 'a+(ab)?' <(printf "aab")

returns aab rather than aa b (that is, the whole of aab is highlighted as the match, not just aa).

The same applies to sed. On the other hand, in pcregrep and other tools, it is really the quantifier that is greedy. Is this a specific behaviour of grep?

N.B. I checked both grep (BSD grep) 2.5.1-FreeBSD and grep (GNU grep) 3.1

Recommended answer

In its definition of the term "matched", POSIX states that

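The search for a matching sequence starts at the beginning of a string and stops when the first sequence matching the expression is found, where "first" is defined to mean "begins earliest in the string". If the pattern permits a variable number of matching characters and thus there is more than one such sequence starting at that point, the longest such sequence is matched.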

This statement clearly answers your question. The string aab contains two substrings that begin at the same position and match the ERE a+(ab)?: aa and aab. The latter is longer, so it is the one that is matched.
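
To make the difference concrete, here is a minimal sketch in Python. It rests on two assumptions: Python's re module stands in for Perl-style backtracking tools such as pcregrep, and the leftmost-longest rule is emulated by brute force. The helper names backtracking_match and leftmost_longest_match are hypothetical, and the brute-force loop only illustrates the rule; it is not how grep or sed are actually implemented.

```python
import re


def backtracking_match(pattern, text):
    """Return the first match found by a backtracking engine (Python's re).

    Each quantifier is greedy on its own, and the engine accepts the first
    overall match it reaches, even if a longer one exists.
    """
    m = re.search(pattern, text)
    return m.group(0) if m else None


def leftmost_longest_match(pattern, text):
    """Emulate the POSIX rule: earliest start, then the longest match there.

    Brute force for illustration only: try every start position from left
    to right, and for each start try every end position from longest to
    shortest.
    """
    compiled = re.compile(pattern)
    for start in range(len(text) + 1):
        for end in range(len(text), start - 1, -1):
            if compiled.fullmatch(text, start, end):
                return text[start:end]
    return None


perl_style = backtracking_match(r"a+(ab)?", "aab")
posix_style = leftmost_longest_match(r"a+(ab)?", "aab")
print(perl_style)   # aa
print(posix_style)  # aab
```

In the backtracking case a+ consumes both a characters, (ab)? then matches the empty string, and the engine stops there. Under the POSIX rule the match as a whole must be the longest one available, so a+ effectively yields one a to let (ab)? match.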
