Functional difference between lookarounds and non-capturing groups?


Question

I'm trying to come up with an example where a positive lookaround works but a non-capturing group doesn't, to better understand how each is used. Every example I come up with works with non-capturing groups as well, so I feel like I'm not fully grasping what positive lookarounds are for.

Here is a string (taken from an SO example) whose answer uses a positive lookahead. The user wanted to grab the value in the second column, but only if the value in the first column started with ABC and the last column had the value 'active'.

string ='''ABC1    1.1.1.1    20151118    active
          ABC2    2.2.2.2    20151118    inactive
          xxx     x.x.x.x    xxxxxxxx    active'''

The solution given used a positive lookahead, but I noticed that I could use a non-capturing group to arrive at the same answer. So I'm having trouble coming up with an example where a positive lookaround works and a non-capturing group doesn't.

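import re

# Raw strings keep backslash sequences such as \w and \s intact for the regex engine.
pattern = re.compile(r'ABC\w\s+(\S+)\s+(?=\S+\s+active)')  # solution with lookahead

pattern = re.compile(r'ABC\w\s+(\S+)\s+(?:\S+\s+active)')  # same solution without lookaround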

If anyone would be kind enough to provide an example, I would be grateful.

Thanks.

Recommended answer

The fundamental difference is that non-capturing groups still consume the part of the string they match, which moves the cursor forward. Lookarounds only assert that something is present; they match without consuming any characters.
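As a rough illustration of what "consuming" means, the sketch below runs both patterns from the question against its sample rows. The variable names are made up for this example, and the sample text is written without the leading indentation. Both patterns capture the same value, but the overall match of the non-capturing version extends further into the string.

```python
import re

# Sample rows from the question, without the leading indentation.
text = """ABC1    1.1.1.1    20151118    active
ABC2    2.2.2.2    20151118    inactive
xxx     x.x.x.x    xxxxxxxx    active"""

lookahead = re.compile(r'ABC\w\s+(\S+)\s+(?=\S+\s+active)')
non_capturing = re.compile(r'ABC\w\s+(\S+)\s+(?:\S+\s+active)')

m_look = lookahead.search(text)
m_group = non_capturing.search(text)

# The captured group is identical for both patterns.
print(m_look.group(1), m_group.group(1))

# The full match differs: the lookahead stops before the third column,
# while the non-capturing group consumes everything through "active".
print(repr(m_look.group(0)))
print(repr(m_group.group(0)))
print(m_look.end() < m_group.end())
```

For this particular task the extra consumed text is harmless, which is why both patterns appear equivalent in the question.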

One case where this makes a real difference is when you try to match strings that are surrounded by certain boundaries, and those boundaries can overlap. Sample task:

Match every a in a given string that is surrounded by b on both sides. The given string is bababaca. There should be two matches, at positions 2 and 4 (counting from 1).

With lookarounds this is fairly easy: either b(a)(?=b) or (?<=b)a(?=b) will match them. But (?:b)a(?:b) won't work: the first match also consumes the b at position 3, which is needed as the boundary for the second match. (Note: the non-capturing groups aren't actually needed here; plain bab behaves the same way.)
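The sample task can be checked with a short Python sketch. The variable names are chosen for this example only. Note that Python reports 0-based offsets, so offsets 1 and 3 correspond to positions 2 and 4 in the description above.

```python
import re

text = "bababaca"

# Lookbehind + lookahead: the boundaries are checked but never consumed.
lookbehind_ahead = [m.start() for m in re.finditer(r'(?<=b)a(?=b)', text)]

# Consume the leading b, capture the a, only look ahead at the trailing b.
capture_ahead = [m.start(1) for m in re.finditer(r'b(a)(?=b)', text)]

# Non-capturing groups consume both boundaries, so the b shared by the
# two candidates is used up by the first match.
consuming = [m.start() for m in re.finditer(r'(?:b)a(?:b)', text)]

print(lookbehind_ahead)  # [1, 3]
print(capture_ahead)     # [1, 3]
print(consuming)         # [0]  (a single match, "bab" starting at offset 0)
```

After the consuming pattern matches "bab" at offsets 0 to 2, the search resumes at offset 3, where no b is left in front of the second a.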

Another prominent example is password validation: checking that a password contains uppercase letters, lowercase letters, digits, and so on. You could match these with a bunch of alternations, but lookaheads are far easier:

(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!?.])

vs

(?:.*[a-z].*[A-Z].*[0-9].*[!?.])|(?:.*[A-Z].*[a-z].*[0-9].*[!?.])|(?:.*[0-9].*[a-z].*[A-Z].*[!?.])|(?:.*[!?.].*[a-z].*[A-Z].*[0-9])|(?:.*[A-Z].*[a-z].*[!?.].*[0-9])|...
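A minimal sketch of the lookahead version in Python follows. The helper name has_all_classes and the leading ^ anchor are additions for this example and are not part of the answer. Every lookahead starts from the same position, so the order in which the character classes appear in the password doesn't matter.

```python
import re

# Each lookahead scans from the start of the string without consuming it,
# so all four conditions are checked independently of one another.
PASSWORD_RE = re.compile(r'^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!?.])')


def has_all_classes(password: str) -> bool:
    """Return True if the password has a lowercase letter, an uppercase
    letter, a digit, and one of the characters ! ? . (hypothetical helper)."""
    return PASSWORD_RE.match(password) is not None


print(has_all_classes("Abc1!"))  # True
print(has_all_classes("1!bA"))   # True  (same classes, different order)
print(has_all_classes("abc1!"))  # False (no uppercase letter)
```

Covering every ordering with the alternation approach would need one branch per permutation of the four classes, which is 24 branches.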
