Positive lookbehind vs non-capturing group: different behaviour
Question
I use Python regular expressions (the re module) in my code and noticed different behaviour in these cases:
re.findall(r'\s*(?:[a-z]\))?[^.)]+', 'a) xyz. b) abc.') # non-capturing group
# results in ['a) xyz', ' b) abc']
and
re.findall(r'\s*(?<=[a-z]\))?[^.)]+', 'a) xyz. b) abc.') # lookbehind
# results in ['a', ' xyz', ' b', ' abc']
What I need to get is just ['xyz', 'abc']. Why do the examples behave differently, and how do I get the desired result?
Recommended answer
a and b are included in the second case because a lookbehind such as (?<=[a-z]\)) consumes no characters, and the ? after it makes the assertion optional. At the start of the string there is no a) behind the current position, so the assertion fails and is simply skipped. [^.)]+ then matches a.
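Now you are at ), which [^.)]+ cannot match, so the engine moves one position forward. Since you have made (?<=[a-z]\)) optional, nothing requires a) to come right before the match, and \s*[^.)]+ matches " xyz" (including the leading space).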
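The walkthrough above can be checked with a small sketch. It compares the question's second pattern with a hypothetical comparison pattern in which the optional lookbehind has simply been deleted (that second pattern is an assumption added for illustration, not part of the original answer). If the optional lookbehind really constrains nothing, both should produce the same matches at the same positions.

```python
import re

text = 'a) xyz. b) abc.'

# Pattern from the question: the trailing ? makes the lookbehind optional.
optional_lookbehind = re.compile(r'\s*(?<=[a-z]\))?[^.)]+')

# Hypothetical comparison pattern: same regex with the lookbehind removed.
no_lookbehind = re.compile(r'\s*[^.)]+')

# Record where each match starts and ends, plus the matched text.
spans_optional = [(m.span(), m.group()) for m in optional_lookbehind.finditer(text)]
spans_plain = [(m.span(), m.group()) for m in no_lookbehind.finditer(text)]

for span, matched in spans_optional:
    print(span, repr(matched))

# True when the optional lookbehind has no effect on this input.
print(spans_optional == spans_plain)
```

On this input the two lists should be identical, which suggests that an optional zero-width assertion restricts nothing: the engine can always take the "zero times" branch.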
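The same thing repeats for b) abc.
Remove the ? from the second pattern so that the lookbehind is mandatory, and only the text after a) and b) is matched. Each match still starts with a space, because [^.)]+ also accepts whitespace, so strip the matches to get exactly the expected result, i.e. ['xyz', 'abc'].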
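A minimal sketch of the fix, run against the question's sample string. With the lookbehind made mandatory, each match on this input appears to keep a leading space, so the sketch strips it. The second pattern, which moves \s* after the lookbehind and adds a capturing group, is an assumed variant that is not part of the original answer.

```python
import re

text = 'a) xyz. b) abc.'

# Mandatory lookbehind: a match may only begin right after "<letter>)".
mandatory = re.findall(r'\s*(?<=[a-z]\))[^.)]+', text)

# [^.)]+ also accepts whitespace, so trim each match.
stripped = [m.strip() for m in mandatory]

# Assumed variant: consume the whitespace after the lookbehind and
# capture only the text; findall returns the captured group.
captured = re.findall(r'(?<=[a-z]\))\s*([^.)]+)', text)

print(mandatory)
print(stripped)
print(captured)
```

The capturing-group variant avoids the extra strip step, at the cost of a slightly longer pattern.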