Get all possible matches for regex (in python)?

Problem description

I have a regex that can match a string in multiple, possibly overlapping ways. However, it seems to capture only one possible match in the string. How can I get all possible matches? I've tried finditer with no success, but maybe I'm using it wrong.

The string I'm trying to parse is:

foo-foobar-foobaz

The regex I'm using is:

(.*)-(.*)

>>> import re
>>> s = "foo-foobar-foobaz"
>>> matches = re.finditer(r'(.*)-(.*)', s)
>>> [match.group(1) for match in matches]
['foo-foobar']

I want the match (foo and foobar-foobaz), but it seems to only get (foo-foobar and foobaz).

Solution

No problem:

>>> import re
>>> regex = r"([^-]*-)(?=([^-]*))"
>>> for result in re.finditer(regex, "foo-foobar-foobaz"):
...     print("".join(result.groups()))
...
foo-foobar
foobar-foobaz

By putting the second capturing group inside a lookahead assertion, you can capture its contents without consuming them in the overall match, so the next match can start at that same text.
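To make the "captured but not consumed" point concrete, here is a minimal sketch that prints the span of each overall match next to its groups. It reuses the pattern from the answer; the variable names `pattern` and `results` are only illustrative.

```python
import re

# Same pattern as in the answer: group 2 sits inside a lookahead.
pattern = re.compile(r"([^-]*-)(?=([^-]*))")

results = []
for m in pattern.finditer("foo-foobar-foobaz"):
    # m.group(0) is only the consumed text. Group 2 was captured by the
    # lookahead, so it lies beyond the end of the overall match.
    results.append((m.span(), m.group(0), m.groups()))

for span, consumed, groups in results:
    print(span, repr(consumed), groups)
# (0, 4) 'foo-' ('foo-', 'foobar')
# (4, 11) 'foobar-' ('foobar-', 'foobaz')
```

The second match starts at index 4, exactly where the lookahead of the first match began, which is why "foobar" can appear in both results.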

I've also used [^-]* instead of .* because the dot also matches the separator - which you probably don't want.
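The effect of the dot matching the separator can be seen by comparing a few variants of the pattern on the same string. The sketch below is not part of the original answer: the lazy variant and the helper `all_splits` are illustrative assumptions, shown as one possible way to obtain the (foo, foobar-foobaz) split the question asks for and to enumerate every split at a single hyphen.

```python
import re

s = "foo-foobar-foobaz"

# Greedy dot: group 1 swallows as much as it can, including a hyphen.
greedy = re.match(r"(.*)-(.*)", s).groups()

# Lazy dot (assumption, not from the answer): group 1 stops at the first hyphen.
lazy = re.match(r"(.*?)-(.*)", s).groups()

# Negated class, as in the answer: neither group can contain a hyphen.
negated = re.match(r"([^-]*)-([^-]*)", s).groups()


def all_splits(text, sep="-"):
    """Hypothetical helper: every (left, right) pair from splitting at one sep."""
    return [(text[:m.start()], text[m.end():])
            for m in re.finditer(re.escape(sep), text)]


print(greedy)         # ('foo-foobar', 'foobaz')
print(lazy)           # ('foo', 'foobar-foobaz')
print(negated)        # ('foo', 'foobar')
print(all_splits(s))  # [('foo', 'foobar-foobaz'), ('foo-foobar', 'foobaz')]
```

Searching for the separator itself and slicing around it sidesteps the fact that a single regex search reports only one way of matching at each starting position.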
