Python正则表达式找到所有重叠的匹配项? [英] Python regex find all overlapping matches?

查看:77
本文介绍了Python正则表达式找到所有重叠的匹配项?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试使用 Python 2.6 中的 re 在更大的数字系列中查找每 10 位数字系列.

I'm trying to find every 10 digit series of numbers within a larger series of numbers using re in Python 2.6.

我可以轻松获取不重叠的匹配项,但我想要数字系列中的每个匹配项.例如.

I'm easily able to grab no overlapping matches, but I want every match in the number series. Eg.

在123456789123456789"

in "123456789123456789"

我应该得到以下列表:

[1234567891,2345678912,3456789123,4567891234,5678912345,6789123456,7891234567,8912345678,9123456789]

我发现了对前瞻"的引用,但我看到的示例只显示了数字对而不是更大的分组,而且我无法将它们转换为超过两位数.

I've found references to a "lookahead", but the examples I've seen only show pairs of numbers rather than larger groupings and I haven't been able to convert them beyond the two digits.

推荐答案

在前瞻中使用捕获组.前瞻捕获您感兴趣的文本,但实际匹配在技术上是前瞻之前的零宽度子字符串,因此匹配在技术上是不重叠的:

Use a capturing group inside a lookahead. The lookahead captures the text you're interested in, but the actual match is technically the zero-width substring before the lookahead, so the matches are technically non-overlapping:

import re 
s = "123456789123456789"
matches = re.finditer(r'(?=(\d{10}))',s)
results = [int(match.group(1)) for match in matches]
# results: 
# [1234567891,
#  2345678912,
#  3456789123,
#  4567891234,
#  5678912345,
#  6789123456,
#  7891234567,
#  8912345678,
#  9123456789]

这篇关于Python正则表达式找到所有重叠的匹配项?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆