Find all occurrences of a substring (including overlaps)?


Question

Okay, so I found this: How to find all occurrences of a substring?

It says that, to get a list of the indices of overlapping occurrences of a substring, you can use:

[m.start() for m in re.finditer('(?=SUBSTRING)', 'STRING')]

That works, but my problem is that both the string and the substring to look for are defined by variables. I don't know enough about regular expressions to know how to deal with that. I can get it to work for non-overlapping substrings, which is just:

[m.start() for m in re.finditer(p3, p1)]

Because someone asked, I'll go ahead and specify: p1 and p3 could be any strings, but if they were, for example, p3 = "tryt" and p1 = "trytryt", the result should be [0, 3].

Recommended answer

The arguments to re.finditer are plain strings. If the substring is in a variable, simply format it into the regular expression; something like '(?={0})'.format(p3) is a start. Because various symbols have special meaning in a regular expression, you will want to escape them. Luckily, the re module includes re.escape for just this need.

[m.start() for m in re.finditer('(?={0})'.format(re.escape(p3)), p1)]
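To see the one-liner above in context, here is a minimal runnable sketch using the p1 and p3 values from the question. The helper name find_overlapping and the "a.a" example are illustrative assumptions, not part of the original answer.

```python
import re

def find_overlapping(sub, text):
    """Return the start index of every occurrence of sub in text, overlaps included.

    find_overlapping is a hypothetical helper name used only for this sketch.
    """
    # re.escape makes any regex metacharacters in sub match literally.
    # The lookahead (?=...) matches without consuming characters, so the
    # next match may start inside the previous occurrence.
    pattern = '(?={0})'.format(re.escape(sub))
    return [m.start() for m in re.finditer(pattern, text)]

p3 = "tryt"
p1 = "trytryt"

# Overlapping search: both occurrences are reported.
print(find_overlapping(p3, p1))  # [0, 3]

# Plain (non-lookahead) search for comparison: the first match consumes
# "tryt", so the occurrence starting at index 3 is skipped.
print([m.start() for m in re.finditer(re.escape(p3), p1)])  # [0]

# Escaping matters when the substring contains characters such as ".":
print(find_overlapping("a.a", "a.a.a"))  # [0, 2]
print(find_overlapping("a.a", "aba"))    # []
```

Without re.escape, the "." in "a.a" would act as a wildcard, so the pattern would also match "aba".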
