Finding the recurring pattern


Question

Let's say I have a number with a recurring pattern, i.e. there is a string of digits that repeats to make up the whole number. For example, such a number might be 1234123412341234, created by repeating the digits 1234.
What I would like to do is find the pattern that repeats to create the number. So, given 1234123412341234, I would like to compute 1234 (and maybe 4, to indicate that 1234 is repeated 4 times to create 1234123412341234).

I know that I could do this:

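def findPattern(num):
    num = str(num)
    # Start at 1: an empty prefix would cause a division by zero.
    for i in range(1, len(num)):
        patt = num[:i]
        # Skip prefixes whose length does not divide the total length.
        if len(num) % len(patt):
            continue
        # Accept the prefix only if repeating it rebuilds the input.
        if patt * (len(num) // len(patt)) == num:
            return patt, len(num) // len(patt)
    return None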

However, this seems a little too hacky. I figured I could use itertools.cycle to compare two cycles for equality, which doesn't really pan out:

In [25]: c1 = itertools.cycle(list(range(4)))

In [26]: c2 = itertools.cycle(list(range(4)))

In [27]: c1==c2
Out[27]: False

Is there a better way to compute this? (I'd be open to a regex, but I have no idea how to apply it there, which is why I didn't include it in my attempts)

EDIT:

  1. I don't necessarily know that the number has a repeating pattern, so I have to return None if there isn't one.
  2. Right now, I'm only concerned with detecting numbers/strings that are made up entirely of a repeating pattern. However, later on, I'll likely also be interested in finding patterns that start after a few characters:

magic_function(78961234123412341234)

This would return 1234 as the pattern, 4 as the number of times it is repeated, and 4 as the index in the input where the pattern first appears.

Solution

(.+?)\1+

Try this and grab the capture group, which holds the repeating pattern. See demo.

import re
# A plain raw string; the ur'' prefix is Python 2 only.
p = re.compile(r'(.+?)\1+')
test_str = "1234123412341234"

p.findall(test_str)  # ['1234']
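
If the repetition count and the position of the first repeat are needed as well, a match object is more convenient than findall. The following is a minimal sketch, not part of the original answer; the helper name find_repeat is made up for illustration.

```python
import re

# A lazy group followed by one or more back-references to it.
REPEAT = re.compile(r'(.+?)\1+')

def find_repeat(value):
    """Return (pattern, count, start) for the leftmost repeated run, or None."""
    text = str(value)
    match = REPEAT.search(text)
    if match is None:
        return None
    pattern = match.group(1)
    # The whole match is the pattern repeated, so its length gives the count.
    count = len(match.group(0)) // len(pattern)
    return pattern, count, match.start()
```

Note that the search reports the leftmost repeat it can find, which is not necessarily the longest one: for "1123123" it returns ('1', 2, 0) because the two leading 1s already count as a repetition.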

Add anchors if you want the regex to fail on 12341234123123, for which the result should be None. The Multiline flag is needed only if the input spans several lines.

^(.+?)\1+$
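
A sketch of how the anchored version might be wrapped so that it returns None when the input is not made up entirely of one repeated pattern. It assumes Python 3.4 or later and uses re.fullmatch in place of the explicit ^ and $ anchors; the function name whole_repeat is hypothetical.

```python
import re

def whole_repeat(value):
    """Return (pattern, count) if the whole input is one repeated pattern, else None."""
    text = str(value)
    # fullmatch requires the pattern to cover the entire string,
    # which has the same effect here as anchoring with ^ and $.
    match = re.fullmatch(r'(.+?)\1+', text)
    if match is None:
        return None
    pattern = match.group(1)
    return pattern, len(text) // len(pattern)
```

Because the group is lazy, the shortest repeating unit is reported: "1111" gives ('1', 4) rather than ('11', 2).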

See demo.
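
For comparison, the whole-string case can also be handled without a regex. The sketch below uses a different technique from the answer above, the string-doubling trick: a string is built from a shorter repeated unit exactly when it occurs inside two copies of itself at an offset strictly between 0 and its own length. The function name smallest_period is made up for illustration.

```python
def smallest_period(value):
    """Return (pattern, count) if the input is a repeated pattern, else None."""
    text = str(value)
    # Search for text inside text + text, skipping the trivial hit at index 0.
    period = (text + text).find(text, 1)
    # A hit at len(text) is just the second copy; -1 happens for empty input.
    if period == -1 or period >= len(text):
        return None
    return text[:period], len(text) // period
```

The first offset found is the length of the shortest repeating unit, so the result matches what the anchored regex reports.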
