Finding clusters of numbers in a list


Problem description

I'm struggling with this, since I'm sure that a dozen for-loops is not the right way to solve it:

I have a list of numbers like this:

numbers = [123, 124, 128, 160, 167, 213, 215, 230, 245, 255, 257, 400, 401, 402, 430]

and I want to create a dict of lists of numbers, in which the difference between consecutive numbers is no more than 15. So the output would be this:

clusters = {
    1 : [123, 124, 128],
    2 : [160, 167],
    3 : [213, 215, 230, 245, 255, 257],
    4 : [400, 401, 402],
    5 : [430]
}

My current solution is a bit ugly (I have to remove duplicates at the end…), and I'm sure it can be done in a more Pythonic way.

This is what I'm doing now:

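clusters = {}
dIndex = 0
for i in range(len(numbers) - 1):
    if numbers[i + 1] - numbers[i] <= 15:
        # dict.has_key() was removed in Python 3; use the "in" operator instead
        if dIndex not in clusters:
            clusters[dIndex] = []
        # both numbers of each close pair are appended, which is where the duplicates come from
        clusters[dIndex].append(numbers[i])
        clusters[dIndex].append(numbers[i + 1])
    else:
        dIndex += 1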

Recommended answer

Not strictly necessary if your list is small, but I'd probably approach this in a "stream-processing" fashion: define a generator that takes your input iterable and yields the elements grouped into runs of numbers whose consecutive differences are <= 15. Then you can use that to generate your dictionary easily.

def grouper(iterable):
    prev = None
    group = []
    for item in iterable:
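        # test for None explicitly so that a previous value of 0 is not mistaken for "no previous item"
        if prev is None or item - prev <= 15: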
            group.append(item)
        else:
            yield group
            group = [item]
        prev = item
    if group:
        yield group

numbers = [123, 124, 128, 160, 167, 213, 215, 230, 245, 255, 257, 400, 401, 402, 430]
dict(enumerate(grouper(numbers), 1))

Prints:

{1: [123, 124, 128],
 2: [160, 167],
 3: [213, 215, 230, 245, 255, 257],
 4: [400, 401, 402],
 5: [430]}

As a bonus, this even lets you group runs in potentially infinite input streams (as long as they're sorted, of course). You could also move the index generation into the generator itself (instead of using enumerate) as a minor enhancement.
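The enhancement mentioned above might look something like the following. This is only a sketch: the name indexed_grouper, the max_gap and start parameters, and the endless() helper are made up for illustration and are not part of the original answer.

```python
from itertools import count, islice

def indexed_grouper(iterable, max_gap=15, start=1):
    """Yield (index, run) pairs; a run ends when the gap to the next item exceeds max_gap."""
    index = start
    group = []
    for item in iterable:
        # Compare against the last item of the current run, if there is one.
        if group and item - group[-1] > max_gap:
            yield index, group
            index += 1
            group = []
        group.append(item)
    if group:
        # Emit the final run once the input is exhausted.
        yield index, group

numbers = [123, 124, 128, 160, 167, 213, 215, 230, 245, 255, 257, 400, 401, 402, 430]
clusters = dict(indexed_grouper(numbers))

# Hypothetical endless sorted stream: 0, 1, 2, 100, 101, 102, 200, ...
def endless():
    for base in count(0, 100):
        yield from (base, base + 1, base + 2)

# Only as much of the stream is consumed as is needed for two runs.
first_two = list(islice(indexed_grouper(endless()), 2))
```

Because the pairs are produced lazily, dict() works for a finite list, while islice lets you take just the first few runs from a stream that never ends.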
