在Python中分组/聚类数字 [英] Grouping / clustering numbers in Python

查看:336
本文介绍了在Python中分组/聚类数字的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我用谷歌搜索过,我已经测试过了,这让我的智慧结束了。我有一个我需要按相似性分组的数字列表。例如,在[1,6,9,10,110,105,109,134,139]的列表中,1 6 9将被放入列表中,100,102,105和109将被放入列表中列表,134和139.我在数学方面很糟糕,我已经尝试过并试过这个,但我无法让它工作。为了尽可能明确,我希望将10个值之间的数字组合在一起。有人可以帮忙吗?谢谢。

I've googled, I've tested, and this has me at my wits end. I have a list of numbers I need to group by similarity. For instance, in a list of [1, 6, 9, 100, 102, 105, 109, 134, 139], 1 6 9 would be put into a list, 100, 102, 105, and 109 would be put into a list, and 134 and 139. I'm terrible at math, and I've tried and tried this, but I can't get it to work. To be explicit as possible, I wish to group numbers that are within 10 values away from one another. Can anyone help? Thanks.

推荐答案

有很多方法可以做到聚类分析。一种简单的方法是查看连续数据元素之间的间隙大小:

There are many ways to do cluster analysis. One simple approach is to look at the gap size between successive data elements:

def cluster(data, maxgap):
    '''Arrange data into groups where successive elements
       differ by no more than *maxgap*

        >>> cluster([1, 6, 9, 100, 102, 105, 109, 134, 139], maxgap=10)
        [[1, 6, 9], [100, 102, 105, 109], [134, 139]]

        >>> cluster([1, 6, 9, 99, 100, 102, 105, 134, 139, 141], maxgap=10)
        [[1, 6, 9], [99, 100, 102, 105], [134, 139, 141]]

    '''
    data.sort()
    groups = [[data[0]]]
    for x in data[1:]:
        if abs(x - groups[-1][-1]) <= maxgap:
            groups[-1].append(x)
        else:
            groups.append([x])
    return groups

if __name__ == '__main__':
    import doctest
    print(doctest.testmod())

这篇关于在Python中分组/聚类数字的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆