我想基于相似的属性对元组进行分组 [英] I want to group tuples based on similar attributes

查看:77
本文介绍了我想基于相似的属性对元组进行分组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个元组列表. [(1,2),(2,3),(4,3),(5,6),(6,7),(8,2)]

I have a list of tuples. [ (1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2) ]

我想根据连接的元组将它们分组为列表(具有相关值).

I want to group them into lists based on which tuples are connected (have related values).

因此最终结果是相关元组值的两个列表= [[1、2、3、4、8],[5、6、7]]

So the end result is two lists of related tuple values = [ [1, 2, 3, 4, 8], [5, 6, 7] ]

我该如何编写一个函数来做到这一点?这是工作面试的问题.我试图用Python来做,但是我很沮丧,只是想看看答案背后的逻辑,所以即使是伪代码也可以帮助我,所以我可以看看我做错了什么.

How can I write a function to do this? This was a job interview question. I was trying to do it in Python, but I'm frustrated and just want to see the logic behind the answer, so even psuedo code would help me, so I can see what I did wrong.

我只有几分钟的时间进行现场操作,但这是我尝试的方法:

I only had a few minutes to do this on the spot, but here is what I tried:

def find_partitions(connections):
 theBigList = []     # List of Lists
 list1 = []          # The initial list to store lists
 theBigList.append(list1)

 for list in theBigList:
 list.append(connection[1[0], 1[1]])
     for i in connections:
         if i[0] in list or i[1] in list:
             list.append(i[0], i[1])

         else:
             newList = []
             theBigList.append(newList)

本质上,这个家伙想要一个相关值列表的列表. 我尝试使用for循环,但是意识到它不起作用,然后时间用完了.

Essentially, the guy wanted an list of lists of values that were related. I tried to use a for loop, but realized that it wouldn't work, and then time ran out.

推荐答案

在我们填写组件时,每个阶段都需要考虑三种情况(因为您必须匹配重叠的组):

As we fill in the components, at each stage there are three cases to consider (as you will have to match up overlapping groups):

  1. x或y都不在已找到的任何分量中.
  2. 两者都已经在不同的集合中,x在set_i中,y在set_j中.
  3. 一个或两个都在一个组件中,x在set_i中或y在set_i中.
  1. Neither x or y are in any component already found.
  2. Both are already in different sets, x in set_i and y in set_j.
  3. Either one or both are in one component, x in set_i or y in a set_i.

我们可以使用内置的set来提供帮助. (请参阅@jwpat和@DSM的棘手示例):

We can use the built-in set to help. (see @jwpat's and @DSM's trickier examples):

def connected_components(lst):
    components = [] # list of sets
    for (x,y) in lst:
        i = j = set_i = set_j = None
        for k, c in enumerate(components):
            if x in c:
                i, set_i = k, c
            if y in c:
                j, set_j = k, c

        #case1 (or already in same set)
        if i == j:
             if i == None:
                 components.append(set([x,y]))
             continue

        #case2
        if i != None and j != None:
            components = [components[k] for k in range(len(components)) if k!=i and k!=j]
            components.append(set_i | set_j)
            continue

        #case3
        if j != None:
            components[j].add(x)
        if i != None:
            components[i].add(y)

    return components               

lst = [(1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2)]
connected_components(lst)
# [set([8, 1, 2, 3, 4]), set([5, 6, 7])]
map(list, connected_components(lst))
# [[8, 1, 2, 3, 4], [5, 6, 7]]

connected_components([(1, 2), (4, 3), (2, 3), (5, 6), (6, 7), (8, 2)])
# [set([8, 1, 2, 3, 4]), set([5, 6, 7])] # @jwpat's example

connected_components([[1, 3], [2, 4], [3, 4]]
# [set([1, 2, 3, 4])] # @DSM's example

这当然不是最有效的方法,但可能与他们期望的类似. 正如乔恩·克莱门茨(Jon Clements)所指出的,有一个用于这些类型的计算的库: networkx ,其中他们会更有效率.

This certainly won't be the most efficient method, but is perhaps one similar to what they would expect. As Jon Clements points out there is a library for these type of calculations: networkx, where they will be much more efficent.

这篇关于我想基于相似的属性对元组进行分组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆