一组工会查找算法 [英] A set union find algorithm

查看:118
本文介绍了一组工会查找算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

予有数以千计的1至100号的行,每行中定义的一组号码以及它们之间的关系。 我需要获得相关数字的集合。

I have thousands of lines of 1 to 100 numbers, every line define a group of numbers and a relationship among them. I need to get the sets of related numbers.

小例子: 如果我有这7行数据

Little Example: If I have this 7 lines of data

T1 T2
T3 
T4
T5
T6 T1
T5 T4
T3 T4 T7

我需要一个不那么慢的算法要知道,这里的集是:

I need a not so slow algorithm to know that the sets here are:

T1 T2 T6 (because T1 is related with T2 in the first line and T1 related with T6 in the line 5)
T3 T4 T5 T7 (because T5 is with T4 in line 6 and T3 is with T4 and T7 in line 7)

但是当你有非常大套是极为缓慢做搜索一个T(x)的每一个大组,并做套工会...等。

but when you have very big sets is painfully slow to do a search of a T(x) in every big set, and do unions of sets... etc.

你有一丝为此在一个不那么暴力的方式?

Do you have a hint to do this in a not so brute force manner?

我试图做到这一点在Python。

I'm trying to do this in Python.

推荐答案

一旦你已经建立了数据结构,正是查询你要反对它运行?告诉我们你现有的code。什么是T(x)的?请您谈一下一组数字,但是你的样品数据显示T1,T2等;请解释一下。

Once you have built the data structure, exactly what queries do you want to run against it? Show us your existing code. What is a T(x)? You talk about "groups of numbers" but your sample data shows T1, T2, etc; please explain.

你读过这样的:<一href="http://en.wikipedia.org/wiki/Disjoint-set_data_structure">http://en.wikipedia.org/wiki/Disjoint-set_data_structure

尝试寻找这个Python实现:<一href="http://$c$c.activestate.com/recipes/215912-union-find-data-structure/">http://$c$c.activestate.com/recipes/215912-union-find-data-structure/

Try looking at this Python implementation: http://code.activestate.com/recipes/215912-union-find-data-structure/

或者你可以抨击了一些相当简单易懂自己,如:

OR you can lash up something rather simple and understandable yourself, e.g.

[更新:全新code]

[Update: totally new code]

class DisjointSet(object):

    def __init__(self):
        self.leader = {} # maps a member to the group's leader
        self.group = {} # maps a group leader to the group (which is a set)

    def add(self, a, b):
        leadera = self.leader.get(a)
        leaderb = self.leader.get(b)
        if leadera is not None:
            if leaderb is not None:
                if leadera == leaderb: return # nothing to do
                groupa = self.group[leadera]
                groupb = self.group[leaderb]
                if len(groupa) < len(groupb):
                    a, leadera, groupa, b, leaderb, groupb = b, leaderb, groupb, a, leadera, groupa
                groupa |= groupb
                del self.group[leaderb]
                for k in groupb:
                    self.leader[k] = leadera
            else:
                self.group[leadera].add(b)
                self.leader[b] = leadera
        else:
            if leaderb is not None:
                self.group[leaderb].add(a)
                self.leader[a] = leaderb
            else:
                self.leader[a] = self.leader[b] = a
                self.group[a] = set([a, b])

data = """T1 T2
T3 T4
T5 T1
T3 T6
T7 T8
T3 T7
T9 TA
T1 T9"""
# data is chosen to demonstrate each of 5 paths in the code
from pprint import pprint as pp
ds = DisjointSet()
for line in data.splitlines():
    x, y = line.split()
    ds.add(x, y)
    print
    print x, y
    pp(ds.leader)
    pp(ds.group)

和这里是最后一步的输出:

and here is the output from the last step:

T1 T9
{'T1': 'T1',
 'T2': 'T1',
 'T3': 'T3',
 'T4': 'T3',
 'T5': 'T1',
 'T6': 'T3',
 'T7': 'T3',
 'T8': 'T3',
 'T9': 'T1',
 'TA': 'T1'}
{'T1': set(['T1', 'T2', 'T5', 'T9', 'TA']),
 'T3': set(['T3', 'T4', 'T6', 'T7', 'T8'])}

这篇关于一组工会查找算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆