的Tarjan的强连接组件在Python算法不工作 [英] Tarjan's strongly connected components algorithm in python not working

查看:153
本文介绍了的Tarjan的强连接组件在Python算法不工作的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我实现了tarjan算法,根据<一href="http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm#The_algorithm_in_pseudo$c$c"相对=nofollow>维基,在Python,但它不工作。该算法是很短,我找不到任何区别,所以我不知道为什么它不工作。我试图检查原始文件,但无法找到它。

I implemented the Tarjan's strongly connected components algorithm, according to wikipedia, in Python, but it isn't working. The algorithm is quite short and I cannot find any difference, so I cannot tell why it isn't working. I tried to check the original paper, but could not find it.

下面是code。

def strongConnect(v):
  global E, idx, CCs, c, S
  idx[v] = (c, c) #idx[v][0] for v.index # idx[v][1] for v.lowlink
  c += 1
  S.append(v)  
  for w in [u for (v2, u) in E if v == v2]:
    if idx[w][0] < 0:
      strongConnect(w)
      # idx[w] = (idx[w][0], min(idx[v][1], idx[w][1])) #fixed, thx
      idx[v] = (idx[v][0], min(idx[v][1], idx[w][1]))
    elif w in S:
      idx[v] = (idx[v][0], min(idx[v][1], idx[w][0]))
  if (idx[v][0] == idx[v][1]):
    i = S.index(v)
    CCs.append(S[i:])
    S = S[:i]

E = [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E'), ('E', 'A'), ('A', 'E'), ('C', 'A'), ('C', 'E'), ('D', 'F'), ('F', 'B'), ('E', 'F')]
idx = {}
CCs = []
c = 0
S = []
for (u, v) in E:
  idx[u] = (-1, -1)
  idx[v] = (-1, -1)
for v in idx.keys():
  if idx[v][0] < 0:
    strongConnect(v)

print(CCs)

您可以<一href="http://desmond.imageshack.us/Himg687/scaled.php?server=687&filename=exampleu.png&res=medium"相对=nofollow>视觉如果preFER检查图。正如你可以看到这是伪code在维基百科相当向前平移。然而,这是输出:

You can check the graph visually if you prefer. As you can see this is a quite forward translation of the pseudocode in wikipedia. However, this is the output:

[['D', 'E', 'F'], ['B', 'C'], ['A']]

目前应该只有一个强连通分量,而不是三个。我希望这个问题是对的各个方面,如果不是我很抱歉。在任何情况下,我非常感谢你。

There should be only one strongly connected component, not three. I hope the question is right in all its aspects, if not I'm sorry. In any case, thank you very much.

推荐答案

好吧,我有更多的时间去思考这个问题。我不再肯定过滤的边缘是问题,因为我previously说。其实,我觉得有一个在<一个歧义href="http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm#The_algorithm_in_pseudo$c$c">pseudo$c$c;确实每个(V,W)于E 意味着的的每个边缘的(作为的字面意思为每建议),或仅每个边缘开始,用 v ,(因为你合理的假设)?然后,在for循环,是 v 在问题的最后 v 循环,因为这将是在Python?或者这是否回去做原来的 v ?伪code没有在这种情况下,明确范围界定的行为! (这将是非常奇怪的,如果 v 末分别成为最后,随心所欲,价值 v 从循环。这表明,筛选是正确的,因为在这种情况下, v 通过意味着同样的事情,所有的方式。)

Ok, I had some more time to think about this. I'm no longer certain that filtering the edges was the problem, as I previously stated. In fact, I think there's an ambiguity in the pseudocode; does for each (v, w) in E mean for each edge (as the literal meaning of for each suggests), or only each edge beginning with v, (as you reasonably assumed)? Then, after the for loop, is the v in question the final v from the for loop, as it would be in Python? Or does that go back to being the original v? Pseudocode doesn't have clearly defined scoping behavior in this case! (It would be really weird if the v at the end were to be the last, arbitrary, value of v from the loop. That suggests that filtering is correct, because in that case, v means the same thing all the way through.)

然而,在任何情况下,在明确的在code错误是在这里:

However, under any circumstances, the clear error in your code is here:

  idx[w] = (idx[w][0], min(idx[v][1], idx[w][1]))

据伪code,那绝对应该是

According to the pseudocode, that should definitely be

  idx[v] = (idx[v][0], min(idx[v][1], idx[w][1]))

在你做出改变,你会得到预期的结果。坦率地说不会,你犯了那个错误,因为你使用的是非常奇怪和反直觉的数据结构,让我感到吃惊。这是我觉得是一种进步 - 它仅增加了几行,我觉得它是更可读。

Once you make that change, you get the expected result. Frankly it doesn't surprise me that you made that mistake, because you're using a really weird and counterintuitive data structure. Here's what I think is an improvement -- it adds only a few more lines, and I find it to be much more readable.

import itertools

def strong_connect(vertex):
    global edges, indices, lowlinks, connected_components, index, stack
    indices[vertex] = index
    lowlinks[vertex] = index
    index += 1
    stack.append(vertex)

    for v, w in (e for e in edges if e[0] == vertex):
        if indices[w] < 0:
            strong_connect(w)
            lowlinks[v] = min(lowlinks[v], lowlinks[w])
        elif w in stack:
            lowlinks[v] = min(lowlinks[v], indices[w])

    if indices[vertex] == lowlinks[vertex]:
        connected_components.append([])
        while stack[-1] != vertex:
            connected_components[-1].append(stack.pop())
        connected_components[-1].append(stack.pop())

edges = [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E'), 
         ('E', 'A'), ('A', 'E'), ('C', 'A'), ('C', 'E'), 
         ('D', 'F'), ('F', 'B'), ('E', 'F')]
vertices = set(v for v in itertools.chain(*edges))
indices = dict((v, -1) for v in vertices)
lowlinks = indices.copy()
connected_components = []

index = 0
stack = []
for v in vertices:
    if indices[v] < 0:
        strong_connect(v)

print(connected_components)

不过,我发现使用全局变量这里没意思。你可以隐藏此走在自己的模块,但我preFER创建一个可调用类的想法。经过密切关注的Tarjan的<一个href="http://en.wikipedia.org/wiki/Talk%3aTarjan%27s_strongly_connected_components_algorithm#Status_of_the_algorithm">original伪code ,(这证实了过滤的版本是正确的,顺便说一下),我写了这个。它包括一个简单的类并执行几个基本测试:

However, I find the use of global variables here distasteful. You could hide this away in its own module, but I prefer the idea of creating a callable class. After looking more closely at Tarjan's original pseudocode, (which confirms that the "filtered" version is correct, by the way), I wrote this. It includes a simple Graph class and does couple of basic tests:

from itertools import chain
from collections import defaultdict

class Graph(object):
    def __init__(self, edges, vertices=()):
        edges = list(list(x) for x in edges)
        self.edges = edges
        self.vertices = set(chain(*edges)).union(vertices)
        self.tails = defaultdict(list)
        for head, tail in self.edges:
            self.tails[head].append(tail)

    @classmethod
    def from_dict(cls, edge_dict):
        return cls((k, v) for k, vs in edge_dict.iteritems() for v in vs)

class _StrongCC(object):
    def strong_connect(self, head):
        lowlink, count, stack = self.lowlink, self.count, self.stack
        lowlink[head] = count[head] = self.counter = self.counter + 1
        stack.append(head)

        for tail in self.graph.tails[head]:
            if tail not in count:
                self.strong_connect(tail)
                lowlink[head] = min(lowlink[head], lowlink[tail])
            elif count[tail] < count[head]:
                if tail in self.stack:
                    lowlink[head] = min(lowlink[head], count[tail])

        if lowlink[head] == count[head]:
            component = []
            while stack and count[stack[-1]] >= count[head]:
                component.append(stack.pop())
            self.connected_components.append(component)

    def __call__(self, graph):
        self.graph = graph
        self.counter = 0
        self.count = dict()
        self.lowlink = dict()
        self.stack = []
        self.connected_components = []

        for v in self.graph.vertices:
            if v not in self.count:
                self.strong_connect(v)

        return self.connected_components

strongly_connected_components = _StrongCC()

if __name__ == '__main__':
    edges = [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E'),
             ('E', 'A'), ('A', 'E'), ('C', 'A'), ('C', 'E'),
             ('D', 'F'), ('F', 'B'), ('E', 'F')]
    print strongly_connected_components(Graph(edges))
    edge_dict = {'a':['b', 'c', 'd'],
                 'b':['c', 'a'],
                 'c':['d', 'e'],
                 'd':['e'],
                 'e':['c']}
    print strongly_connected_components(Graph.from_dict(edge_dict))

这篇关于的Tarjan的强连接组件在Python算法不工作的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆