使用 LINQ 进行高效的图遍历 - 消除递归 [英] Efficient graph traversal with LINQ - eliminating recursion

查看:20
本文介绍了使用 LINQ 进行高效的图遍历 - 消除递归的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

今天我要实现一种方法来遍历任意深度的图并将其展平为单个可枚举.相反,我先做了一些搜索,发现了这个:

Today I was going to implement a method to traverse an arbitrarily deep graph and flatten it into a single enumerable. Instead, I did a little searching first and found this:

public static IEnumerable<T> Traverse<T>(this IEnumerable<T> enumerable, Func<T, IEnumerable<T>> recursivePropertySelector)
{
    foreach (T item in enumerable)
    {
        yield return item;

        IEnumerable<T> seqRecurse = recursivePropertySelector(item);

        if (seqRecurse == null) continue;
        foreach (T itemRecurse in Traverse(seqRecurse, recursivePropertySelector))
        {
            yield return itemRecurse;
        }
    }
}

理论上这看起来不错,但在实践中我发现它的性能比使用等效的手写代码(根据情况出现)来浏览图形并执行任何需要完成的操作要差得多.我怀疑这是因为在这种方法中,对于它返回的每个项目,堆栈必须展开到某个任意深的级别.

In theory this looks good, but in practice I've found it performs significantly more poorly than using equivalent hand-written code (as the situation arises) to go through a graph and do whatever needs to be done. I suspect this is because in this method, for every item it returns, the stack has to unwind to some arbitrarily deep level.

我还怀疑如果消除递归,这种方法会更有效地运行.我也碰巧不太擅长消除递归.

I also suspect that this method would run a lot more efficiently if the recursion were eliminated. I also happen to not be very good at eliminating recursion.

有谁知道如何重写这个方法来消除递归?

Does anyone know how to rewrite this method to eliminate the recursion?

感谢您的帮助.

非常感谢所有详细的回复.我已经尝试对原始解决方案与 Eric 的解决方案进行基准测试,而不是使用枚举器方法,而是使用 lambda 递归遍历,奇怪的是,lambda 递归明显比其他两种方法都快.

Thanks very much for all the detailed responses. I've tried benchmarking the original solution vs Eric's solution vs not using an enumerator method and instead recursively traversing with a a lambda and oddly, the lambda recursion is significantly faster than either of the other two methods.

class Node
{
    public List<Node> ChildNodes { get; set; } 

    public Node()
    {
        ChildNodes = new List<Node>();
    }
}

class Foo
{
    public static void Main(String[] args) 
    {
        var nodes = new List<Node>();
        for(int i = 0; i < 100; i++)
        {
            var nodeA = new Node();
            nodes.Add(nodeA);
            for (int j = 0; j < 100; j++)
            {
                var nodeB = new Node();
                nodeA.ChildNodes.Add(nodeB);
                for (int k = 0; k < 100; k++)
                {
                    var nodeC = new Node();
                    nodeB.ChildNodes.Add(nodeC);
                    for(int l = 0; l < 12; l++)
                    {
                        var nodeD = new Node();
                        nodeC.ChildNodes.Add(nodeD);
                    }
                }
            }
        }            

        nodes.TraverseOld(node => node.ChildNodes).ToList();
        nodes.TraverseNew(node => node.ChildNodes).ToList();

        var watch = Stopwatch.StartNew();
        nodes.TraverseOld(node => node.ChildNodes).ToList();
        watch.Stop();
        var recursiveTraversalTime = watch.ElapsedMilliseconds;
        watch.Restart();
        nodes.TraverseNew(node => node.ChildNodes).ToList();
        watch.Stop();
        var noRecursionTraversalTime = watch.ElapsedMilliseconds;

        Action<Node> visitNode = null;
        visitNode = node =>
        {
            foreach (var child in node.ChildNodes)
                visitNode(child);
        };

        watch.Restart();
        foreach(var node in nodes)
            visitNode(node);
        watch.Stop();
        var lambdaRecursionTime = watch.ElapsedMilliseconds;
    }
}

其中 TraverseOld 是原始方法,TraverseNew 是 Eric 的方法,显然 lambda 是 lambda.

Where TraverseOld is the original method, TraverseNew is Eric's method, and obviously the lambda is the lambda.

在我的机器上,TraverseOld 需要 10127 毫秒,TraverseNew 需要 3038 毫秒,lambda 递归需要 1181 毫秒.

On my machine, TraverseOld takes 10127 ms, TraverseNew takes 3038 ms, the lambda recursion takes 1181 ms.

与立即执行相比,枚举器方法(带收益返回)可能需要 3 倍的时间,这是典型的吗?还是这里发生了其他事情?

Is this typical that enumerator methods (with yield return) can take 3X as long as opposed to immediate execution? Or is something else going on here?

推荐答案

首先,你是完全正确的;如果图有 n 个平均深度为 d 的节点,那么朴素的嵌套迭代器会产生一个时间为 O(n*d) 和堆栈中为 O(d) 的解决方案.如果 d 是 n 的很大一部分,那么这可以变成一个 O(n2) 算法,如果 d 很大,那么你可以完全炸掉堆栈.

First off, you are absolutely correct; if the graph has n nodes of average depth d then the naive nested iterators yield a solution which is O(n*d) in time, and O(d) in stack. If d is a large fraction of n then this can become an O(n2) algorithm, and if d is large then you can blow the stack entirely.

如果您对嵌套迭代器的性能分析感兴趣,请参阅前 C# 编译器开发人员 Wes Dyer 的博文:

If you are interested in a performance analysis of nested iterators, see former C# compiler developer Wes Dyer's blog post:

http://blogs.msdn.microsoft.com/wesdyer/2007/03/23/all-about-iterators

dasblinkenlight 的解决方案是标准方法的变体.我通常会这样编写程序:

dasblinkenlight's solution is a variation on the standard approach. I would typically write the program like this:

public static IEnumerable<T> Traverse<T>(
    T root, 
    Func<T, IEnumerable<T>> children)
{
    var stack = new Stack<T>();
    stack.Push(root);
    while(stack.Count != 0)
    {
        T item = stack.Pop();
        yield return item;
        foreach(var child in children(item))
            stack.Push(child);
    }
}

然后如果你有多个根:

public static IEnumerable<T> Traverse<T>(
    IEnumerable<T> roots, 
    Func<T, IEnumerable<T>> children)
{
    return from root in roots 
           from item in Traverse(root, children)
           select item ;
}

现在,请注意,如果您有一个高度互连的图或循环图,则遍历不是您想要的!如果您有一个带有向下箭头的图表:

Now, note that a traversal is not what you want if you have a highly interconnected graph or a cyclic graph! If you have a graph with downward pointing arrows:

          A
         / 
        B-->C
          /
          D

那么遍历是 A、B、D、C、D、C、D.如果你有一个循环图或互连图,那么你想要的是传递闭包.

then the traversal is A, B, D, C, D, C, D. If you have a cyclic or interconnected graph then what you want is the transitive closure.

public static IEnumerable<T> Closure<T>(
    T root, 
    Func<T, IEnumerable<T>> children)
{
    var seen = new HashSet<T>();
    var stack = new Stack<T>();
    stack.Push(root);

    while(stack.Count != 0)
    {
        T item = stack.Pop();
        if (seen.Contains(item))
            continue;
        seen.Add(item);
        yield return item;
        foreach(var child in children(item))
            stack.Push(child);
    }
}

此变体仅生成以前未生成的项目.

This variation only yields items that have not been yielded before.

我也不太擅长消除递归.

I also happen to not be very good at eliminating recursion.

我写了许多关于消除递归的方法以及一般递归编程的文章.如果您对这个主题感兴趣,请参阅:

I have written a number of articles on ways to eliminate recursion, and about recursive programming in general. If this subject interests you, see:

http://blogs.msdn.com/b/ericlippert/存档/标签/递归/

特别是:

http://blogs.msdn.com/b/ericlippert/archive/2005/08/01/recursion-part-two-unrolling-a-recursive-function-with-an-explicit-stack.aspx

http://blogs.msdn.com/b/ericlippert/archive/2005/08/04/recursion-part-three-building-a-dispatch-engine.aspx

http://blogs.msdn.com/b/ericlippert/archive/2005/08/08/recursion-part-four-continuation-passing-style.aspx

这篇关于使用 LINQ 进行高效的图遍历 - 消除递归的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆