将有向无环图 (DAG) 转换为树 [英] Converting Directed Acyclic Graph (DAG) to tree

查看:68
本文介绍了将有向无环图 (DAG) 转换为树的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试实现算法以将有向无环图转换为树(为了好玩,学习,kata,命名).于是我想出了Node的数据结构:

I'm trying to implement algoritm to convert Directed Acyclic Graph to Tree (for fun, learining, kata, name it). So I come up with the data structure Node:

/// <summary>
/// Represeting a node in DAG or Tree
/// </summary>
/// <typeparam name="T">Value of the node</typeparam>
public class Node<T> 
{
    /// <summary>
    /// creats a node with no child nodes
    /// </summary>
    /// <param name="value">Value of the node</param>
    public Node(T value)
    {
        Value = value;
        ChildNodes = new List<Node<T>>();
    }

    /// <summary>
    /// Creates a node with given value and copy the collection of child nodes
    /// </summary>
    /// <param name="value">value of the node</param>
    /// <param name="childNodes">collection of child nodes</param>
    public Node(T value, IEnumerable<Node<T>> childNodes)
    {
        if (childNodes == null)
        {
            throw new ArgumentNullException("childNodes");
        }
        ChildNodes = new List<Node<T>>(childNodes);
        Value = value;
    }

    /// <summary>
    /// Determines if the node has any child node
    /// </summary>
    /// <returns>true if has any</returns>
    public bool HasChildNodes
    {
        get { return this.ChildNodes.Count != 0; }
    }


    /// <summary>
    /// Travearse the Graph recursively
    /// </summary>
    /// <param name="root">root node</param>
    /// <param name="visitor">visitor for each node</param>
    public void Traverse(Node<T> root, Action<Node<T>> visitor)
    {
        if (root == null)
        {
            throw new ArgumentNullException("root");
        }
        if (visitor == null)
        {
            throw new ArgumentNullException("visitor");
        }

        visitor(root); 
        foreach (var node in root.ChildNodes)
        {
            Traverse(node, visitor);
        }
    }

    /// <summary>
    /// Value of the node
    /// </summary>
    public T Value { get; private set; }

    /// <summary>
    /// List of all child nodes
    /// </summary>
    public List<Node<T>> ChildNodes { get; private set; }
}

这很简单.方法:

/// <summary>
/// Helper class for Node 
/// </summary>
/// <typeparam name="T">Value of a node</typeparam>
public static class NodeHelper
{
    /// <summary>
    /// Converts Directed Acyclic Graph to Tree data structure using recursion.
    /// </summary>
    /// <param name="root">root of DAG</param>
    /// <param name="seenNodes">keep track of child elements to find multiple connections (f.e. A connects with B and C and B also connects with C)</param>
    /// <returns>root node of the tree</returns>
    public static Node<T> DAG2TreeRec<T>(this Node<T> root, HashSet<Node<T>> seenNodes)
    {
        if (root == null)
        {
            throw new ArgumentNullException("root");
        }
        if (seenNodes == null)
        {
            throw new ArgumentNullException("seenNodes");
        }

        var length = root.ChildNodes.Count;
        for (int i = 0; i < length; ++i)
        {
            var node = root.ChildNodes[i];
            if (seenNodes.Contains(node))
            {
                var nodeClone = new Node<T>(node.Value, node.ChildNodes);
                node = nodeClone;
            }
            else
            {
                seenNodes.Add(node);
            }
            DAG2TreeRec(node, seenNodes);
        }
        return root;
    }
    /// <summary>
    /// Converts Directed Acyclic Graph to Tree data structure using explicite stack.
    /// </summary>
    /// <param name="root">root of DAG</param>
    /// <param name="seenNodes">keep track of child elements to find multiple connections (f.e. A connects with B and C and B also connects with C)</param>
    /// <returns>root node of the tree</returns>
    public static Node<T> DAG2Tree<T>(this Node<T> root, HashSet<Node<T>> seenNodes)
    {
        if (root == null)
        {
            throw new ArgumentNullException("root");
        }
        if (seenNodes == null)
        {
            throw new ArgumentNullException("seenNodes");
        }

        var stack = new Stack<Node<T>>();
        stack.Push(root);

        while (stack.Count > 0) 
        {
            var tempNode = stack.Pop();
            var length = tempNode.ChildNodes.Count;
            for (int i = 0; i < length; ++i)
            {
                var node = tempNode.ChildNodes[i];
                if (seenNodes.Contains(node))
                {
                    var nodeClone = new Node<T>(node.Value, node.ChildNodes);
                    node = nodeClone;
                }
                else
                {
                    seenNodes.Add(node);
                }
               stack.Push(node);
            }
        } 
        return root;
    }
}

并测试:

    static void Main(string[] args)
    {
        // Jitter preheat
        Dag2TreeTest();
        Dag2TreeRecTest();

        Console.WriteLine("Running time ");
        Dag2TreeTest();
        Dag2TreeRecTest();

        Console.ReadKey();
    }

    public static void Dag2TreeTest()
    {
        HashSet<Node<int>> hashSet = new HashSet<Node<int>>();

        Node<int> root = BulidDummyDAG();

        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();
        var treeNode = root.DAG2Tree<int>(hashSet);
        stopwatch.Stop();

        Console.WriteLine(string.Format("Dag 2 Tree = {0}ms",stopwatch.ElapsedMilliseconds));

    }

    private static Node<int> BulidDummyDAG()
    {
        Node<int> node2 = new Node<int>(2);
        Node<int> node4 = new Node<int>(4);
        Node<int> node3 = new Node<int>(3);
        Node<int> node5 = new Node<int>(5);
        Node<int> node6 = new Node<int>(6);
        Node<int> node7 = new Node<int>(7);
        Node<int> node8 = new Node<int>(8);
        Node<int> node9 = new Node<int>(9);
        Node<int> node10 = new Node<int>(10);
        Node<int> root  = new Node<int>(1);

        //making DAG                   
        root.ChildNodes.Add(node2);    
        root.ChildNodes.Add(node3);    
        node3.ChildNodes.Add(node2);   
        node3.ChildNodes.Add(node4);   
        root.ChildNodes.Add(node5);    
        node4.ChildNodes.Add(node6);   
        node4.ChildNodes.Add(node7);
        node5.ChildNodes.Add(node8);
        node2.ChildNodes.Add(node9);
        node9.ChildNodes.Add(node8);
        node9.ChildNodes.Add(node10);

        var length = 10000;
        Node<int> tempRoot = node10; 
        for (int i = 0; i < length; i++)
        {
            var nextChildNode = new Node<int>(11 + i);
            tempRoot.ChildNodes.Add(nextChildNode);
            tempRoot = nextChildNode;
        }

        return root;
    }

    public static void Dag2TreeRecTest()
    {
        HashSet<Node<int>> hashSet = new HashSet<Node<int>>();

        Node<int> root = BulidDummyDAG();

        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();
        var treeNode = root.DAG2TreeRec<int>(hashSet);
        stopwatch.Stop();

        Console.WriteLine(string.Format("Dag 2 Tree Rec = {0}ms",stopwatch.ElapsedMilliseconds));
    }

此外,数据结构需要一些改进:

What is more, data structure need some improvment:

  • 覆盖 GetHash、toString、Equals、== 运算符
  • 实现 IComparable
  • LinkedList 可能是更好的选择

此外,在转换之前,需要检查一些特定的东西:

Also, before the conversion there are certian thigs that need to be checked:

  • 多重图
  • 如果是 DAG(循环)
  • DAG 中的钻石
  • DAG 中的多个根

总而言之,它可以归结为几个问题:我怎样才能提高转换率?由于这是一个递归,所以有可能炸毁堆栈.我可以添加堆栈来记住它.如果我做延续传递风格,我会更有效率吗?

All in all, it narrows down to a few questions: How can I improve the conversion? Since this is a recurion it's possible to blow up the stack. I can add stack to memorize it. If I do continuation-passing style, will I be more efficient?

我觉得这种情况下不可变的数据结构会更好.正确吗?

I feel that immutable data structure in this case would be better. Is it correct?

Childs 是正确的名字吗?:)

Is Childs the right name ? :)

推荐答案

算法:

  • 如您所见,某些节点在输出中出现了两次.如果节点 2 有子节点,则整个子树将出现两次.如果你想让每个节点只出现一次,替换

  • As you observed, some nodes appear twice in the output. If the node 2 had children, the whole subtree would appear twice. If you want each node to appear just once, replace

if (hashSet.Contains(node))
{
    var nodeClone = new Node<T>(node.Value, node.Childs);
    node = nodeClone;
}

if (hashSet.Contains(node))
{
    // node already seen -> do nothing
}

  • 我不会太担心堆栈的大小或递归的性能.但是,您可以将深度优先搜索替换为 广度优先搜索导致更早访问更接近根的节点,从而产生更自然"的树(在您的图片中,您已经按照 BFS 顺序对节点进行了编号).

  • I wouldn't be too worried about the size of the stack or performance of recursion. However, you could replace your Depth-first-search with Breadth-first-search which would result in nodes closer to the root being visited earlier, thus yielding a more "natural" tree (in your picture you already numbered the nodes in BFS order).

     var seenNodes = new HashSet<Node>();
     var q = new Queue<Node>();
     q.Enqueue(root);
     seenNodes.Add(root);   
    
     while (q.Count > 0) {
         var node = q.Dequeue();
         foreach (var child in node.Childs) {
             if (!seenNodes.Contains(child )) {
                 seenNodes.Add(child);
                 q.Enqueue(child);
         }
     }
    

    该算法处理菱形和循环.

    The algorithm handles diamonds and cycles.

    多个根

    只需声明一个包含所有顶点的类 Graph

    Just declare a class Graph which will contain all the vertices

    class Graph
    {
        public List<Node> Nodes { get; private set; }
        public Graph()
        {
            Nodes = new List<Node>();
        }    
    }
    

  • 代码:

    • hashSet 可以命名为 seenNodes.

    代替

    var length = root.Childs.Count;
    for (int i = 0; i < length; ++i)
    {
        var node = root.Childs[i];
    

    foreach (var child in root.Childs)
    

  • 在 Traverse 中,访问者是不必要的.你宁愿有一个方法来产生树的所有节点(以相同的顺序遍历)并且用户可以对节点做任何事情:

  • In Traverse, the visitor is quite unnecessary. You could rather have a method which yields all the nodes of the tree (in the same order traverse does) and it is up to user to do whatever with the nodes:

    foreach(var node in root.TraverseRecursive())
    {
        Console.WriteLine(node.Value);
    }
    

  • 如果覆盖 GetHashCode 和 Equals,算法将无法再区分具有相同值的两个不同节点,这可能不是您想要的.

  • If you override GetHashCode and Equals, the algorithm will no more be able to distinguish between two different Nodes with same value, which is probably not what you want.

    除了 List 在添加节点时执行的重新分配(容量 2、4、8、16,...)之外,我不认为 LinkedList 在这里比 List 更好.

    I don't see any reason why LinkedList would be better here than List, except for the reallocations (Capacity 2,4,8,16,...) which List does when adding nodes.

    这篇关于将有向无环图 (DAG) 转换为树的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

  • 查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆