有向无环图中所有节点的可达性计数 [英] Reachability Count for all nodes in a Directed Acyclic Graph

查看:114
本文介绍了有向无环图中所有节点的可达性计数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

因此,在Hackerrank上的一个名为无环图的编程竞赛中遇到了这个挑战,这基本上归结为计算定向非循环图中每个节点可到达的节点数。例如,假设你有这样一个图:

  [1] ----> [2] --- > [4] ---> [5] 
[3] ------ /

可达性计数(包括原始节点):


 节点1:4 
节点2:3
节点3:4
节点4:2
节点5:1

我的方法是深度优先穿越记忆。看着周围很多,但似乎运行时间不能进一步改善,因为在这种情况下发生的过度计数:

  [1] ----> [2] ---> [4] ---> [5] 
[3] ------ / ------- /

第三个节点会统计第四个节点,即使第二个节点节点已经计入第四个节点。为了让事情变得更糟,我只用JavaScript解决了这些挑战。它是我的主要语言,我从推动其界限中获得快感。领导委员会中没有人用JavaScript解决它,但我认为这是可能的。比赛结束后,我设法通过24个测试用例中的13个用下面的代码:

 函数解决方案(图形,节点) {

var memory = new Array(nodes + 1)
,result = 0; (a,v)=> DepthFirstSearch(graph,v,memory));

graph.forEach

//挑战要求输出变化,但每个节点的准确
//可达性计数将包含在d.length中。 (d,i)=> {if(i&(2 * d.length)> =节点)result ++;});

返回结果;


函数DepthFirstSearch(graph,v,memory){

if(memory [v])return memory [v];

var descendants = new Uint16Array([v]);

graph [v] .forEach(u => {

descendants = MergeTypedArrays(
DepthFirstSearch(graph,u,memory),
descendants
);
});
//使元素唯一
//避免重复计算
返回内存[v] = Uint16Array.from(new Set(descendants));


函数MergeTypedArrays(a,b){

var c = new a.constructor(a.length + b.length);

c.set(a);
c.set(b,a.length);

return c;
}

//邻接列表
var graph = [
[],// 0
[2],// 1
[4],// 2
[2],// 3
[5],// 4
[] // 5
];

var nodes = 5;

解决方案(图形,节点);

对于大于50kb的所有输入,大概输入一组节点和边50,000个节点和40,000个边缘)。无法识别或设想更快,更高效的内存算法,我完全丧失了接下来要尝试的内容。思考如何使DFS迭代,但我认为记忆数千个数组的内存消耗会使这个问题变得更加严重,这似乎是主要问题。我在Hackerrank上获得了11次失败的测试(与超时相反)的Abort Called和Runtime Error。还尝试过使用union的bitSets,但由于bitSets数组需要足够大以存储多达50,000个数字,因此内存消耗变得更糟。



约束:

  1≤n,m≤5×10 ^ 4 
1≤a(i),b( i)≤n且a(i)≠b(i)
保证图G不包含循环。

只是想澄清一下,自从这个挑战以来,我无法通过所有测试被锁定,这是为了教育目的,主要是优化。我意识到相关的SO帖子指向拓扑排序,但据我所知,拓扑排序仍然会超过上述情况,因此不是一个可行的解决方案。如果我误解了,请赐教。预先感谢您的时间。



问题:如何进一步优化?是否有更高效的方法?

解决方案

深度优先搜索(DFS)是解决此问题的好方法。另一种方式是广度优先搜索(BFS),它也可以并行运行,并且可以很好地进行优化 - 但所有代价都要高得多的代码复杂度。所以我的建议是坚持DFS。



首先,我必须实现apoligize,但是我的JavaScript技能不是很好(即它们不存在),所以我的解决方案如下使用Java,但想法应该很容易移植。



您最初的问题是缺少一个非常重要的细节:我们只需要找到所有可达节点数大于或等于 | V | / 2



为什么那么重要?计算每个节点的可达节点数量非常昂贵,因为我们必须从图中的每个节点开始执行DFS或BFS。但是,如果我们只需要找到具有上述属性的节点,就容易得多。



后继(n)成为可到达的所有节点 n ancestor(n)是可以到达 n 的所有节点。
我们可以使用以下观察来大幅减少搜索空间:
$ b $ ul

  • 如果从 n可到达的节点数小于 | V |如果从 n 可到达的节点的数量,那么后继(n)中的节点可以具有更大的数量
  • em>大于或等于 | V | / 2 ,那么祖先(n)中的所有节点将具有更大数量


    我们可以使用它吗?


    1. 创建图形时,还要创建转置图形。这意味着当存储边a-> b时,您将b-> a存储在转置图中。
    2. 创建一个数组,存储要忽略的节点,并用<$ c $初始化它c> false

    3. 实现一个基于DFS的函数,该函数确定给定节点是否有多个可达节点> = | V | / b

    4. 在该函数中,忽略标记为忽略的节点

    5. 如果对于节点 n 节点的数量小于 | V | / 2 时,将后继(n)中的所有节点标记为忽略

    6. 否则统计祖先(n)中的所有节点并将它们标记为忽略

    使用迭代DFS的解决方案

      public int countReachable(int root,boolean [] visited,boolean [] ignored,Graph graph){
    if(ignored [root]){
    返回0;
    }

    Stack< Integer> stack = new Stack<>();
    stack.push(root);

    int count = 0;
    while(stack.empty()== false){
    int node = stack.pop();
    if(visited [node] == false){
    count ++;
    visited [node] = true;
    for(int neighbor:graph.getNeighbors(node)){
    if(visited [neighbor] == false){
    stack.push(neighbor);



    $ b if(count * 2> = graph.numNodes()){
    return markAndCountAncestors(root,visited,忽略,图);
    } else {
    return markSuccessors(root,visited,ignored,graph);


    $ / code $ / pre

    标记祖先的功能



    这只是另一个DFS,但使用了转置图。请注意,我们可以重用 visited 数组,因为我们将使用的所有值都是 false ,因为这是一个非循环图。

    pre $ public int markAndCountAncestors(int root,boolean [] visited,boolean [] ignored,Graph graph){
    Stack< ;整数> stack = new Stack<>();
    stack.push(root);
    visited [root] = false;

    int count = 0;
    while(stack.empty()== false){
    int node = stack.pop();
    if(visited [node] == false&&&&& ignored [node] == false){
    count ++;
    visited [node] = true;
    ignored [node] = true;
    for(int neighbor:graph.transposed.getNeighbors(node)){
    if(visited [neighbor] == false&&&& ignored [node] == false){
    stack .push(邻居);
    }
    }
    }
    }
    return count;
    }

    标记后继的功能



    请注意,我们已经有了继承者,因为它们只是我们将 visited 设置为true的节点。

      public int markSuccessors(int root,boolean [] visited,boolean [] ignored,Graph graph){
    for(int node = 0;节点< graph.numNodes(); node ++){
    if(visited [node]){
    ignored [node] = true;
    }
    }
    返回0;
    }

    计算结果的函数

      public void solve(Graph graph){
    int count = 0;
    boolean [] visited = new boolean [graph.numNodes()];
    boolean [] ignored = new boolean [graph.numNodes()];
    for(int node = 0; node< graph.numNodes(); node ++){
    Arrays.fill(visited,false); //重置访问数组
    count + = countReachable(节点,访问,忽略,图);
    }
    System.out.println(Result:+ count);
    }

    在您发布的大型测试用例中,这对我来说运行时间为7.5秒。如果你反转迭代次序(例如,在 solve 中,你从最大的节点id开始),它会下降到4秒,但这有点像作弊^^


    So there was this challenge on a programming contest on Hackerrank called "Acyclic Graph", which basically boils down to counting the number of nodes reachable from every node in a "Directed Acyclic Graph". For example, say you have a graph like so:

    [ 1 ] ---->[ 2 ]--->[ 4 ]--->[ 5 ]
    [ 3 ] ------/
    

    Reachability count (including origin node):

    Node 1: 4
    Node 2: 3
    Node 3: 4
    Node 4: 2
    Node 5: 1
    

    My approach was a "Depth First" traversal with memoization. Looked around quite a bit, but it seems as though the run time can't be improved much further because of the over counting that occurs in cases like so:

    [ 1 ] ---->[ 2 ]--->[ 4 ]--->[ 5 ]
    [ 3 ] ------/--------/
    

    The third node would count the fourth node, even though the second node already counted the fourth node. To make things a bit worse, I only solve these challenges in JavaScript. Its my primary language and I get a thrill from pushing its boundaries. No one on the leader board has solved it in JavaScript yet, but I presume it's possible. After the contest, I managed to pass 13 out of 24 test cases with the following code:

    function Solution( graph, nodes ) {
    
        var memory = new Array( nodes + 1 )
          , result = 0;
    
        graph.forEach( ( a, v ) => DepthFirstSearch( graph, v, memory ) );
    
        // challenge asks for an output variation, but the accurate 
        // reachability count of every node will be contained in "d.length".
        memory.forEach( ( d, i ) => { if ( i && ( 2 * d.length ) >= nodes ) result++; } );
    
        return result;
    }
    
    function DepthFirstSearch( graph, v, memory ) {
    
        if ( memory[ v ] ) return memory[ v ];
    
        var descendants = new Uint16Array( [ v ] );
    
        graph[ v ].forEach( u => {
    
            descendants = MergeTypedArrays( 
                DepthFirstSearch( graph, u, memory ),  
                descendants
            );
        } );
                                               // make elements unique
                                               // to avoid over counting
        return memory[ v ] = Uint16Array.from( new Set( descendants ) ); 
    }
    
    function MergeTypedArrays(a, b) {
    
        var c = new a.constructor( a.length + b.length );
    
        c.set( a );
        c.set( b, a.length );
    
        return c;
    }
    
    // adjacency list
    var graph = [ 
        [],    // 0
        [ 2 ], // 1
        [ 4 ], // 2
        [ 2 ], // 3
        [ 5 ], // 4
        []     // 5
    ];
    
    var nodes = 5;
    
    Solution( graph, nodes );
    

    It fails for all inputs greater than 50kb, presumably inputs with a large set of nodes and edges (i.e. 50,000 nodes and 40,000 edges). Failing to identify or conceive a faster, more memory efficient algorithm, I'm at a total loss as to what to try next. Thought about making the DFS iterative, but I'm thinking the memory consumption of memoizing thousands of arrays will dwarf that, which seems to be the main problem. I get "Abort Called" and "Runtime Error" on Hackerrank for the 11 tests that fail (as oppose to "Timeout"). Also tried "bitSets" with "union", but the memory consumption turned out to be worse since the bitSets arrays need to be large enough to store numbers up to 50,000.

    Constraints:

    1 ≤ n,m ≤ 5×10^4
    1 ≤ a(i),b(i) ≤ n and a(i) ≠ b(i)
    It is guaranteed that graph G does not contain cycles.
    

    Just want to make it clear that I won't get any points for passing all tests since this challenge is locked, this is for educational purposes, mainly on optimization. I'm aware of related SO posts that point to topological sort, but as far as I understand, topological sort will still over count on cases like the one described above, thus not a viable solution. If I misunderstood, please enlighten me. Thank you in advance for your time.

    Question: How can I optimize this further? Is there a more efficient approach?

    解决方案

    Depth-First Search (DFS) is one good way of solving this problem. Another way would be Breadth-First Search (BFS) which cal also run in parallel and can be optimized very well - but all at the cost of much higher code complexity. So my recommendation would be to stick to DFS.

    First I have to apoligize, but my JavaScript skills are not very good (i.e. they are non existent) so my solutions below are using Java but the ideas should be easy to port.

    Your initial question is missing one very important detail: We only need to find all nodes where the number of reachable nodes is larger or equal than |V| / 2

    Why does that matter? Computing the number of reachable nodes for each node is expensive as we have to do a DFS or BFS starting from every node in the graph. But if we only need to find nodes with the above property, that is much easier.

    Let successors(n) be all nodes reachable from n and ancestor(n) be all nodes that can reach n. We can use the following observations to drasticly reduce the search space:

    • if the number of nodes reachable from n is smaller than |V| / 2 then no node in successors(n) can have a larger number
    • if the number of nodes reachable from n is greater or equal than |V| / 2 then all nodes in ancestors(n) will have a larger number

    How can we use that?

    1. When creating your graph, also create the transposed graph. That means when storing an edge a->b, you store b->a in the transposed graph.
    2. Create an array that stores which nodes to ignore, initialize it with false
    3. Implement a DFS based function that determines whether a given node has a number of reachable nodes >= |V| / 2 (see below)
    4. In that function, ignore nodes that are marked as ignored
    5. If for node n the number of nodes is smaller than |V| / 2, mark all nodes in successors(n) as ignored
    6. Else count all nodes in ancestors(n) and mark them as ignored

    Solution using Iterative DFS

    public int countReachable(int root, boolean[] visited, boolean[] ignored, Graph graph) {
        if (ignored[root]) {
            return 0;
        }
    
        Stack<Integer> stack = new Stack<>();
        stack.push(root);
    
        int count = 0;
        while (stack.empty() == false) {
            int node = stack.pop();
            if (visited[node] == false) {
                count++;
                visited[node] = true;
                for (int neighbor : graph.getNeighbors(node)) {
                    if (visited[neighbor] == false) {
                        stack.push(neighbor);
                    }
                }
            }
        }
        if (count * 2 >= graph.numNodes()) {
            return markAndCountAncestors(root, visited, ignored, graph);
        } else {
            return markSuccessors(root, visited, ignored, graph);
        }
    }
    

    Function to mark the Ancestors

    This is just another DFS but using the transposed graph. Note that we can reuse the visited array as all values that we will use are false since this is an acyclic graph.

    public int markAndCountAncestors(int root, boolean[] visited, boolean[] ignored, Graph graph) {   
        Stack<Integer> stack = new Stack<>();
        stack.push(root);
        visited[root] = false;
    
        int count = 0;
        while (stack.empty() == false) {
            int node = stack.pop();
            if (visited[node] == false && ignored[node] == false) {
                count++;
                visited[node] = true;
                ignored[node] = true;
                for (int neighbor : graph.transposed.getNeighbors(node)) {
                    if (visited[neighbor] == false && ignored[node] == false) {
                        stack.push(neighbor);
                    }
                }
            }
        }
        return count;
    }
    

    Function to mark the Successors

    Note that we already have the successors since they are just the nodes where we set visited to true.

    public int markSuccessors(int root, boolean[] visited, boolean[] ignored, Graph graph) {
        for(int node = 0; node < graph.numNodes(); node++) {
            if (visited[node)) {
                ignored[node] = true;
            }
        }
        return 0;
    }
    

    Function to compute the result

    public void solve(Graph graph) {
        int count = 0;
        boolean[] visited = new boolean[graph.numNodes()];
        boolean[] ignored = new boolean[graph.numNodes()];
        for (int node = 0; node < graph.numNodes(); node++) {
            Arrays.fill(visited, false); // reset visited array
            count += countReachable(node, visited, ignored, graph);
        }
        System.out.println("Result: " + count);
    }
    

    On the large test-case you posted, this runs in 7.5 seconds for me. If you invert the iteration order (i.e. in solve you start with the largest node id) it goes down to 4 seconds, but that somewhat feels like cheating ^^

    这篇关于有向无环图中所有节点的可达性计数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆