图路径抽象算法所需 [英] Graph Paths Abstraction Algorithm Needed

查看:138
本文介绍了图路径抽象算法所需的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个数据结构保持像在下面的图片图:

I have a data structure holding a graph like the one in the following picture:

在此树中,节点可以具有从它下面的水平的任何数目的独特的儿童。 在树的图片再presents一组路径。 如果每个路径应首先从第1级的节点,并用*标记的节点结束。 所以树的图像中的路径是:

In this tree, a node can have any number of unique children from the levels below it. In tree in the picture represents a set of paths. Where every path should begin with a node from Level 1, and ends with a node of "*" mark. So the paths of the tree in the picture are:

A then C then G

A then C then G then J

A then D then G

A then D then G the J

A then D then K, and so on...

其实我原来的树是巨大的(200万左右的序列)和每级节点的最大数目为61(11级)。因此,它会导致在我的应用程序(SAMSUNG计算机视觉应用)许多内存消耗问题。

Actually my original tree is huge (around 2 Million sequences) and the maximum number of nodes per level is 61 (of 11 levels). So it causes many memory consumption problems in my application (a computer vision application for SAMSUNG).

我的目标是在一种迭代算法重新$ P $以更紧凑的字符串格式psents这些路径。所以我想我们的问题分为三个步骤如下。我已经建立了树数据结构(步骤2),但仍无法得出一个迭代算法,得到输出字符串/序列中的第3步从树上

My target is to have an iterative algorithm that represents these paths in a more compact string format. So I think we the problem is divided into three steps as follows. I have built the tree data structure (step 2), but still can not derive an iterative algorithm that gets the output string/sequence in step 3 from the tree.

(A C G)| (A C G j)| (A D G)| (A D G j)| (A D K)| ......

在哪里|再presents的替代品。

Where "|" represents alternatives.

(A(C G [J])|(D(G [J])| K))| (B ....)

哪里哪里|再presents替代品和[]包围选项。目标输出字符串应该优化像有没有可以采取更简化更常见的因素。

Where where "|" represents alternatives and "[ ]" encloses options. The target output string should be optimized like there are not more common factors that can be taken to more simplify it.

推荐答案

您可以使用迭代DFS的改进,它利用堆栈跟踪未处理的节点。这个算法从不存储栈 * 为任何一个节点上多于6个字符,并且总有比在栈上N个节点(其中N是节点的图中的数)少。你已经表明,N值将最多61 * 11 = 671,所以会有一个最高约4000元可能在堆栈中。

You can use a modification of iterative DFS, which utilizes a stack to keep track of unprocessed nodes. This algorithm never stores more than 6 characters on the stack* for any one node, and there are always fewer than N nodes on the stack (where N is the number of nodes in the graph). You've indicated that N will be at most 61*11=671, so there will be a maximum of about 4000 elements possible on the stack.

在下面的伪code,A目的地节点是在例如一个星号的节点之上,例如摹 *

In the pseudocode below, a "destination" node is a starred node in the example above, e.g. G*.

初​​始化:

一个虚拟节点披;引入与和披边缘;给每一个根节点,例如节点A和B以上。该令牌&披;被认为是一个非打印字符,或者您也可以将其添加到输出字符串前显式检查。该节点披;在调用之前被压入堆栈。

A dummy node Φ is introduced with an edge from Φ to each of the "root" nodes, e.g. nodes A and B above. The token for Φ is assumed to be a non-printing character, or you can explicitly check before adding it to the output string. The node Φ is pushed onto the stack before calling the function.

outString := ""
while stack not empty
   pop token
   if token is node
      outString := outString + node(token)  // Line 5 - explanation below
      if node(token) has children
         if node(token) is destination
            outString := outString + "["
            push "]"
         end
         if node(token) has multiple children
            for each child of node(token), from right to left
               push ")"
               push child
               push "("
               push "|"
            end
            pop // remove last "|"
         else
            push child
         end
      end

   else // token is ()[]|
      outString := outString + token
   end
end

这个算法为你的图(A及其子女)的第一部分的输出是(与多余的空格增加了清晰度;空间可以轻松地添加到code):

The output of this algorithm for the first part of your graph (A and its children) is (with extra spaces added for clarity; the spaces can be easily added to the code):

A (C G [J]) | (D (G [J]) | (K))

您会注意到你的结果和我之间的偏差:最终节点K​​被封在我的解决方案括号内。如果这是不可取的(这可能导致丑像 A [(B)|(C)] ),你可以通过执行额外的检查,当你弹出消除节点在一些额外的开销成本令牌从堆栈的。只需更换5号线上面:

You'll notice a deviation between your result and mine: the final node K is enclosed in parentheses in my solution. If this is undesirable (it could result in ugliness like A[(B)|(C)]), you can eliminate it by performing an additional check when you pop a node token off of the stack at the cost of some additional overhead. Simply replace Line 5 above with:

if (node(token) has no children
    AND last character of outString is "("
    AND next token on stack is ")")
      remove trailing "(" from outString
      concatenate token to outString
      pop ")" from stack and ignore
else
   outString := outString + node(token) // as above
end

让我知道如果您有任何疑问或我已经错过了什么。

Let me know if you have any questions or I've missed anything.

* 这将在一个节点(可能是极不可能)的情况下发生的被写成 | [(A)] 。大多数节点将占用4个或更少的字符在堆栈中。

* This will happen in the (probably highly unlikely) case of a node being written as |[(A)]. Most nodes will take up 4 or fewer characters in the stack.

这篇关于图路径抽象算法所需的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆