以Newick格式懒地打印树 [英] Printing a tree lazily in Newick format

查看:207
本文介绍了以Newick格式懒地打印树的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我希望以 Newick格式打印二叉树,显示每个节点与父节点的距离。目前我没有遇到下面的代码,它使用常规递归,但树太深可能会产生堆栈溢出的问题。

 (defn tree-> newick 
[tree]
(let [{:keys [id子树to-parent]} tree
dist(double to-parent)];到父母可能是理性的
(如果孩子
(str((tree-> newick第一个孩子))
,(tree-> newick(second children))
):dist)
(str(name id):dist)

(def example {:id nil:to-parent 0.0
:children [{:id nil:to-parent 0.5
:children [{:idA:到父亲0.3:children nil}
{:idB:to-parent 0.2:children nil}]}
{:idC:to-parent 0.8:children nil}]} )

(tree-> newick example)
; => ((A:0.3,B:0.2):0.5,C:0.8):0.0

(def linear-tree( - > 0.1:children nil}
(iterate#(hash-map:id nil:to-parent 0.1
:children [%{:idside:to parent 0.1:children nil}])
(take 10000)
last))

(tree-> newick linear-tree)
; => StackOverflowError

使用当前实用程序找到的问题,例如 seq clojure.walk 是我不得不访问一个内部节点多次,插入逗号和关闭括号。我使用 clojure.zip ,但没有设法写一个延迟/尾递归实现,因为我需要存储每个内部节点多少次

解决方案

以下是一个适用于您的线性树示例。这是一个直接转换您的实现与两个变化:它使用继续传递风格和蹦床。

 (defn tree-& newick 
([tree]
(trampoline tree-> newick tree identity))
([tree cont]
(let [{:keys [id children to-parent] } tree
dist(double to-parent)]; to-parent可能是一个理性的
(如果孩子
(fn []
(tree-> newick
(第一个孩子)
(fn [s1](fn []
(tree-> newick
(s1,s2):dist)))))))))
(cont(str(name id):dist))))



编辑:添加模式匹配,允许以简单的方式调用函数。



编辑2 :我注意到我犯了错误。问题是,我没有考虑到Clojure没有优化尾部调用只是部分考虑。



我的解决方案的核心思想是转换为连续传递风格所以递归调用可以移动到尾部位置(即,不是返回它们的结果,递归调用将其作为参数传递给延续)。



然后我手动优化通过使他们使用蹦床的递归调用。我忘记考虑的是,连续的调用 - 它不是递归调用,但也在尾部位置 - 也需要优化,因为尾调用可以是一个非常长的闭包链,所以当函数最终

这个问题没有用测试数据实现 linear-tree ,因为第一个孩子的延续返回到trampoline来处理第二个孩子的递归调用。但是如果 linear-tree 发生改变,它使用每个节点的第二个子节点来构建线性树而不是第一个子节点,这又会导致堆栈溢出。 / p>

因此,连续调用也需要返回蹦床。 (实际上,在没有孩子基础的情况下的调用没有,因为它将在返回到蹦床之前最多发生一次,然后对于第二个递归调用也是如此)。因此,这里有一个实现,并且应该在所有输入上仅使用常量堆栈空间:

 (defn tree-> newick 

(trampoline tree-> newick tree identity))
([tree cont]
(let [{:keys [id children to-parent]} tree
dist to-parent)];到父母可以是理性的
(如果孩子
(fn [](tree-> newick
(first children)
(fn [s1 ](...)()()()();)))))))$($)$($ s $) b $ b(cont(str(name id):dist))))))


I wish to print a binary tree in Newick format, showing each node's distance to its parent. At the moment I haven't had an issue with the following code, which uses regular recursion, but a tree too deep may produce a stack overflow.

(defn tree->newick
  [tree]
  (let [{:keys [id children to-parent]} tree
        dist (double to-parent)] ; to-parent may be a rational
    (if children
      (str "(" (tree->newick (first children)) 
           "," (tree->newick (second children)) 
           "):" dist)
      (str (name id) ":" dist))))

(def example {:id nil :to-parent 0.0 
              :children [{:id nil :to-parent 0.5 
                          :children [{:id "A" :to-parent 0.3 :children nil}
                                     {:id "B" :to-parent 0.2 :children nil}]}
                         {:id "C" :to-parent 0.8 :children nil}]})

(tree->newick example)
;=> "((A:0.3,B:0.2):0.5,C:0.8):0.0"

(def linear-tree (->> {:id "bottom" :to-parent 0.1 :children nil}
                   (iterate #(hash-map :id nil :to-parent 0.1 
                                       :children [% {:id "side" :to-parent 0.1 :children nil}]))
                   (take 10000)
                   last))

(tree->newick linear-tree)
;=> StackOverflowError

The problem I've found with current utilities, such as tree-seq and clojure.walk, is that I have to visit an inner node more than once, to interpose the comma and close the bracket. I've used clojure.zip, but didn't manage to write a lazy/tail-recursive implementation, as I would need to store for each inner node how many times they were already visited.

解决方案

Here's a version that works on your linear-tree example. It's a direct conversion of your implementation with two changes: it uses continuation passing style and the trampoline.

(defn tree->newick
  ([tree]
     (trampoline tree->newick tree identity))
  ([tree cont]
     (let [{:keys [id children to-parent]} tree
           dist (double to-parent)]     ; to-parent may be a rational
       (if children
         (fn []
           (tree->newick
            (first children)
            (fn [s1] (fn []
                       (tree->newick
                        (second children)
                        (fn [s2] (cont (str "(" s1 "," s2 "):" dist))))))))
         (cont (str (name id) ":" dist))))))

Edit: added pattern matching to allow calling the function in a simple way.

Edit 2: I noticed that I made mistake. The problem is that I did take the fact that Clojure doesn't optimize tail calls only partially into account.

The core idea of my solution is the transformation into continuation passing style so the recursive calls can be moved into tail position (i.e. instead of returning their result, the recursive calls pass it to the continuation as argument).

Then I hand-optimized the recursive calls by making them use the trampoline. What I forgot to consider is that the calls of the continuations - which are not recursive calls but also in tail position - need to be optimized, too, because the tail calls can be a very long chain of closures, so that when the function finally evaluates them, it becomes a long chain of calls.

This problem did not materialize with the the test data linear-tree because the continuation for the first child returns to the trampoline to process recursive call for the second child. But if linear-tree is changed so that it uses the second child of every node to build the linear tree instead of the first child, this does again cause a stack overflow.

So the calls of the continuations need to return to the trampoline, too. (Actually, the call in the no children base case does not, because it will happen at most once before returning to trampoline, and the same will then be true for the second recursive call.) So here's an implementation that does take this into consideration and should use only constant stack space on all inputs:

(defn tree->newick
  ([tree]
     (trampoline tree->newick tree identity))
  ([tree cont]
     (let [{:keys [id children to-parent]} tree
           dist (double to-parent)]     ; to-parent may be a rational
       (if children
         (fn [] (tree->newick
                 (first children)
                 (fn [s1] (tree->newick
                           (second children)
                           (fn [s2] #(cont (str "(" s1 "," s2 "):" dist)))))))
         (cont (str (name id) ":" dist))))))

这篇关于以Newick格式懒地打印树的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆