Haskell:二叉树深度的尾递归版本 [英] Haskell: tail recursion version of depth of binary tree

查看:19
本文介绍了Haskell:二叉树深度的尾递归版本的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

首先,我有两种不同的实现,我认为它们是正确的,并且对它们进行了分析并认为它们具有相同的性能:

First of all I have two different implementation that I believe are correct, and have profiled them and thinking they are about of the same performance:

depth::Tree a -> Int
depth Empty        = 0
depth (Branch b l r) = 1 + max (depth l) (depth r)


depthTailRec::Tree a -> Int
depthTailRec = depthTR 0 where
           depthTR d Empty          = d 
           depthTR d (Branch b l r) = let dl = depthTR (d+1) l; dr = depthTR (d+1) r in max dl dr 

我只是想知道人们不是在谈论尾递归如何对性能有益吗?很多问题都在我脑海中浮现:

I was just wondering aren't people are talking about how tail recursion can be beneficial for performance? And a lot of questions are jumping into my head:

  1. 如何使深度函数更快?
  2. 我读过一些关于 Haskell 的懒惰如何减少尾递归需求的文章,这是真的吗?
  3. 每次递归都可以转化为尾递归是真的吗?
  4. 最后,尾递归可以更快、更节省空间,因为它可以变成循环,从而减少推入和弹出堆栈的需要,我的理解对吗?

推荐答案

1.为什么你的函数不是尾递归的?

要使递归函数成为尾递归,所有递归调用都必须在尾位置.如果一个函数在函数返回前最后被调用,则该函数处于尾部位置.在你的第一个例子中,你有

1. Why isn't your function tail recursive?

For a recursive function to be tail recursive, all the recursive calls must be in tail position. A function is in tail position if it is the last thing to be called before the function returns. In your first example you have

depth (Branch _ l r) = 1 + max (depth l) (depth r)

相当于

depth (Branch _ l r) = (+) 1 (max (depth l) (depth r))

在函数返回之前调用的最后一个函数是(+),所以这不是尾递归.在你的第二个例子中,你有

The last function called before the function returns is (+), so this is not tail recursive. In your second example you have

depthTR d (Branch _ l r) = let dl = depthTR (d+1) l
                               dr = depthTR (d+1) r
                            in max dl dr

这相当于(一旦你重新整理了所有的 let 语句)并重新排列了一下

which is equivalent to (once you've re-lambdified all the let statements) and re-arranged a bit

depthTR d (Branch _ l r) = max (depthTR (d+1) r) (depthTR (d+1) l)

现在返回之前调用的最后一个函数是max,这意味着这也不是尾递归.

Now the last function called before returning is max, which means that this is not tail recursive either.

您可以使用继续传递风格来创建尾递归函数.不是重新编写函数以获取状态或累加器,而是传入一个函数(称为 continuation),该函数是关于如何处理计算值的指令 - 即而不是立即返回给调用者,您将计算出的任何值传递给延续.这是将任何函数转换为尾递归函数的简单技巧——即使是需要多次调用自身的函数,如 depth 所做的那样.看起来像这样

You can make a tail recursive function using continuation-passing style. Instead of re-writing your function to take a state or an accumulator, you pass in a function (called the continuation) that is an instruction for what to do with the value computed -- i.e. instead of immediately returning to the caller, you pass whatever value you have computed to the continuation. It's an easy trick for turning any function into a tail-recursive function -- even functions that need to call themselves multiple times, as depth does. It looks something like this

depth t = go t id
 where
   go  Empty         k = k 0
   go (Branch _ l r) k = go l $ dl ->
                           go r $ dr ->
                             k (1 + max dl dr)

现在你看到 go 在返回之前调用的最后一个函数本身就是 go,所以这个函数是尾递归的.

Now you see that the last function called in go before it returns is itself go, so this function is tail recursive.

(注意这部分来自 上一个问题.)

(NB this section draws from the answers to this previous question.)

不!这个伎俩"只会将问题推回到其他地方.代替使用大量堆栈空间的非尾递归函数,我们现在有一个尾递归函数,它吃掉 thunk(未应用的函数),这些函数本身可能会占用大量空间.幸运的是,我们不需要使用任意函数——实际上只有三种

No! This "trick" only pushes the problem back somewhere else. Instead of a non-tail recursive function that uses lots of stack space, we now have a tail-recursive function that eats thunks (unapplied functions) which could potentially be taking up a lot of space themselves. Fortunately, we don't need to work with arbitrary functions - in fact, there are only three kinds

  1. <代码>dl ->go r (dr -> k (1 + max dl dr))(使用自由变量 rk)
  2. <代码>dr ->k (1 + max dl dr)(带有自由变量 kdl)
  3. id(没有自由变量)
  1. dl -> go r (dr -> k (1 + max dl dr)) (which uses the free variables r and k)
  2. dr -> k (1 + max dl dr) (with free variables k and dl)
  3. id (with no free variables)

由于函数数量有限,我们可以将它们表示为数据

Since there are only a finite number of functions, we can represent them as data

data Fun a = FunL (Tree a) (Fun a)  -- the fields are 'r' and 'k'
           | FunR  Int     (Fun a)  -- the fields are 'dl' and 'k'
           | FunId

我们还必须编写一个函数 eval,它告诉我们如何评估这些函数";在特定的论点.现在您可以将函数重写为

We'll have to write a function eval as well, which tells us how to evaluate these "functions" at particular arguments. Now you can re-write the function as

depth t = go t FunId
 where
  go  Empty         k = eval k 0
  go (Branch _ l r) k = go l (FunL r k)

  eval (FunL r  k) d = go r (FunR d k)
  eval (FunR dl k) d = eval k (1 + max dl d)
  eval (FunId)     d = d

请注意,goeval 都在尾部位置调用了 goeval —— 因此它们是一对相互尾递归函数.因此,我们将使用 continuation-passing 风格的函数版本转换为使用数据表示 continuation 的函数,并使用一对相互递归的函数来解释该数据.

Note that both go and eval have calls to either go or eval in tail position -- therefore they are a pair of mutually tail recursive functions. So we've transformed the version of the function that used continuation-passing style into a function that uses data to represent continuations, and uses a pair of mutually recursive functions to interpret that data.

嗯,我想是的.可是等等!我们可以简化它!如果您查看 Fun a 数据类型,您会发现它实际上只是一个列表,其中每个元素都是我们要计算的 Tree a的深度,或者它是一个 Int 表示我们迄今为止计算的深度.

Well, I guess it is. But wait! We can simplify it! If you look at the Fun a data type, you'll see that it's actually just a list, where each element is either a Tree a that we're going to compute the depth of, or it's an Int representing a depth that we've computed so far.

注意到这一点有什么好处?嗯,这个列表实际上代表了上一节的延续链的调用堆栈.将新项目推入列表就是将新参数推入调用堆栈!所以你可以写

What's the benefit of noticing this? Well, this list actually represents the call stack of the chain of continuations from the previous section. Pushing a new item onto the list is pushing a new argument onto the call stack! So you could write

depth t = go t []
 where
  go  Empty         k = eval k 0
  go (Branch _ l r) k = go l (Left r : k)

  eval (Left r   : k) d = go   r (Right d : k)
  eval (Right dl : k) d = eval k (1 + max dl d)
  eval  []            d = d

您推入调用堆栈的每个新参数都是 Either (Tree a) Int 类型,并且随着函数的递归,它们不断将新参数推入堆栈,这些新树要么是被探索(每当 go 被调用时)或迄今为止找到的最大深度(每当 eval 被调用时).

Each new argument you push onto the call stack is of type Either (Tree a) Int, and as the functions recurse, they keep pushing new arguments onto the stack, which are either new trees to be explored (whenever go is called) or the maximum depth found so far (whenever eval is called).

这个调用策略代表了树的深度优先遍历,正如你所看到的,左边的树总是被 go 首先探索,而右边的树总是被推到调用堆栈稍后探讨.只有当到达 Empty 分支并且可以丢弃时,参数才会从调用堆栈中弹出(在 eval 中).

This call strategy represents a depth-first traversal of the tree, as you can see by the fact that the left tree is always explored first by go, while the right tree is always pushed onto the call stack to be explored later. Arguments are only ever popped off the call stack (in eval) when an Empty branch has been reached and can be discarded.

好吧,一旦您注意到可以将延续传递算法变成模拟调用堆栈并先遍历树深度的版本,您可能会开始怀疑是否有更简单的算法先遍历树深度,跟踪目前遇到的最大深度.

Well, once you've noticed that you can turn the continuation-passing algorithm into a version that mimics the call stack and traverses the tree depth first, you might start to wonder whether there's a simpler algorithm that traverses the tree depth first, keeping track of the maximum depth encountered so far.

确实有.诀窍是保留您尚未探索的分支列表及其深度,并跟踪您迄今为止看到的最大深度.看起来像这样

And indeed, there is. The trick is to keep a list of branches that you haven't yet explored, together with their depths, and keep track of the maximum depth you've seen so far. It looks like this

depth t = go 0 [(0,t)]
 where
  go depth  []    = depth
  go depth (t:ts) = case t of
    (d, Empty)        -> go (max depth d)  ts
    (d, Branch _ l r) -> go (max depth d) ((d+1,l):(d+1,r):ts)

我认为这与我可以在确保它是尾递归的约束范围内使这个函数一样简单.

I think that's about as simple as I can make this function within the constraints of ensuring that it's tail-recursive.

老实说,您的原始非尾递归版本可能没问题.新版本的空间效率并没有提高(它们总是必须存储您接下来要处理的树的列表),但它们确实具有将接下来要处理的树存储在堆上而不是堆上的优势堆栈 - 堆上还有更多空间.

To be honest, your original, non tail-recursive version is probably fine. The new versions aren't any more space efficient (they always have to store the list of trees that you're going to process next) but they do have the advantage of storing the trees to be processed next on the heap, rather than on the stack - and there's lots more space on the heap.

您可能想查看 Ingo 答案中的部分尾递归函数,这在您的树极度不平衡的情况下会有所帮助.

You might want to look at the partially tail-recursive function in Ingo's answer, which will help in the case when your trees are extremely unbalanced.

这篇关于Haskell:二叉树深度的尾递归版本的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆