Haskell:二叉树深度的递归版本 [英] Haskell: tail recursion version of depth of binary tree

查看:189
本文介绍了Haskell:二叉树深度的递归版本的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

首先,我有两种不同的实现方式,我认为它们是正确的,并且已经对它们进行了描述,并认为它们具有相同的性能:

  depth:Tree a  - > Int b $ b深度Empty = 0 
depth(Branch b l r)= 1 + max(depth l)(depth r)


depthTailRec :: Tree a - > Int
depthTailRec = depthTR 0其中
depthTR d Empty = d
depthTR d(Branch b l r)= let dl = depthTR(d + 1)l; dr = depthTR(d + 1)r in max dl dr

我只是想知道是不是人是在谈论尾部递归如何有利于性能?还有很多问题都在我脑海中浮现:


  1. 如何让深度函数更快?
  2. 我读了关于Haskell的懒惰如何减少尾递归需求的内容,这是真的吗?

  3. 是否每个递归都可以转换为尾递归? li>
  4. 最后,尾递归可以更快,更节省空间,因为它可以变成循环,因此减少了推动和弹出栈的需求,我的理解是正确的吗?


解决方案

1。为什么不是函数尾递归?



对于递归函数是尾递归,所有递归调用必须位于尾部位置。如果函数返回前最后一次被调用,则该函数处于尾部位置。在你的第一个例子中,你有

  depth(Branch _ lr)= 1 + max(depth l)(depth r)

相当于

 深度(Branch_lr)=(+)1(max(深度l)(深度r))

,所以你可以看到函数返回前调用的最后一个函数是(+),所以这不是尾递归。在你的第二个例子中,你有

  depthTR d(Branch _ lr)= let dl = depthTR(d + 1)l 
dr = depthTR(d + 1)r
in max dl dr

即相当于(一旦你已经对所有的 let 语句重新进行了lambda化),并重新排列了一下



<$ p $ (深度TR(d + 1)r)(深度TR(d + 1)l)

,所以你看到在返回之前调用的最后一个函数是 max ,这意味着这也不是尾递归。

2。你怎么能使它尾递归?



你可以使用继续传递样式制作尾递归函数。您不必重写函数来获取状态或累加器,而是传递一个函数(称为 continuation ),该函数是用于处理计算值的指令 - 即不是立即返回给调用者,你将你计算的任何值传递给继续。将任何函数转换为尾递归函数都是一个简单的技巧 - 甚至需要多次调用自己的函数,如 depth 所做的那样。它看起来像这样

 深度t =去t id 
其中
去空k = k 0
go(Branch _ lr)k = go l $ \dl - >
go r $ \dr - >
k(1 + max dl dr)

现在你可以看到在 go 在它返回之前本身是 go ,所以这个函数是尾递归的。



3。那是吗?



(注意,本节从)。



不!这个诀窍只会把问题推回别的地方。我们现在没有使用大量堆栈空间的非尾递归函数,而是使用一个尾递归函数来吞食可能会占用大量空间的thunks(未应用函数)。幸运的是,我们不需要使用任意函数 - 事实上,只有三种类型:
$ b


  1. \\ \\ dl - > (使用自由变量 r 和<$ c $($ \\ dr - > k(1 + max dl dr)) c> k )

  2. \dr - > k(1 + max dl dr)(自由变量 k dl

  3. id (没有自由变量)

由于函数的数量是有限的,我们可以将它们表示为数据。

pre $ 数据Fun a = FunL(Tree a)(趣味a) - 字段是'r'和'k'
| FunR Int(趣味a) - 字段是'dl'和'k'
| FunId

我们必须编写一个函数 eval ,它告诉我们如何在特定参数下评估这些功能。现在,您可以将函数重新编写为

  depth t = go t FunId 
其中
变为空k = eval k 0
go(Branch_lr)k = go l(FunL rk)

eval(FunL rk)d = go r(FunR dk)
eval FunR dl k)d = eval k(1 + max dl d)
eval(FunId)d = d

请注意, go eval 都会调用 go eval 在尾部位置 - 因此它们是一对相互尾递归函数。因此,我们将使用continuation-passing风格的函数版本转换为一个函数,该函数使用数据表示延续,并使用一对相互递归的函数来解释该数据。



4。这听起来很真实。



嗯,我想是的。可是等等!我们可以简化它!如果您看看 Fun a 数据类型,您会发现它实际上只是一个列表,其中每个元素都是 Tree a 我们要计算深度,或者它是一个 Int ,代表我们到目前为止计算的深度。



注意到这个有什么好处?那么这个列表实际上代表了上一节中延续链的调用栈。将一个新项目推入列表中会将新参数推送到调用堆栈中!所以你可以写下:

  depth t = go t [] 
其中
变空k = eval k 0
go(Branch_lr)k = go l(Left r:k)

eval(Left r:k)d = go r(Right d:k)
eval(Right dl:k)d = eval k(1 + max dl d)
eval [] d = d

您推入调用堆栈的每个新参数的类型都是 Either(Tree a)Int ,并且随着函数递归,它们不断推新的参数放到堆栈上,这是要探索的新树(每当调用 go 时)或到目前为止发现的最大深度(每当 eval 被调用)。

这个调用策略代表了树的深度优先遍历,正如您可以看到的那样,左树总是被探索首先是 go ,而右边的树总是被压入调用堆栈中以供稍后探究。只有在分支到达并且可以丢弃时,参数才会从调用堆栈中弹出(在 eval 中) 。

5。好吧...其他的东西?



好的,一旦你注意到你可以将continuation-passing算法变成一个模仿调用堆栈并遍历树的版本首先,你可能会开始怀疑是否有一个更简单的算法,它首先遍历树的深度,跟踪到目前为止遇到的最大深度。



事实上, 。诀窍是保留一份尚未探索的分支列表,以及它们的深度,并跟踪迄今为止所见到的最大深度。它看起来像这样

 深度t =去0 [(0,t)] 
其中
去深度[] =深度
深度(t:ts)=案例t的
(d,空白) - > go(最大深度d)ts
(d,分支_l r) - > (最大深度d)((d + 1,l):( d + 1,r):ts)

我认为这很简单,因为我可以在确保它是尾递归的限制范围内使此函数成为可能。

6。所以这就是我应该使用的?



说实话,你原来的非尾递归版本可能没问题。新版本没有更高的空间效率(总是需要存储下一步将要处理的树列表),但是它们确实具有将堆下一步处理的树存储在堆中的优势,而不是堆栈 - 堆中有更多的空间。



你可能想看看Ingo的答案中的部分尾递归函数,这将有助于在你的树木极不平衡。

First of all I have two different implementation that I believe are correct, and have profiled them and thinking they are about of the same performance:

depth::Tree a -> Int
depth Empty        = 0
depth (Branch b l r) = 1 + max (depth l) (depth r)


depthTailRec::Tree a -> Int
depthTailRec = depthTR 0 where
           depthTR d Empty          = d 
           depthTR d (Branch b l r) = let dl = depthTR (d+1) l; dr = depthTR (d+1) r in max dl dr 

I was just wondering aren't people are talking about how tail recursion can be beneficial for performance? And a lot of questions are jumping into my head:

  1. How can you make the depth function faster?
  2. I read about something about how Haskell's laziness can reduce the need of tail recursion, is that true?
  3. Is it the truth that every recursion can be converted into tail recursion?
  4. Finally tail recursion can be faster and space efficient because it can be turned into loops and thus reduce the need to push and pop the stack, is my understanding right?

解决方案

1. Why isn't your function tail recursive?

For a recursive function to be tail recursive, all the recursive calls must be in tail position. A function is in tail position if it is the last thing to be called before the function returns. In your first example you have

depth (Branch _ l r) = 1 + max (depth l) (depth r)

which is equivalent to

depth (Branch _ l r) = (+) 1 (max (depth l) (depth r))

so you can see that the last function called before the function returns is (+), so this is not tail recursive. In your second example you have

depthTR d (Branch _ l r) = let dl = depthTR (d+1) l
                               dr = depthTR (d+1) r
                            in max dl dr

which is equivalent to (once you've re-lambdified all the let statements) and re-arranged a bit

depthTR d (Branch _ l r) = max (depthTR (d+1) r) (depthTR (d+1) l)

so you see that the last function called before returning is max, which means that this is not tail recursive either.

2. How could you make it tail recursive?

You can make a tail recursive function using continuation-passing style. Instead of re-writing your function to take a state or an accumulator, you pass in a function (called the continuation) that is an instruction for what to do with the value computed -- i.e. instead of immediately returning to the caller, you pass whatever value you have computed to the continuation. It's an easy trick for turning any function into a tail-recursive function -- even functions that need to call themselves multiple times, as depth does. It looks something like this

depth t = go t id
 where
   go  Empty         k = k 0
   go (Branch _ l r) k = go l $ \dl ->
                           go r $ \dr ->
                             k (1 + max dl dr)

Now you see that the last function called in go before it returns is itself go, so this function is tail recursive.

3. Is that it, then?

(NB this section draws from the answers to this previous question.)

No! This "trick" only pushes the problem back somewhere else. Instead of a non-tail recursive function that uses lots of stack space, we now have a tail-recursive function that eats thunks (unapplied functions) which could potentially be taking up a lot of space themselves. Fortunately, we don't need to work with arbitrary functions - in fact, there are only three kinds

  1. \dl -> go r (\dr -> k (1 + max dl dr)) (which uses the free variables r and k)
  2. \dr -> k (1 + max dl dr) (with free variables k and dl)
  3. id (with no free variables)

Since there are only a finite number of functions, we can represent them as data

data Fun a = FunL (Tree a) (Fun a)  -- the fields are 'r' and 'k'
           | FunR  Int     (Fun a)  -- the fields are 'dl' and 'k'
           | FunId

We'll have to write a function eval as well, which tells us how to evaluate these "functions" at particular arguments. Now you can re-write the function as

depth t = go t FunId
 where
  go  Empty         k = eval k 0
  go (Branch _ l r) k = go l (FunL r k)

  eval (FunL r  k) d = go r (FunR d k)
  eval (FunR dl k) d = eval k (1 + max dl d)
  eval (FunId)     d = d

Note that both go and eval have calls to either go or eval in tail position -- therefore they are a pair of mutually tail recursive functions. So we've transformed the version of the function that used continuation-passing style into a function that uses data to represent continuations, and uses a pair of mutually recursive functions to interpret that data.

4. That sounds really compicated

Well, I guess it is. But wait! We can simplify it! If you look at the Fun a data type, you'll see that it's actually just a list, where each element is either a Tree a that we're going to compute the depth of, or it's an Int representing a depth that we've computed so far.

What's the benefit of noticing this? Well, this list actually represents the call stack of the chain of continuations from the previous section. Pushing a new item onto the list is pushing a new argument onto the call stack! So you could write

depth t = go t []
 where
  go  Empty         k = eval k 0
  go (Branch _ l r) k = go l (Left r : k)

  eval (Left r   : k) d = go   r (Right d : k)
  eval (Right dl : k) d = eval k (1 + max dl d)
  eval  []            d = d

Each new argument you push onto the call stack is of type Either (Tree a) Int, and as the functions recurse, they keep pushing new arguments onto the stack, which are either new trees to be explored (whenever go is called) or the maximum depth found so far (whenever eval is called).

This call strategy represents a depth-first traversal of the tree, as you can see by the fact that the left tree is always explored first by go, while the right tree is always pushed onto the call stack to be explored later. Arguments are only ever popped off the call stack (in eval) when an Empty branch has been reached and can be discarded.

5. Alright... anything else?

Well, once you've noticed that you can turn the continuation-passing algorithm into a version that mimics the call stack and traverses the tree depth first, you might start to wonder whether there's a simpler algorithm that traverses the tree depth first, keeping track of the maximum depth encountered so far.

And indeed, there is. The trick is to keep a list of branches that you haven't yet explored, together with their depths, and keep track of the maximum depth you've seen so far. It looks like this

depth t = go 0 [(0,t)]
 where
  go depth  []    = depth
  go depth (t:ts) = case t of
    (d, Empty)        -> go (max depth d)  ts
    (d, Branch _ l r) -> go (max depth d) ((d+1,l):(d+1,r):ts)

I think that's about as simple as I can make this function within the constraints of ensuring that it's tail-recursive.

6. So that's what I should use?

To be honest, your original, non tail-recursive version is probably fine. The new versions aren't any more space efficient (the always have to store the list of trees that you're going to process next) but they do have the advantage of storing the trees to be processed next on the heap, rather than on the stack - and there's lots more space on the heap.

You might want to look at the partially tail-recursive function in Ingo's answer, which will help in the case when your trees are extremely unbalanced.

这篇关于Haskell:二叉树深度的递归版本的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆