Dynamic programming and memoization: bottom-up vs top-down approaches


Question


I'm not sure that I understand the approach top down with memoization and bottom-up method correctly.

Bottom up: Is where you first look at the "smaller" subproblems and then solve the larger subproblems using the solution to the smaller problem.

Top down: Solve the problem in a natural manner and check if you have calculated the solution to the subproblem before.

I'm a little confused. Can someone explain it? And what is the difference?

Solution

rev4: A very eloquent comment by user Sammaron has noted that perhaps this answer previously confused top-down and bottom-up. While originally this answer (rev3) and other answers said that "bottom-up is memoization" ("assume the subproblems"), it may be the inverse (that is, "top-down" may be "assume the subproblems" and "bottom-up" may be "compose the subproblems"). Previously I have read of memoization being a different kind of dynamic programming, as opposed to a subtype of dynamic programming, so was quoting that despite not subscribing to that viewpoint. I have rewritten this answer to be agnostic of the terminology until proper references can be found in the literature, and converted this answer to a community wiki. Please prefer academic sources. List of references: {Web: 1} {Literature: -}

Recap

Dynamic programming is all about ordering your computations in a way that you avoid recalculating duplicate work. You have a main problem (the root of your tree of subproblems), and subproblems (subtrees). The subproblems typically repeat and overlap.

For example, consider your favorite example of Fibonacci. This is the full tree of subproblems, if we did a naive recursive call:

TOP of the tree
fib(4)
 fib(3)...................... + fib(2)
  fib(2)......... + fib(1)       fib(1)........... + fib(0)
   fib(1) + fib(0)   fib(1)       fib(1)              fib(0)
    fib(1)   fib(0)
BOTTOM of the tree

(In some other rare problems, this tree could be infinite in some branches, representing non-termination, and thus the bottom of the tree may be infinitely large. Furthermore in some problems, you might not know what the full tree looks like ahead of time, and thus you might need a strategy/algorithm to decide which subproblems to reveal.)
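The duplicated subtrees above can be seen directly in code. As a minimal Python sketch of the naive recursion (the function name `fib` is just illustrative):

```python
def fib(n):
    # Naive recursion: the fib(2) subtree in the diagram is recomputed
    # in both branches, giving exponentially many calls overall.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(4))  # 3  (the value at the root of the tree above)
```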


Memoization, Tabulation

There are at least two main techniques of dynamic programming, which are not mutually exclusive:

  • Memoization - This is a laissez-faire approach: You assume you have already computed all subproblems. You have no idea the optimal evaluation order. You typically perform a recursive call (or some iterative equivalent) from the root, and either hope you will get close to the optimal evaluation order, or you have a proof that you will get the optimal evaluation order. You ensure that the recursive call never recomputes a subproblem because you cache the results, and thus duplicate sub-trees are not recomputed.

    • example: If you are calculating the Fibonacci sequence fib(100), you would just call this, and it would call fib(100)=fib(99)+fib(98), which would call fib(99)=fib(98)+fib(97), ...etc..., which would call fib(2)=fib(1)+fib(0)=1+0=1. Then it would finally resolve fib(3)=fib(2)+fib(1), but it doesn't need to recalculate fib(2), because we cached it.
    • This starts at the top of the tree, and evaluates the subproblems from the leaves/subtrees back up towards the root.
  • Tabulation - You can also think of dynamic programming as a "table-filling" algorithm (though usually multidimensional, this 'table' may have non-Euclidean geometry in very rare cases). This is like memoization but more active, and involves one additional step: You must pick, ahead of time, exactly in which order you will do your computations (this is not to imply the order must be static, but that you have much more flexibility than memoization).

    • example: If doing fibonacci, you choose to calculate the numbers in this order: fib(2),fib(3),fib(4)... caching every value so you can compute the next ones more easily. You can also think of it as filling up a table (another form of caching).
    • I personally do not hear the word 'tabulation' a lot, but it's a very decent term. Some people consider this "dynamic programming".
    • Before running the algorithm, the programmer considers the whole tree, then writes an algorithm to evaluate the subproblems in a particular order towards the root, generally filling in a table.
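The two techniques can be sketched side by side on the Fibonacci example (plain Python, using a dict and a list as the caches; the names are illustrative):

```python
def fib_memo(n, cache=None):
    # Memoization: recurse from the root, cache each subproblem,
    # so duplicate subtrees are looked up instead of recomputed.
    if cache is None:
        cache = {}
    if n < 2:
        return n
    if n not in cache:
        cache[n] = fib_memo(n - 1, cache) + fib_memo(n - 2, cache)
    return cache[n]

def fib_tab(n):
    # Tabulation: pick the evaluation order up front (2, 3, ..., n)
    # and fill the table from the smallest subproblems upward.
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

print(fib_memo(100))  # 354224848179261915075
print(fib_tab(100))   # same value, but the order was chosen in advance
```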

(At its most general, in "dynamic programming" I would say the programmer considers the whole tree, then writes an algorithm that implements a strategy for evaluating subproblems (which optimizes whatever properties you want, usually a combination of time-complexity and space-complexity). The strategy must start somewhere, some particular subproblem, and perhaps adapts itself based on the results of those evaluations. In "dynamic programming" in the most general sense, you generally try to cache these subproblems (and more generally, also avoid revisiting subproblems... a subtle distinction perhaps in the case of graphs) in various data structures. Very often these data structures are at their core like arrays or tables. Solutions to subproblems can be thrown away if we don't need them anymore.)

[Previously this answer made a statement about the top-down vs bottom-up terminology; there are clearly two main approaches called Memoization and Tabulation that may be in bijection with those terms though not entirely. The general term most people use is still "Dynamic Programming" and some people say "Memoization" to refer to that particular subtype of dynamic programming. This answer declines to say which is top-down and bottom-up until the community can find proper references in academic papers. Ultimately it is important to understand the distinction rather than the terminology.]


Pros and cons

Ease of coding

Memoization is very easy to code (you can generally write a "memoizer" annotation or wrapper function that automatically does it for you), and should be your first line of approach. The downside of tabulation is that you have to come up with an ordering.
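Such a "memoizer" wrapper can be sketched as a Python decorator (the standard library's `functools.lru_cache` does essentially this for you):

```python
import functools

def memoize(func):
    # Generic memoizer: wraps any function of hashable arguments
    # and caches its results keyed by those arguments.
    cache = {}
    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

@memoize
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(200))  # fast: each subproblem is computed exactly once
```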

Recursiveness

Note that both top-down and bottom-up can be implemented with recursion or iterative table-filling, though it may not be natural.

Practical concerns

With memoization, if the tree is very deep (e.g. fib(10^6)), you will run out of stack space, because each delayed computation must be put on the stack, and you will have 10^6 of them.

Optimality

Either approach may not be time-optimal if the order in which you happen (or try) to visit subproblems is not optimal, specifically if there is more than one way to calculate a subproblem (normally caching would resolve this, but it's theoretically possible that caching might not in some exotic cases). Memoization will usually add your time-complexity onto your space-complexity (e.g. with tabulation you have more liberty to throw away calculations; using tabulation with Fib lets you use O(1) space, but memoization with Fib uses O(N) stack space).
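The O(1)-space claim for tabulated Fibonacci follows because the chosen evaluation order (2, 3, ..., n) only ever needs the last two table entries, so the rest of the table can be thrown away as you go:

```python
def fib_constant_space(n):
    # Tabulation in the order 2, 3, ..., n, keeping only the last
    # two values instead of the whole table: O(1) space, no recursion.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

Note this also sidesteps the stack-depth problem mentioned above, since there is no recursion at all.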

Advanced optimizations

If you are doing an extremely complicated problem, you might have no choice but to do tabulation (or at least take a more active role in steering the memoization where you want it to go). Also if you are in a situation where optimization is absolutely critical and you must optimize, tabulation will allow you to do optimizations which memoization would not otherwise let you do in a sane way. In my humble opinion, in normal software engineering, neither of these two cases ever comes up, so I would just use memoization ("a function which caches its answers") unless something (such as stack space) makes tabulation necessary.


More complicated examples

Here we list examples of particular interest, that are not just general DP problems, but interestingly distinguish memoization and tabulation. For example, one formulation might be much easier than the other, or there may be an optimization which basically requires tabulation:

  • the algorithm to calculate edit-distance[2], interesting as a non-trivial example of a two-dimensional table-filling algorithm
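A sketch of that two-dimensional table-filling (the classic Wagner-Fischer formulation; row 0 and column 0 hold the base cases of deleting or inserting every character):

```python
def edit_distance(a, b):
    # dist[i][j] = minimum number of edits to turn a[:i] into b[:j].
    dist = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dist[i][0] = i          # delete all of a[:i]
    for j in range(len(b) + 1):
        dist[0][j] = j          # insert all of b[:j]
    # Fill the table row by row: each cell depends only on the three
    # already-computed neighbors above and to the left.
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # substitution
    return dist[len(a)][len(b)]

print(edit_distance("kitten", "sitting"))  # 3
```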
