How does Data.MemoCombinators work?


Question

I've been looking at the source for Data.MemoCombinators but I can't really see where the heart of it is.

Please explain to me what the logic is behind all of these combinators and the mechanics of how they actually work to speed up your program in real world programming.

I'm looking for specifics for this implementation, and optionally comparison/contrast with other Haskell approaches to memoization. I understand what memoization is and am not looking for a description of how it works in general.

Answer

This library is a straightforward combinatorization of the well-known technique of memoization. Let's start with the canonical example:

fib = (map fib' [0..] !!)
    where
    fib' 0 = 0
    fib' 1 = 1
    fib' n = fib (n-1) + fib (n-2)

I interpret what you said to mean that you know how and why this works. So I'll focus on the combinatorization.

We are essentially trying to capture and generalize the idea of (map f [0..] !!). The type of this function is (Int -> r) -> (Int -> r), which makes sense: it takes a function from Int -> r and returns a memoized version of the same function. Any function which is semantically the identity and has this type is called a "memoizer for Int" (even id, which doesn't memoize). We generalize to this abstraction:

type Memo a = forall r. (a -> r) -> (a -> r)

So a Memo a, a memoizer for a, takes a function from a to anything, and returns a semantically identical function that has been memoized (or not).
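
The simplest possible instance makes the shape of this type concrete. As a sketch in the spirit of the library's unit combinator, a memoizer for () computes the single possible result once and shares it:

unit :: Memo ()
unit f = let v = f () in \() -> v

The let lifts the evaluation of f () out of the lambda, so no matter how many times the returned function is applied, f runs at most once. Everything below is this same trick scaled up to larger domains.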

The idea of the different memoizers is to find a way to enumerate the domain with a data structure, map the function over them, and then index the data structure. bool is a good example:

bool :: Memo Bool
bool f = table (f True, f False)
    where
    table (t,f) True = t
    table (t,f) False = f

Functions from Bool are equivalent to pairs, except a pair will only evaluate each component once (as is the case for every value that occurs outside a lambda). So we just map to a pair and back. The essential point is that we are lifting the evaluation of the function above the lambda for the argument (here the last argument of table) by enumerating the domain.
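
One way to watch the sharing happen (a hypothetical demo, not part of the library) is to memoize a function that announces its own evaluation, using the bool defined above:

import Debug.Trace (trace)

expensive :: Bool -> Int
expensive b = trace ("evaluating for " ++ show b) (if b then 1 else 0)

memoised :: Bool -> Int
memoised = bool expensive

Evaluating memoised True repeatedly prints the trace message only once: the pair (expensive True, expensive False) is constructed a single time, and each component is a shared thunk that is forced on first use.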

Memoizing Maybe a is a similar story, except now we need to know how to memoize a for the Just case. So the memoizer for Maybe takes a memoizer for a as an argument:

maybe :: Memo a -> Memo (Maybe a)
maybe ma f = table (f Nothing, ma (f . Just))
    where
    table (n,j) Nothing = n
    table (n,j) (Just x) = j x

The rest of the library is just variations on this theme.
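
For example, a memoizer for pairs can be assembled from memoizers for the components by currying: memoize, in the first component, a function that returns a function memoized in the second component. The library's pair combinator is along these lines:

pair :: Memo a -> Memo b -> Memo (a, b)
pair ma mb f = uncurry (ma (\x -> mb (\y -> f (x, y))))

A lookup then costs one traversal of each component's memo structure.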

The way it memoizes integral types uses a more appropriate structure than [0..]. It's a bit involved, but basically just creates an infinite tree (representing the numbers in binary to elucidate the structure):

1
  10
    100
      1000
      1001
    101
      1010
      1011
  11
    110
      1100
      1101
    111
      1110
      1111

So looking up a number in the tree takes time proportional to the number of bits in its representation.
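
Here is a minimal sketch of that idea (not the library's actual code, which is more general and covers any Integral type): build a lazy infinite tree of all the results, and index into it by the bits of the (non-negative) argument.

-- An infinite binary tree: a value at the root, then two subtrees.
data Tree a = Tree (Tree a) a (Tree a)

instance Functor Tree where
    fmap g (Tree l x r) = Tree (fmap g l) (g x) (fmap g r)

-- Follow the bits of n down the tree; 0 is the root.
index :: Tree a -> Integer -> a
index (Tree _ x _) 0 = x
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
    (q, 0) -> index l q
    (q, _) -> index r q

-- The tree containing every natural number exactly once,
-- laid out so that index nats n == n.
nats :: Tree Integer
nats = go 0 1
    where
    go n s = Tree (go (n + s) (2 * s)) n (go (n + 2 * s) (2 * s))

memoNat :: (Integer -> r) -> (Integer -> r)
memoNat f = index (fmap f nats)

The tree fmap f nats is a single lazily built value, so each result is computed at most once; later calls just re-walk the already-forced spine, one step per bit.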

As sclv points out, Conal's MemoTrie library uses the same underlying technique, but uses a typeclass presentation instead of a combinator presentation. We released our libraries independently at the same time (indeed, within a couple hours!). Conal's is easier to use in simple cases (there is only one function, memo, and it will determine the memo structure to use based on the type), whereas mine is more flexible, as you can do things like this:

boundedMemo :: Integer -> Memo Integer
boundedMemo bound f = \z -> if z < bound then memof z else f z
    where
    memof = integral f

Which only memoizes values less than a given bound, needed for the implementation of one of the Project Euler problems.
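
To connect this back to the opening example: the canonical fib, written against the library's integral combinator (the same one boundedMemo uses), looks like this:

import qualified Data.MemoCombinators as Memo

fib :: Integer -> Integer
fib = Memo.integral fib'
    where
    fib' 0 = 0
    fib' 1 = 1
    fib' n = fib (n - 1) + fib (n - 2)

The recursive calls go through the memoized fib, so each value is computed once and afterwards read straight out of the bit-indexed tree.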

There are other approaches, for example exposing an open fixpoint function over a monad:

memo :: MonadState ... m => ((Integer -> m r) -> (Integer -> m r)) -> m (Integer -> m r)

Which allows yet more flexibility, eg. purging caches, LRU, etc. But it is a pain in the ass to use, and also it puts strictness constraints on the function to be memoized (e.g. no infinite left recursion). I don't believe there are any libraries that implement this technique.
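
A rough sketch of what such an interface could look like (hypothetical code, a simplified variant of the signature above, not taken from any library): the function is written against an explicit self parameter, and the memoizer interposes a Map-backed cache, held in State, on every recursive call.

import Control.Monad.State
import qualified Data.Map as Map

memoFix :: Ord a
        => ((a -> State (Map.Map a r) r) -> (a -> State (Map.Map a r) r))
        -> (a -> State (Map.Map a r) r)
memoFix f = go
    where
    go x = do
        cache <- get
        case Map.lookup x cache of
            Just r  -> return r
            Nothing -> do
                r <- f go x              -- recursive calls re-enter go
                modify (Map.insert x r)
                return r

fibM :: Integer -> State (Map.Map Integer Integer) Integer
fibM = memoFix step
    where
    step _    0 = return 0
    step _    1 = return 1
    step self n = (+) <$> self (n - 1) <*> self (n - 2)

Running evalState (fibM 100) Map.empty gives the answer along with full control over the cache, which is what makes purging or bounding it possible; the price is exactly the strictness constraint mentioned above, since a result is only written to the cache after it has been fully computed.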

Did that answer what you were curious about? If not, perhaps make explicit the points you are confused about?
