GHC Haskell 什么时候自动记忆? [英] When is memoization automatic in GHC Haskell?

查看:31
本文介绍了GHC Haskell 什么时候自动记忆?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我不明白为什么 m1 显然被记忆而 m2 不在以下内容中:

m1 = ((过滤奇数 [1..]) !!)m2 n = ((过滤奇数 [1..]) !! n)

m1 10000000 在第一次调用时大约需要 1.5 秒,在后续调用中只需要一小部分(大概它缓存了列表),而 m2 10000000 总是花费相同的时间(每次调用重建列表).知道发生了什么吗?关于 GHC 是否以及何时记忆功能是否有任何经验法则?谢谢.

解决方案

GHC 不记忆函数.

然而,它确实在每次输入其周围的 lambda 表达式时最多计算一次代码中的任何给定表达式,或者如果它处于顶层,则最多计算一次.当您在示例中使用语法糖时,确定 lambda 表达式的位置可能有点棘手,因此让我们将它们转换为等效的脱糖语法:

m1' = (!!)(过滤奇数 [1..])——注意:见下文!m2' = 
 ->(!!) (过滤奇数 [1..]) n

(注意:Haskell 98 报告实际上将像 (a %) 这样的左运算符部分描述为等效于  -> (%) ab,但 GHC将其脱糖为 (%) a.这些在技术上是不同的,因为它们可以通过 seq 区分.我想我可能已经提交了关于此的 GHC Trac 票.)>

鉴于此,您可以看到在 m1' 中,表达式 filterodd [1..] 不包含在任何 lambda 表达式中,因此它只会每次运行程序时计算一次,而在 m2' 中,每次输入 lambda 表达式时都会计算 filterodd [1..],即m2' 的每次调用.这解释了您所看到的时间差异.

<小时>

实际上,具有某些优化选项的某些 GHC 版本将共享比上述说明所指示的更多的值.这在某些情况下可能会出现问题.例如,考虑函数

f = x ->让 y = [1..30000000] in foldl' (+) 0 (y ++ [x])

GHC 可能会注意到 y 不依赖于 x 并将函数重写为

f = let y = [1..30000000] in x ->foldl' (+) 0 (y ++ [x])

在这种情况下,新版本的效率要低得多,因为它必须从存储 y 的内存中读取大约 1 GB,而原始版本将在恒定空间中运行并适合处理器的缓存.事实上,在 GHC 6.12.1 下,f 函数在编译时没有优化几乎是使用 -O2 编译时的两倍.

I can't figure out why m1 is apparently memoized while m2 is not in the following:

m1      = ((filter odd [1..]) !!)

m2 n    = ((filter odd [1..]) !! n)

m1 10000000 takes about 1.5 seconds on the first call, and a fraction of that on subsequent calls (presumably it caches the list), whereas m2 10000000 always takes the same amount of time (rebuilding the list with each call). Any idea what's going on? Are there any rules of thumb as to if and when GHC will memoize a function? Thanks.

解决方案

GHC does not memoize functions.

It does, however, compute any given expression in the code at most once per time that its surrounding lambda-expression is entered, or at most once ever if it is at top level. Determining where the lambda-expressions are can be a little tricky when you use syntactic sugar like in your example, so let's convert these to equivalent desugared syntax:

m1' = (!!) (filter odd [1..])              -- NB: See below!
m2' = 
 -> (!!) (filter odd [1..]) n

(Note: The Haskell 98 report actually describes a left operator section like (a %) as equivalent to  -> (%) a b, but GHC desugars it to (%) a. These are technically different because they can be distinguished by seq. I think I might have submitted a GHC Trac ticket about this.)

Given this, you can see that in m1', the expression filter odd [1..] is not contained in any lambda-expression, so it will only be computed once per run of your program, while in m2', filter odd [1..] will be computed each time the lambda-expression is entered, i.e., on each call of m2'. That explains the difference in timing you are seeing.


Actually, some versions of GHC, with certain optimization options, will share more values than the above description indicates. This can be problematic in some situations. For example, consider the function

f = x -> let y = [1..30000000] in foldl' (+) 0 (y ++ [x])

GHC might notice that y does not depend on x and rewrite the function to

f = let y = [1..30000000] in x -> foldl' (+) 0 (y ++ [x])

In this case, the new version is much less efficient because it will have to read about 1 GB from memory where y is stored, while the original version would run in constant space and fit in the processor's cache. In fact, under GHC 6.12.1, the function f is almost twice as fast when compiled without optimizations than it is compiled with -O2.

这篇关于GHC Haskell 什么时候自动记忆?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆