Common Subexpression Elimination在什么情况下会影响Haskell程序的懒惰? [英] Under what circumstances could Common Subexpression Elimination affect the laziness of a Haskell program?

查看:115
本文介绍了Common Subexpression Elimination在什么情况下会影响Haskell程序的懒惰?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

来自 wiki.haskell.org


首先,常见的子表达式消除(CSE)意味着如果一个表达式出现在几个地方,代码将被重新排列,以便该表达式的值只计算一次。例如:

  foo x =(bar x)*(bar x)

可能被转换为

  foo x = let x'= bar x in x'* x'

因此,bar函数只被调用一次。 (如果bar是一个特别昂贵的函数,这可能会节省很多工作。)
GHC实际上并不像您预期​​的那样经常执行CSE。问题是,执行CSE会影响程序的严谨性/懒惰性。所以GHC确实做CSE,但只有在特定的情况下才能参见GHC手册。 (部分??)



长话短说:如果您关心CSE,请亲手做。


我想知道在什么情况下CSE会影响程序的严格性/懒惰性以及可能会产生什么样的影响。

解决方案

天真的CSE规则是

  e'[e,e]〜 >令x = e in e'[x,x]。 

即每当发生子表达式 e 时在表达式 e'中两次,我们使用let-binding来计算一次 e 。然而这导致了一些微不足道的空间泄漏。例如

  sum [1..n] + prod [1..n] 
O(1)
> >因为 sum prod 会执行尾部递进和等等等等,但会变成 O (n)当天真的CSE规则颁布时。对于 n 高的程序来说,这可能会很糟糕!

接下来的方法是让这个规则更具体一些,将它限制在一小组我们知道不会的问题中。我们可以首先更详细地列举一下天真规则的问题,这将为我们开发更好的CSE构成一系列优先事项: b e e'中可能相隔很远,从而导致 let x = e 绑定的长生命周期。
  • 分配一个以前可能没有的闭包。
  • 这可以创建一个无限制的闭包数量。 b
  • 有些情况下,闭包可能永远不会释放。



  •   let x = e in e'[e]〜>让x = e in e'[x] 

    这是比较保守的规则,但更安全。在这里,我们认识到 e 出现两次,但第一次出现在语法上支配第二个表达式,这意味着程序员已经引入了一个let绑定。我们可以安全地重复使用let-binding,并用 x 替换第二次出现 e

    句法控制的另一个例子:

      {x  - > e'[e]}〜>情况e的{x  - > e'[x]} 

    另一个:

     案例e of {
    构造器x0 x1 ... xn - >
    e'[e]
    }

    〜>

    案例e of {
    构造器x0 x1 ... xn - >
    e'[构造函数x0 x1 ... xn]
    }

    这些规则都利用了程序中现有的结构,以确保转换前后的空间使用动力学保持不变。他们比原来的CSE更保守,但他们也更安全。

    另见



    在一篇懒惰的FPL中充分讨论CSE,请阅读 Chitil(很容易理解)的1997年论文。要全面了解CSE在生产编译器中的工作方式,请参阅 GHC的CSE.hs模块,这得益于GHC编写长脚注的传统,非常全面。该模块中的注释代码比率不在图表中。还要注意这个文件的年龄(1993)!

    From wiki.haskell.org:

    First of all, common subexpression elimination (CSE) means that if an expression appears in several places, the code is rearranged so that the value of that expression is computed only once. For example:

    foo x = (bar x) * (bar x)
    

    might be transformed into

    foo x = let x' = bar x in x' * x'
    

    thus, the bar function is only called once. (And if bar is a particularly expensive function, this might save quite a lot of work.) GHC doesn't actually perform CSE as often as you might expect. The trouble is, performing CSE can affect the strictness/laziness of the program. So GHC does do CSE, but only in specific circumstances --- see the GHC manual. (Section??)

    Long story short: "If you care about CSE, do it by hand."

    I'm wondering under what circumstances CSE "affects" the strictness/laziness of the program and what kind of effect that could be.

    解决方案

    The naive CSE rule would be

    e'[e, e]  ~>  let x = e in e'[x, x].
    

    That is, whenever a subexpression e occurs twice in the expression e', we use a let-binding to compute e once. This however leads itself to some trivial space leaks. For example

    sum [1..n] + prod [1..n]
    

    is typically O(1) space usage in a lazy functional programming language like Haskell (as sum and prod would tail-recurse and blah blah blah), but would become O(n) when the naive CSE rule is enacted. This can be terrible for programs when n is high!

    The approach is then to make this rule more specific, restricting it to a small set of cases that we know won't have the problem. We can begin by more specifically enumerating the problems with the naive rule, which will form a set of priorities for us to develop a better CSE:

    • The two occurrences of e might be far apart in e', leading to a long lifetime for the let x = e binding.

    • The let-binding must always allocate a closure where previously there might not have been one.

    • This can create an unbound number of closures.

    • There are cases where the closure might never deallocate.

    Something better

    let x = e in e'[e]  ~>  let x = e in e'[x]
    

    This is a more conservative rule but is much safer. Here we recognize that e appears twice but the first occurrence syntactically dominates the second expression, meaning here that the programmer has already introduced a let-binding. We can safely just reuse that let-binding and replace the second occurrence of e with x. No new closures are allocated.

    Another example of syntactic domination:

    case e of { x -> e'[e] }  ~> case e of { x -> e'[x] }
    

    And yet another:

    case e of {
       Constructor x0 x1 ... xn ->
         e'[e]
    }
    
    ~>
    
    case e of {
       Constructor x0 x1 ... xn ->
         e'[Constructor x0 x1 ... xn]
    }
    

    These rules all take advantage of existing structure in the program to ensure that the kinetics of space usage remain the same before and after the transformation. They are much more conservative than the original CSE but they are also much safer.

    See also

    For a full discussion of CSE in a lazy FPL, read Chitil's (very accessible) 1997 paper. For a full treatment of how CSE works in a production compiler, see GHC's CSE.hs module, which is documented very thoroughly thanks to GHC's tradition of writing long footnotes. The comment-to-code ratio in that module is off the charts. Also note how old that file is (1993)!

    这篇关于Common Subexpression Elimination在什么情况下会影响Haskell程序的懒惰?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆