了解Haskell中的评估顺序 [英] Understanding order of evaluation in Haskell
问题描述
我试图了解Haskell中子句的计算方式.假设我们得到了这个玩具示例,其中bar,baz和bat在某处定义了函数:
func x = foo i j k
where
foo i j k = i + j + k
k = bat x
j = baz k
i = bar j
func x = foo i j k
行如何展开?它会评估为func x = foo(i(j(k)), j(k), k)
还是func x = foo(i, j, k)
之类的东西吗?
简介
我假设您打算编写以下代码:
func :: Int -> Int
func x = foo
where
foo = i + j + k
k = bat x
j = baz k
i = bar j
这样,它将键入check,并且最终在where
子句中定义的所有三个函数都将被调用.如果这不是您的意思,请继续阅读,因为我不仅会给您描述代码评估的方式,而且还会为您提供确定答案的方法.这可能是一个漫长的故事,但我希望它值得您花时间.
核心
对代码的评估完全取决于您对编译器的选择,但是我想您将使用GHC,如果这样,它将对您的代码进行多次转换,然后将其简化为机器代码.
首先,"where
子句" 将替换为"let
子句" .这样做是为了将Haskell语法简化为更简单的 Core 语法.核心与称为 lambda演算的数学理论足够相似,其最终评估将根据此坚实基础进行.此时,您的代码将如下所示:
func = λx ->
let { k = bat x } in
let { j = baz k } in
+
(+ (bar j) j)
k
如您所见,Haskell代码的where
子句中的一个函数定义在到达Core阶段时已完全消失(实际上是内联的),而其他函数则被重写为let
符号.二进制运算符(+)被重写为抛光符号以使其明确(现在很清楚,应该首先计算i + j
).所有这些转换都是在不更改代码含义的情况下进行的.
绘图机
然后,生成的 lambda表达式将被简化为有向图,并由 Spineless Tagless Graph机器执行.从某种意义上说,STG机的核心是图灵机的汇编程序,尽管前者是lambda表达式,而后者是命令式指令序列. (正如您现在所看到的,功能性语言和命令性语言之间的区别非常深.)STG机器可以通过以下方式将您提供的表达式转换为常规计算机上可执行的命令性指令.严格定义的操作语义-也就是说,对于Core的每个语法功能(其中只有约4个),都有一条命令式的汇编程序指令可以执行相同的操作,而Core程序将被翻译成这些作品的组合.
Core的操作语义的关键特征是 laziness .如您所知,Haskell是一种惰性语言.这意味着要计算的函数和该函数的值看起来相同:RAM中的字节序列.在程序启动时,所有内容都作为函数布置(准确地说是 closures ),但是一旦计算了函数的返回值,它将被放置在closure的位置,以便所有后续访问到内存中此位置将立即接收该值.换句话说,只有在需要时才计算一个值,并且仅一次.
正如我所说,Core中的一个表达式将变成相互依赖的有向计算图.例如:
如果您仔细观察,希望此图能使您想起我们开始使用的程序.请注意两个细节:
-
所有箭头最终都指向
x
,这与我们的想法一致,即提供x
足以评估func
. -
有时两个箭头指向同一框.这意味着此框的值将被评估一次,第二次我们将免费获得该值.
因此,STG机器将使用一些Core代码,并创建一个可执行文件,该可执行文件计算与图中的图形大致相似的图形值.
执行
现在,当我们进入图表时,很容易看到计算将会这样进行:
- 在调用
func
时,将接收x
的值并将其放在相应的框中.
计算 -
bat x
并将其放在框中. -
k
设置为与bat x
相同. (GHC在代码上运行的某些优化可能会放弃此步骤,但实际上let
子句要求将其值单独存储.)
计算 -
baz k
并将其放在框中. -
j
设置为与baz k
相同,与步骤6中的k
相同.bar j
被计算并放入框中. - 与根据步骤3和5预期的相反,
i
未设置为任何内容.正如我们在程序的Core清单中看到的那样,它已经过优化.
计算 -
+ (bar j) j
.j
已经被计算出来了,所以由于懒惰,这次将不会被调用. - 计算出最高值.同样,这次无需计算
bat x
,因为它是先前计算的并存储在右侧框中. - 现在,
func x
的值本身就是一个盒子,可以被调用方多次使用.
我要强调的是,这是您执行程序时发生的事情,而不是对其进行编译.
结语
就我所知,这就是故事.为了进一步说明,我请读者参考Simon Peyton Jones的作品:有关Graph机设计的文章,同时以最小的特性描述了GHC的所有内部工作. /p>
要查看GHC生成的Core,只需在编译时传递标志-ddump-simpl
.起初会伤害您的眼睛,但已经习惯了.
享受!
后记
正如@DanielWagner在评论中指出的那样,Haskell的懒惰还带来了一些其他后果,如果我们要分析一个人为的案件,我们应该考虑这些后果.具体来说:计算可能不需要来评估它所指向的某些盒子,甚至根本没有这些盒子中的任何一个.在这种情况下,这些框将保持不变和未评估,而计算完成并传递其结果实际上与下级框无关.此类功能的示例:f x = 3
.这将产生深远的影响:例如,如果无法像x
,则不使用 x
的函数将不会进入该循环完全没有.因此,有时希望详细了解哪些子计算将必须从给定的计算中启动,而哪些子计算可能不会.这样的复杂性比我准备在此答案中描述的要复杂得多,因此在此警告性提示下,我将结束.
I am trying to understand how where clauses evaluate in Haskell. Say we got this toy example where bar, baz and bat are defined functions somewhere:
func x = foo i j k
where
foo i j k = i + j + k
k = bat x
j = baz k
i = bar j
How does the line func x = foo i j k
expand? Does it evaluate to something like func x = foo(i(j(k)), j(k), k)
or func x = foo(i, j, k)
?
Intro
I will assume you meant to write this code:
func :: Int -> Int
func x = foo
where
foo = i + j + k
k = bat x
j = baz k
i = bar j
This way it will type check and all three functions you defined in the where
clause will eventually get called. If this is not what you meant, still do read on, as I will not only give you a depiction of the way your code evaluates, but also a method of determining the answer yourself. It may be a bit of a long story, but I hope it will worth your time.
Core
Evaluation of code absolutely depends on your choice of the compiler, but I suppose you will be using GHC, and if so, it will transform your code several times before reducing it to machine code.
First, "where
clauses" will be replaced with "let
clauses". This is done so as to reduce Haskell syntax to a simpler Core syntax. Core is similar enough to a math theory called lambda calculus for its eventual evaluation to proceed according to this solid foundation. At this point your code will look somewhat like this:
func = λx ->
let { k = bat x } in
let { j = baz k } in
+
(+ (bar j) j)
k
As you see, one of the function definitions from the where
clause of your Haskell code disappeard altogether by the time of its arrival to Core stage (actually, it was inlined), and the others were rewritten to let
notation. The binary operation (+) got rewritten to polish notation to make it unambiguous (it is now clear that i + j
should be computed first). All these conversions were performed without altering the meaning of the code.
Graph machine
Then, the resulting lambda expression will be reduced to a directed graph and executed by a Spineless Tagless Graph machine. In a sense, Core to STG machine is what assembler is to Turing machine, though the former is a lambda expression, while the latter a sequence of imperative instructions. (As you may see by now, the distinction between functional and imperative languages runs rather deep.) An STG machine will translate the expressions you give it to imperative instructions that are executable on a conventional computer, through a rigorously defined operational semantics -- that is, to every syntactic feature of Core (of which it only has about 4) there is a piece of imperative assembler instructions that performs the same thing, and a Core program will be translated to a combination of these pieces.
The key feature of the operational semantics of Core is its laziness. As you know, Haskell is a lazy language. What that means is that a function to be computed and the value of this function look the same: as a sequence of bytes in RAM. As the program starts, everything is laid out as functions (closures, to be precise), but once a function's return value is computed, it will be put in the place of the closure so that all further accesses to this location in memory would immediately receive the value. In other words, a value is computed only when it is needed, and only once.
As I said, an expression in Core will be turned to a directed graph of computations that depend on each other. For example:
If you look closely, I hope this graph will remind you of the program we started with. Please note two particulars about it:
All the arrows eventually lead to
x
, which is consistent with our idea that supplyingx
is enough to evaluatefunc
.Sometimes two arrows lead to the same box. That means the value of this box will be evaluated once, and the second time we shall have the value for free.
So, the STG machine will take some Core code and create an executable that computes the value of a graph more or less similar to the one in the picture.
Execution
Now, as we made it to the graph, it's easy to see that the computation will proceed thus:
- As
func
is called, the value ofx
is received and put in the corresponding box. bat x
is computed and put in a box.k
is set to be the same asbat x
. (This step will probably be dropped by some of the optimizations GHC runs on the code, but literally alet
clause requests that its value be stored separately.)baz k
is computed and put in a box.j
is set to be the same asbaz k
, the same as withk
in step 6.bar j
is computed and put in a box.- Contrary to what one would expect in the light of steps 3 and 5,
i
is not set to anything. As we saw in the listing of Core for our program, it was optimized away. + (bar j) j
is computed.j
is already computed, sobaz k
will not be called this time, thanks to laziness.- The topmost value is computed. Again, there is no need to compute
bat x
this time, as it was computed previously and stored in the right box. - Now, the value of
func x
is itself a box, ready to be used by the caller any number of times.
I would highlight that this is what's going to be happening at the time you execute the program, as opposed to compiling it.
Epilogue
That's the story, to the best of my knowledge. For further clarifications, I refer the reader to the works of Simon Peyton Jones: the book on the design of Haskell and the article on the design of Graph machine, together describing all the inner workings of GHC to the smallest peculiarity.
To review the Core generated by GHC, simply pass the flag -ddump-simpl
as you compile something. It will hurt your eyes at first, but one gets used to it.
Enjoy!
postscriptum
As @DanielWagner pointed out in the comments, the laziness of Haskell has some further consequences that we should have needed to consider were we to dissect a less contrived case. Specifically: a computation may not need to evaluate some of the boxes it points to, or even any of those boxes at all. In such case, these boxes will stay untouched and unevaluated, while the computation completes and delivers its result that is in actuality independent of the subordinate boxes anyway. An example of such a function: f x = 3
. This has far-reaching consequences: say, if x
were impossible to compute, as in "infinite loop", a function that does not use x
in the first place would not enter that loop at all. Thus, it is sometimes desirable to know in detail which sub-computations will necessarily be launched from a given computation and which may not. Such intricacies reach a bit farther than I'm prepared to describe in this answer, so at this cautionary note I will end.
这篇关于了解Haskell中的评估顺序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!