Mathematica 中的尾调用优化? [英] Tail call optimization in Mathematica?

查看:45
本文介绍了Mathematica 中的尾调用优化?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在制定另一个SO问题的答案时,我遇到了一些奇怪的行为关于 Mathematica 中的尾递归.

While formulating an answer to another SO question, I came across some strange behaviour regarding tail recursion in Mathematica.

Mathematica 文档 暗示 尾调用优化 可能会被执行.但我自己的实验给出了相互矛盾的结果.对比一下,例如下面两个表达式.第一个崩溃 7.0.1 内核,大概是由于堆栈耗尽:

The Mathematica documentation hints that tail call optimization might be performed. But my own experiments give conflicting results. Contrast, for example, the following two expressions. The first crashes the 7.0.1 kernel, presumably due to stack exhaustion:

(* warning: crashes the kernel! *)
Module[{f, n = 0},
  f[x_] := (n += 1; f[x + 1]);
  TimeConstrained[Block[{$RecursionLimit = Infinity}, f[0]], 300, n]
]

第二次运行完成,似乎利用尾调用优化返回有意义的结果:

The second runs to completion, appearing to exploit tail call optimization to return a meaningful result:

Module[{f, n = 0},
  f[x_] := Null /; (n += 1; False);
  f[x_] := f[x + 1];
  TimeConstrained[Block[{$IterationLimit = Infinity}, f[0]], 300, n]
]

两个表达式都定义了一个尾递归函数f.在第一个函数的情况下,Mathematica 显然认为复合语句的存在足以击败任何尾调用优化的机会.另请注意,第一个表达式由 $RecursionLimit 控制,第二个由 $IterationLimit 控制——这表明 Mathematica 对这两个表达式的处理方式不同.(注意:上面引用的 SO 答案有一个成功利用尾调用优化的不太做作的函数).

Both expressions define a tail recursive function f. In the case of the first function, Mathematica apparently regards the presence of a compound statement enough to defeat any chance of tail call optimization. Also note that the first expression is governed by $RecursionLimit and the second by $IterationLimit -- a sign that Mathematica is treating the two expressions differently. (Note: the SO answer referenced above has a less contrived function that successfully exploits tail call optimization).

那么,问题是:有人知道 Mathematica 在什么情况下执行递归函数的尾调用优化吗?最好参考 Mathematica 文档或其他 WRI 材料中的明确声明.也欢迎投机.

So, the question is: does anyone know the circumstances under which Mathematica performs tail-call optimization of recursive functions? A reference to a definitive statement in the Mathematica documentation or other WRI material would be ideal. Speculation is also welcome.

推荐答案

我可以总结根据我的个人经验得出的结论,但免责声明以下内容可能不是完全正确的解释.答案似乎在于 Mathematica 调用堆栈和传统调用堆栈之间的差异,这源于 Mathematica 模式定义的函数是真正的规则.所以,没有真正的函数调用.Mathematica 需要一个堆栈有一个不同的原因:因为正常的计算是从表达式树的底部发生的,它必须保留中间表达式,以防(子)表达式的更深和更深的部分由于规则应用而被替换(某些部分一个表达式从底部增长).对于在其他语言中定义我们称之为非尾递归函数的规则来说,情况尤其如此.所以,再一次,Mathematica 中的堆栈是中间表达式的堆栈,而不是函数调用.

I can summarize the conclusions I was led to by my personal experience, with a disclaimer that what follows may not be the entirely right explanation. The anwer seems to lie in the differences between Mathematica call stack and traditional call stacks, which originates from Mathematica pattern-defined functions being really rules. So, there are no real function calls. Mathematica needs a stack for a different reason: since normal evaluation happens from the bottom of an expression tree, it must keep intermediate expressions in case when deeper and deeper parts of (sub)expressions get replaced as a result of rule application (some parts of an expression grow from the bottom). This is the case, in particular, for rules defining what we'd call non tail-recursive functions in other languages. So, once again, the stack in Mathematica is a stack of intermediate expressions, not function calls.

这意味着,如果作为规则应用的结果,可以完全重写(子)表达式,则表达式分支不需要保留在表达式堆栈中.这可能就是 Mathematica 中所谓的尾调用优化——这就是为什么在这种情况下我们使用迭代而不是递归(这是规则应用程序和函数调用之间差异的一个很好的例子).f[x_]:=f[x+1] 之类的规则属于这种类型.但是,如果某些子表达式被重写,产生更多的表达式结构,则表达式必须存储在堆栈中.规则 f[x_/;×<5] := (n += 1; f[x + 1]) 属于这种类型,在我们回忆起 () 代表 CompoundExpression 之前,它有点隐藏[].示意性地,这里发生的是 f[1] ->CompoundExpression[n+=1, f[2]] ->CompoundExpression[n+=1,CompoundExpression[n+=1,f[3]]]->etc.尽管对 f 的调用每次都是最后一次,但它发生在完整的 CompoundExpression[] 执行之前,因此这仍然必须保留在表达式堆栈中.有人可能会争辩说,这是一个可以进行优化的地方,为 CompoundExpression 做一个例外,但这可能并不容易实现.

This means that if, as a result of rule application, an (sub)expression can be rewritten in its entirety, the expression branch need not be kept on the expression stack. This is probably what is referred as tail call optimization in Mathematica - and this is why in such cases we have iteration rather than recursion (this is one very good example of the differences between rule applications and function calls). Rules like f[x_]:=f[x+1] are of this type. If, however, some sub-expression get rewritten, producing more expression structure, then expression must be stored on the stack. The rule f[x_ /; x < 5] := (n += 1; f[x + 1]) is of this type, which is a bit hidden until we recall that ()stand for CompoundExpression[]. Schematically what happens here is f[1] -> CompoundExpression[n+=1, f[2]] -> CompoundExpression[n+=1,CompoundExpression[n+=1,f[3]]]->etc. Even though the call to f is the last every time, it happens before the full CompoundExpression[] executes, so this still must be kept on the expression stack. One could perhaps argue that this is a place where optimization could be made, to make an exception for CompoundExpression, but this is probably not easy to implement.

现在,为了说明我上面示意性描述的堆栈累积过程,让我们限制递归调用的次数:

Now, to illustrate the stack accumulation process which I schematically described above, let us limit the number of recursive calls:

Clear[n, f, ff, fff];
n = 0;
f[x_ /; x < 5] := (n += 1; f[x + 1]);

ff[x_] := Null /; (n += 1; False);
ff[x_ /; x < 5] := ff[x + 1];

fff[x_ /; x < 5] := ce[n += 1, fff[x + 1]];

跟踪评估:

In[57]:= Trace[f[1],f]
Out[57]= {f[1],n+=1;f[1+1],{f[2],n+=1;f[2+1],{f[3],n+=1;f[3+1],{f[4],n+=1;f[4+1]}}}}

In[58]:= Trace[ff[1],ff]
Out[58]= {ff[1],ff[1+1],ff[2],ff[2+1],ff[3],ff[3+1],ff[4],ff[4+1],ff[5]}

In[59]:= Trace[fff[1],fff]
Out[59]= {fff[1],ce[n+=1,fff[1+1]],{fff[2],ce[n+=1,fff[2+1]],{fff[3],ce[n+=1,fff[3+1]],   
{fff[4],ce[n+=1,fff[4+1]]}}}}

由此可以看出,ffff 的表达式堆栈是累积的(后者只是用来说明这是一个通用机制,ce[] 只是一些任意的头),但不适用于 ff,因为为了模式匹配的目的,ff 的第一个定义是一个规则尝试但不匹配,第二个定义完全重写 ff[arg_],并且不会生成需要进一步重写的更深的子部分.所以,底线似乎是你应该分析你的函数,看看它的递归调用是否会从底部增加计算的表达式.如果是,就 Mathematica 而言,它不是尾递归的.

What you can see from this is that the expression stack accumulates for f and fff (the latter used just to show that this is a general mechanism, with ce[] just some arbitrary head), but not for ff, because, for the purposes of pattern matching, the first definition for ff is a rule tried but not matched, and the second definition rewrites ff[arg_] in its entirety, and does not generate deeper sub-parts that need further rewriting. So, the bottom line seems that you should analyze your function and see if its recursive calls will grow the evaluated expression from the bottom or not. If yes, it is not tail-recursive as far as Mathematica is concerned.

如果不展示如何手动进行尾调用优化,我的回答将是不完整的.例如,让我们考虑 Select 的递归实现.我们将使用 Mathematica 链表使其合理高效,而不是玩具.下面是非尾递归实现的代码:

My answer would not be complete without showing how to do the tail call optimization manually. As an example, let us consider recursive implementation of Select. We will work with Mathematica linked lists to make it reasonably efficient rather than a toy. Below is the code for the non tail-recursive implementation:

Clear[toLinkedList, test, selrecBad, sel, selrec, selTR]
toLinkedList[x_List] := Fold[{#2, #1} &, {}, Reverse[x]];
selrecBad[fst_?test, rest_List] := {fst,If[rest === {}, {}, selrecBad @@ rest]};
selrecBad[fst_, rest_List] := If[rest === {}, {}, selrecBad @@ rest];
sel[x_List, testF_] := Block[{test = testF}, Flatten[selrecBad @@ toLinkedList[x]]]

我使用 Block 和 selrecBad 的原因是为了更容易使用 Trace.现在,这会破坏我机器上的堆栈:

The reason I use Block and selrecBad is to make it easier to use Trace. Now, this blows the stack on my machine:

Block[{$RecursionLimit = Infinity}, sel[Range[300000], EvenQ]] // Short // Timing

您可以跟踪小列表以了解原因:

You can trace on small lists to see why:

In[7]:= Trace[sel[Range[5],OddQ],selrecBad]

Out[7]= {{{selrecBad[1,{2,{3,{4,{5,{}}}}}],{1,If[{2,{3,{4,{5,{}}}}}==={},{},selrecBad@@{2,{3,{4, 
{5,{}}}}}]},{selrecBad[2,{3,{4,{5,{}}}}],If[{3,{4,{5,{}}}}==={},{},selrecBad@@{3,{4,{5, 
{}}}}],selrecBad[3,{4,{5,{}}}],{3,If[{4,{5,{}}}==={},{},selrecBad@@{4,{5,{}}}]},{selrecBad[4,
{5,{}}],If[{5,{}}==={},{},selrecBad@@{5,{}}],selrecBad[5,{}],{5,If[{}==={},{},selrecBad@@{}]}}}}}}

结果在列表中积累得越来越深.解决方案是不增加结果表达式的深度,实现这一点的一种方法是让 selrecBad 接受一个额外的参数,即累积结果的(链接)列表:

What happens is that the result gets accumulated deeper and deeper in the list. The solution is to not grow the depth of the resulting expression, and one way to achieve that is to make selrecBad accept one extra parameter, which is the (linked) list of accumulated results:

selrec[{fst_?test, rest_List}, accum_List] := 
    If[rest === {}, {accum, fst}, selrec[rest, {accum, fst}]];
selrec[{fst_, rest_List}, accum_List] := 
    If[rest === {}, accum, selrec[rest, accum]]

并相应地修改主函数:

selTR[x_List, testF_] := Block[{test = testF}, Flatten[selrec[toLinkedList[x], {}]]]

这将通过我们的功率测试就好了:

This will pass our power test just fine:

In[14]:= Block[{$IterationLimit= Infinity},selTR[Range[300000],EvenQ]]//Short//Timing

Out[14]= {0.813,{2,4,6,8,10,12,14,16,18,20,
<<149981>>,299984,299986,299988,299990,299992,299994,299996,299998,300000}}

(注意这里我们必须修改 $IterationLimit,这是一个好兆头).使用 Trace 揭示了原因:

(note that here we had to modify $IterationLimit, which is a good sign). And using Trace reveals the reason:

In[15]:= Trace[selTR[Range[5],OddQ],selrec]

Out[15]= {{{selrec[{1,{2,{3,{4,{5,{}}}}}},{}],If[{2,{3,{4,{5,{}}}}}==={},{{},1},selrec[{2,{3,{4, 
{5,{}}}}},{{},1}]],selrec[{2,{3,{4,{5,{}}}}},{{},1}],If[{3,{4,{5,{}}}}==={},{{},1},selrec[{3, 
{4,{5,{}}}},{{},1}]],selrec[{3,{4,{5,{}}}},{{},1}],If[{4,{5,{}}}==={},{{{},1},3},selrec[{4, 
{5,{}}},{{{},1},3}]],selrec[{4,{5,{}}},{{{},1},3}],If[{5,{}}==={},{{{},1},3},selrec[{5, 
{}},{{{},1},3}]],selrec[{5,{}},{{{},1},3}],If[{}==={},{{{{},1},3},5},selrec[{},{{{{},1},3},5}]]}}}

也就是说,这个版本不会累积中间表达式的深度,因为结果保存在一个单独的列表中.

which is, this version does not accumulate the depth of the intermediate expression, since the results are kept in a separate list.

这篇关于Mathematica 中的尾调用优化?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆