gfortran支持尾叫消除吗? [英] Does gfortran support tail call elimination?

查看:66
本文介绍了gfortran支持尾叫消除吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如果gfortran确实消除了尾部调用,我做了这个小程序来测试:

I made this small program to test, if gfortran does tail call elimination:

program tailrec
implicit none

print *, tailrecsum(5, 0)

contains

recursive function tailrecsum (x, running_total) result (ret_val)
    integer, intent(in) :: x
    integer, intent(in) :: running_total
    integer             :: ret_val

    if (x == 0) then
        ret_val = running_total
        return
    end if
    ret_val = tailrecsum (x-1, running_total + x)
end function tailrecsum

end program

要进行检查,我使用-S选项对其进行了编译,以查看说明.这是tailrecsum函数的行:

To check, I compiled it with the -S option, to take a look at the instructions. Here the lines for the tailrecsum function:

tailrecsum.3429:
.LFB1:
    .cfi_startproc
    movl    (%rdi), %eax
    testl   %eax, %eax
    jne .L2
    movl    (%rsi), %eax
    ret
    .p2align 4,,10
    .p2align 3
.L2:
    subq    $24, %rsp
    .cfi_def_cfa_offset 32
    leal    -1(%rax), %edx
    addl    (%rsi), %eax
    leaq    8(%rsp), %rdi
    leaq    12(%rsp), %rsi
    movl    %edx, 8(%rsp)
    movl    %eax, 12(%rsp)
    call    tailrecsum.3429
    addq    $24, %rsp
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc

最后,我看到了 call tailrecsum.3429 ,因此,认为没有尾声消除功能.当我使用 -O2 -O3 -foptimize-sibling-calls 时,也是如此.那么,gfortran不支持此功能吗?还是我的代码有问题?

At the end, I see call tailrecsum.3429 and therefore, think that there is no tail call elimination. This is the same also when I use -O2 or -O3 and -foptimize-sibling-calls. So, does gfortran not support this or is it a problem of my code?

推荐答案

它确实支持它.要避免很多非常细微的陷阱(它们会损害尾调用优化)是非常棘手的.

It does support it. It is quite tricky to avoid many very subtle traps which harm the tail call optimization.

如果按值传递参数,则编译器优化尾调用变得更加简单.在那种情况下,接收程序不需要指向它的临时对象(地址).

It becomes simpler for the compiler to optimize tail calls if you pass the arguments by value. In that case there is no temporary to which the receiving procedure needs to have a pointer (address).

实际上,此修改足以消除尾部调用并启用无限递归:

In fact, this modification is enough to get the tail call elimination and enable unlimited recursion:

recursive function tailrecsum (x, running_total) result (ret_val) bind(C)
    integer, value :: x
    integer, value :: running_total
    integer             :: ret_val

    if (x == 0) then
        ret_val = running_total
        return
    end if
    ret_val = tailrecsum (x-1, running_total + x)
end function tailrecsum

Gfortran不需要 bind(C),因为它将所有 value 都实现为类似C的值传递.英特尔确实需要它,因为它会创建一个临时文件并传递其地址.

Gfortran does not require the bind(C) because it implements all value as C-like pass by value. Intel does require it because it creates a temporary and passes its address.

在不同的体系结构上,细节可能有所不同,具体取决于谁负责清理什么内容.

The details may differ on different architectures, depending on who is responsible for the cleanup of what.

考虑此版本:

program tailrec
use iso_fortran_env

implicit none

integer(int64) :: acc, x

acc = 0

x = 500000000

call tailrecsum(x, acc)

print *, acc

contains

recursive subroutine tailrecsum (x, running_total)
    integer(int64), intent(inout) :: x
    integer(int64), intent(inout) :: running_total
    integer(int64)             :: ret_val

    if (x == 0)  return
    
    running_total = running_total + x
    x = x - 1
    call tailrecsum (x, running_total)
end subroutine tailrecsum



end program

经过5亿次迭代,它显然会在没有TCO的情况下炸毁堆栈,但事实并非如此:

With 500000000 iterations it would clearly blow the stack without TCO, but it does not:

> gfortran -O2 -frecursive tailrec.f90 
> ./a.out 
   125000000250000000

您可以使用 -fdump-tree-optimized 检查编译器更容易执行的操作.老实说,我什至不去尝试理解您的程序集输出.X86组装对我来说太深奥了,我的大脑只能处理某些RISC.

You can examine what the compiler does more easily using -fdump-tree-optimized. Honestly, I didn't even bother trying to understand your assembly output. X86 assembly is simply too esoteric for me, my simple brain can handle only certain RISCs.

您可以看到在原始版本中调用下一个迭代之后,仍有很多事情发生:

You can see that there is still a lot going on after the call to the next iteration in your original version:

  <bb 6>:
  _25 = _5 + -3;
  D.1931 = _25;
  _27 = _18 + _20;
  D.1930 = _27;
  ret_val_28 = tailrecsum (&D.1931, &D.1930);
  D.1930 ={v} {CLOBBER};
  D.1931 ={v} {CLOBBER};

  <bb 7>:
  # _29 = PHI <_20(5), ret_val_28(6)>

  <bb 8>:
  # _22 = PHI <_11(4), _29(7)>

  <bb 9>:
  # _1 = PHI <ret_val_7(3), _22(8)>
  return _1;

}

我不是GIMPLE的专家,但是 D.193x 操作肯定链接到为调用而放在堆栈中的临时表达式.

I am not an expert in GIMPLE, but the D.193x operations are definitely linked to the temporary expressions that are put on the stack for the call.

PHI 操作然后根据if语句(

The PHI operations then find which version of the return value will be actually returned based on which branch was actually taken in the if statement (https://gcc.gnu.org/onlinedocs/gccint/SSA.html).

就像我说的那样,有时将代码简化为正确的格式有时是棘手的,这对于gfortran进行尾部调用优化是可以接受的.

As I said it is sometimes tricky to simplify your code to the right form which is acceptable for gfortran to perform the tail call optimization.

这篇关于gfortran支持尾叫消除吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆