pass-by-reference阻碍了尾部调用消除的gcc [英] Pass-by-reference hinders gcc from tail call elimination
问题描述
请参阅 BlendingTable :: create
和 BlendingTable :: print
。两者都具有相同形式的尾递归,但是 create
将被优化为一个循环, print
一个堆栈溢出。
See BlendingTable::create
and BlendingTable::print
. Both have the same form of tail recursion, but while create
will be optimized as a loop, print
will not and cause a stack overflow.
向下看看一个修补程序,我从一个gcc开发者的一个提示我的错误报告这个问题。
Go down to see a fix, which I got from a hint from one of the gcc devs on my bug report of this problem.
#include <cstdlib>
#include <iostream>
#include <memory>
#include <array>
#include <limits>
class System {
public:
template<typename T, typename... Ts>
static void print(const T& t, const Ts&... ts) {
std::cout << t << std::flush;
print(ts...);
}
static void print() {}
template<typename... Ts>
static void printLine(const Ts&... ts) {
print(ts..., '\n');
}
};
template<typename T, int dimension = 1>
class Array {
private:
std::unique_ptr<T[]> pointer;
std::array<int, dimension> sizes;
int realSize;
public:
Array() {}
template<typename... Ns>
Array(Ns... ns):
realSize(1) {
checkArguments(ns...);
create(1, ns...);
}
private:
template<typename... Ns>
static void checkArguments(Ns...) {
static_assert(sizeof...(Ns) == dimension, "dimension mismatch");
}
template<typename... Ns>
void create(int d, int n, Ns... ns) {
realSize *= n;
sizes[d - 1] = n;
create(d + 1, ns...);
}
void create(int) {
pointer = std::unique_ptr<T[]>(new T[realSize]);
}
int computeSubSize(int d) const {
if (d == dimension) {
return 1;
}
return sizes[d] * computeSubSize(d + 1);
}
template<typename... Ns>
int getIndex(int d, int n, Ns... ns) const {
return n * computeSubSize(d) + getIndex(d + 1, ns...);
}
int getIndex(int) const {
return 0;
}
public:
template<typename... Ns>
T& operator()(Ns... ns) const {
checkArguments(ns...);
return pointer[getIndex(1, ns...)];
}
int getSize(int d = 1) const {
return sizes[d - 1];
}
};
class BlendingTable : public Array<unsigned char, 3> {
private:
enum {
SIZE = 0x100,
FF = SIZE - 1,
};
public:
BlendingTable():
Array<unsigned char, 3>(SIZE, SIZE, SIZE) {
static_assert(std::numeric_limits<unsigned char>::max() == FF, "unsupported byte format");
create(FF, FF, FF);
}
private:
void create(int dst, int src, int a) {
(*this)(dst, src, a) = (src * a + dst * (FF - a)) / FF;
if (a > 0) {
create(dst, src, a - 1);
} else if (src > 0) {
create(dst, src - 1, FF);
} else if (dst > 0) {
create(dst - 1, FF, FF);
} else {
return;
}
}
void print(int dst, int src, int a) const {
System::print(static_cast<int>((*this)(FF - dst, FF - src, FF - a)), ' ');
if (a > 0) {
print(dst, src, a - 1);
} else if (src > 0) {
print(dst, src - 1, FF);
} else if (dst > 0) {
print(dst - 1, FF, FF);
} else {
System::printLine();
return;
}
}
public:
void print() const {
print(FF, FF, FF);
}
};
int main() {
BlendingTable().print();
return EXIT_SUCCESS;
}
更改系统
来自
class System {
public:
template<typename T, typename... Ts>
static void print(const T& t, const Ts&... ts) {
std::cout << t << std::flush;
print(ts...);
}
static void print() {}
template<typename... Ts>
static void printLine(const Ts&... ts) {
print(ts..., '\n');
}
};
到
class System {
public:
template<typename T, typename... Ts>
static void print(T t, Ts... ts) {
std::cout << t << std::flush;
print(ts...);
}
static void print() {}
template<typename... Ts>
static void printLine(Ts... ts) {
print(ts..., '\n');
}
};
magically允许gcc消除尾调用。
magically allows gcc to eliminate the tail calls.
为什么是否通过引用传递函数参数在gcc的行为中有这么大的区别?
Why does 'whether or not passing function arguments by reference' make such a big difference in gcc's behaviour? Semantically they both look the same to me in this case.
推荐答案
由@jxh注意到,演员 static_cast< int>()
创建一个临时的引用传递给 print
函数。没有这样的转换,尾递归被正确优化。
As it is noted by @jxh the cast static_cast<int>()
creates a temporary whose reference is passed to the print
function. Without such cast the tail recursion is optimized correctly.
问题与旧情况非常相似为什么gcc尾调用优化时gcc是?,解决方法可能类似于http://stackoverflow.com/a/31793391/4023446 。
The issue is very similar to the old case Why isn't g++ tail call optimizing while gcc is? and the workaround may be similar to http://stackoverflow.com/a/31793391/4023446.
仍然可以如果调用 System :: print
将使用 System
函数 SystemPrint
:
It is still possible to use System
with the arguments passed by reference if call to System::print
will be moved to a separate private helper function SystemPrint
:
class BlendingTable : public Array<unsigned char, 3> {
//...
private:
void SystemPrint(int dst, int src, int a) const
{
System::print(static_cast<int>((*this)(FF - dst, FF - src, FF - a)), ' ');
}
void print(int dst, int src, int a) const {
SystemPrint(dst, src, a);
if (a > 0) {
print(dst, src, a - 1);
} else if (src > 0) {
print(dst, src - 1, FF);
} else if (dst > 0) {
print(dst - 1, FF, FF);
} else {
System::printLine();
return;
}
}
// ...
}
现在尾调用优化工作(g ++(Ubuntu / Linaro 4.7.2-2ubuntu1)4.7.2使用优化选项-O2)和 print
不会导致堆栈溢出。
Now the tail call optimization works (g++ (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2 with the optimization option -O2) and the print
does not cause a stack overflow.
更新
我用其他编译器验证了它:
I verified it with other compilers:
- 原始代码没有任何更改是完全优化的clang ++苹果LLVM版本5.1(clang-503.0.40 )(基于LLVM 3.4svn)与-O1优化
- g ++(Ubuntu 4.8.4-2ubuntu1〜14.04)4.8.4无法执行TCO,即使没有强制转换或具有包装函数
SystemPrint
解决方法;这里只有使用System :: print
值的参数的解决方法。
- the original code without any change is perfectly optimized by clang++ Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn) with -O1 optimization
- g++ (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4 fails to perform TCO even without the cast or with the wrapping function
SystemPrint
workaround; here only the workaround withSystem::print
arguments by values works.
所以,问题是非常具体的编译器版本。
So, the issue is very specific to compiler versions.
这篇关于pass-by-reference阻碍了尾部调用消除的gcc的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!