如何减少大型模板的编译时内存占用量? [英] How can I reduce the compile-time memory footprint of large templates?

查看:50
本文介绍了如何减少大型模板的编译时内存占用量?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设我有一个包含 large 个其他类声明的类.是否有可能以某种方式分散这些开销,以使嵌套类型的编译时内存消耗不会二次增加?如果愿意,我愿意在编译时间上花点时间,如果可以的话,我很乐意将其划分为不同的翻译单元.

Suppose I have a class which contains a large number of other class declarations. Is it possible to spread the cost of these out somehow so that compile-time memory consumption doesn't grow quadratically for nested types? I'm willing to take a hit on compile time if need be, and would gladly divide this out into different translation units if that were an option.

为尝试解决这个问题,我编写了以下程序,该程序说明了导致这些故障的代码的简化版本:

To try and come up with a solution to this I've written the following program that illustrates a simplified version of the kind of code that leads to these blowouts:

// Add T to the list of args of SeqWithArgs N number of times:
template <int N, typename T, typename SeqWithArgs>
struct Append;

template <int N, typename T, template <typename...> class Seq, typename... Args>   
struct Append<N, T, Seq<Args...>>                                                  
{                                                                                  
    using type = typename Append<N-1, T, Seq<Args..., T>>::type;                   
};                                                                                 

template <typename T, template<typename...> class Seq, typename... Args>           
struct Append<0, T, Seq<Args...>>                                                  
{                                                                                  
    using type = Seq<Args...>;                                                     
};                                                                                 

static constexpr const int N = 10;                                              

// Tuple containing N instances of int
using Small = typename Append<N, int, std::tuple<>>::type;

// Tuple containing N instances of Small
using Big = typename Append<N, Small, std::tuple<>>::type;

// Tuple containing N instances of Big
using Huge = typename Append<N, Big, std::tuple<>>::type;

int main()
{                                                                               
    Huge h;                                                                     
    return 0;                                                                   
}

正如Yakk严格指出的那样,这些 Append 操作效率极低.但是,在此代码的实际版本中,修改这些代码将需要对代码进行基本的重组.

As righly pointed out by Yakk, these Append operations are horribly inefficient. However, in the real version of this code modifying these would require a fundamental restructure of the code.

我将 N 从10更改为70,并使用 GCC 4.8.1 编译了我的程序.我还运行了 time -v make 以获得峰值驻留内存使用率.以下是仅使用默认标志的结果:

I varied N from 10 to 70 and got this result compiling my program using GCC 4.8.1. I also ran time -v make to obtain peak resident memory usage. Here are the results using just the default flags:

对我来说,这个结果似乎是多余的,不是因为形状(它应该是O(N ^ 3)且似乎遵循该形状),而是因为大小.看起来好像 Small 扩展了 Big 每个实例,而 Big 又扩展了 Huge 的每个实例.在较少使用大量模板的代码中,通常可以使用 extern 关键字声明某种类型的泛型专门化,因此可以避免这种嵌套扩展",但是这些是 types ,而不是;这种结构是否存在类似的东西?

This result seems excessive to me, not because of the shape (it is expected to be O(N^3) and seems to follow that shape), but because of the magnitude. It almost looks like Small is being expanded for every instantiation of Big, and Big in turn is being expanded for every instantiation of Huge. In less template-heavy code one would normally declare an generic specialisation of some type using the extern keyword and would therefore avoid this "nested expansion", but these are types, not values; does something similar exist for this kind of construct?

什么原因导致内存爆炸?如何在不更改 Small Big 类型的情况下减小内存占用 >和巨大?

What is the cause of this blowout in memory and what can I do to reduce this memory footprint without changing the type of Small, Big and Huge?

推荐答案

Append 生成 Seq< T> Seq< T,T> ... Seq< T,...,T> .这比它生成 Append< n-1,T,Seq< T>> 的事实要少,它是每个 N 的独特类型每次递归.

Append generates Seq<T> Seq<T,T> ... Seq<T,...,T>. Which is less of a problem than the fact it generates Append<n-1, T, Seq<T>>, which is a distinct and unique type for each N and each recursion.

它生成名称总长度为 O(n ^ 2 * | T |) N 个唯一类型,并且输出类型的大小为 O(n* | T |).

It generates N unique types of total name length O(n^2*|T|), and the output type is of size O(n*|T|).

然后我们将其链接起来.

We then chain this.

Big生成总大小为 O(n ^ 2 * O(n * | int |))的类型,最终类型大小为 O(n ^ 2 | int |)代码>.大小为 O(n ^ 2 * O(n ^ 2 | int |))= O(n ^ 4 | int |)的巨大类型.

Big generates types of total size O(n^2*O(n*|int|)), with end type size O(n^2|int|). Huge types of size O(n^2*O(n^2|int|))=O(n^4|int|).

生成了很多类型.

70 ^ 4 = 5000 ^ 2 = O(2500万)总类型长度.

70^4=5000^2=O(25 million) total type length.

我们可以在减少脑部死亡的情况下做得更好.只需三步即可完成.

We can do better with a less brain dead Append. Do it in three steps.

转录接受 t< Ts ...> template< class ...> class Seq 并将其复制 Ts ... 结束.

transcribe takes a t<Ts...> and a template<class...>class Seq and copies the Ts... over.

template<class...>struct t{using type=t;};
template<class src, template<class...>class dest>
struct transcribe;
template<class...Ts, template<class...>class dest>
struct transcribe<t<Ts...>,dest>{
  using type=dest<Ts...>;
};
template<class src, template<class...>class dest>
using transcribe_t=typename transcribe<src, dest>::type;

append 接受任意数量的 t< ...> s并将其附加.

append takes any number of t<...>s and appends them.

template<class... ts>
struct append;
template<>struct append<>{using type=t<>;};
template<class...Ts>struct append<t<Ts...>>{using type=t<Ts...>;};

template<class... Ts, class... Us, class... Zs>
struct append<t<Ts...>,t<Us...>,Zs....>:
  append<t<Ts...,Us...>,Zs...>
{};
template<class...ts>
using append_t=typename append<ts...>::type;

breaker 取一个无符号值 N 并将其分解为2的幂,输出 v v< 0,1,3,4> ( 2 ^ 0 + 2 ^ 1 + 2 ^ 3 + 2 ^ 4 )

breaker takes an unsigned value N and breaks it down to powers of two, outputting v<0,1,3,4> (2^0+2^1+2^3+2^4) for 26.

template<unsigned...>struct v{using type=v;};
template<unsigned X, class V=v<>, unsigned B=0, class=void>
struct breaker;
template<unsigned X, unsigned...vs, unsigned B>
struct breaker<X, v<vs...>, B, typename std::enable_if<
  X&(1<<B)
>::type>:breaker<X&~(1<<B), v<vs...,B>, B+1>
{};
template<unsigned X, unsigned...vs, unsigned B>
struct breaker<X, v<vs...>, B, typename std::enable_if<
  !(X&(1<<B))
>::type>:breaker<X&~(1<<B), v<vs...>, B+1>
{};
template<unsigned X, unsigned...vs, unsigned B>
struct breaker<0, v<vs...>, B, void>
{
  using type=v<vs...>;
};
template<unsigned X>
using breaker_t=typename breaker<X>::type;

Build接受一个无符号值 N 和一个 T .它中断 N .然后,我们将二进制幂构建为 t< T,T,T,...,T> s.然后我们将它们附加.

Build takes an unsigned value N and a T. It Breaks the N. Then we build the power of twos into t<T,T,T,...,T>s. Then we append them.

然后我们获取输出并转录为 Seq< ...> .

Then we take the output and transcribe to Seq<...>.

这将生成 O(N * logN * logN)类型.因此对于较大的N可能更好.另外,生成的大多数类型都小而简单,这是一个优点.

This generates O(N*logN*logN) types. So for large N may be better. Plus most of the types generated are small and simple, which is a plus.

充其量可以将您的负载减少10倍.值得一试.

At best this reduces your load by a factor of 10. Worth a try.

这篇关于如何减少大型模板的编译时内存占用量?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆