Is using alloca() for variable length arrays better than using a vector on the heap?

Question

I have some code using a variable length array (VLA), which compiles fine in gcc and clang, but does not work with MSVC 2015.

class Test {
public:
    Test() {
        P = 5;
    }
    void somemethod() {
        int array[P];  // VLA: a gcc/clang extension, rejected by MSVC
        // do something with the array
    }
private:
    int P;
};

There seem to be two solutions in the code:

  • using alloca(), taking its risks into account by making absolutely sure not to access elements outside of the array.
  • using a vector member variable (assuming that the overhead of a vector compared with a C array is not the limiting factor, as long as P is constant after construction of the object)

The vector would be more portable (fewer #ifdef blocks testing which compiler is used), but I suspect alloca() to be faster.

The vector implementation would look like this:

#include <vector>

class Test {
public:
    Test() : P(5) {
        init();
    }
    void init() {
        array.resize(P);  // allocate once, reuse in every call
    }
    void somemethod() {
        // do something with the array
    }
private:
    int P;
    std::vector<int> array;
};

Another consideration: when I only change P outside of the function, is having an array on the heap that isn't reallocated even faster than having a VLA on the stack?

The maximum P will be about 400.

Recommended answer

You could and probably should use some dynamically allocated heap memory, such as that managed by a std::vector (as answered by Peter). You could use smart pointers, or plain raw pointers (new, malloc, ...) that you must not forget to release (delete, free, ...). Notice that heap allocation is probably faster than you believe (in practice, much less than a microsecond on current laptops most of the time).
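As a minimal sketch of the two owning options mentioned above, the following compares a std::vector with a std::unique_ptr<int[]>. The function names sumWithVector and sumWithUniquePtr are hypothetical, and std::make_unique assumes a C++14 compiler. In both cases the memory is released automatically when the owner goes out of scope.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: fills a vector-owned heap array with 0..n-1
// and returns the sum of its elements.
long sumWithVector(std::size_t n) {
    std::vector<int> a(n);
    for (std::size_t i = 0; i < n; ++i) a[i] = static_cast<int>(i);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += a[i];
    return total;  // the vector frees its storage here
}

// Hypothetical helper: same work with a smart pointer owning a raw array.
long sumWithUniquePtr(std::size_t n) {
    std::unique_ptr<int[]> a = std::make_unique<int[]>(n);  // C++14
    for (std::size_t i = 0; i < n; ++i) a[i] = static_cast<int>(i);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += a[i];
    return total;  // unique_ptr calls delete[] here
}
```

Unlike the vector, the unique_ptr does not remember its own length, so the caller has to carry n around separately.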

Sometimes you can move the allocation out of some inner loop, or grow it only occasionally (so for a realloc-like thing, better use unsigned newsize=5*oldsize/4+10; than unsigned newsize=oldsize+1; i.e. have some geometrical growth). If you can't use vectors, be sure to keep separate allocated size and used lengths (as std::vector does internally).
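The growth rule above could be sketched roughly as follows, for the case where std::vector is not available. The IntBuffer type and its member names are hypothetical, chosen only for illustration; it keeps the allocated size and the used length separate and grows with the 5*oldsize/4+10 formula from the text.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical growable int buffer (not from the original answer).
struct IntBuffer {
    int* data = nullptr;
    std::size_t used = 0;       // number of elements currently in use
    std::size_t allocated = 0;  // number of elements the block can hold

    IntBuffer() = default;
    IntBuffer(const IntBuffer&) = delete;             // no accidental copies
    IntBuffer& operator=(const IntBuffer&) = delete;  // of the raw pointer
    ~IntBuffer() { std::free(data); }

    void push(int value) {
        if (used == allocated) {
            // Geometric growth: reallocate only occasionally.
            std::size_t newsize = 5 * allocated / 4 + 10;
            void* p = std::realloc(data, newsize * sizeof(int));
            if (p == nullptr) throw std::bad_alloc();
            data = static_cast<int*>(p);
            allocated = newsize;
        }
        data[used++] = value;
    }
};
```

With this rule, pushing 100 elements into an empty buffer calls realloc six times (capacities 10, 22, 37, 56, 80, 110), whereas growing by one element at a time would call it 100 times.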

Another strategy would be to special case small sizes vs bigger ones. e.g. for an array less than 30 elements, use the call stack; for bigger ones, use the heap.
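One portable way to sketch this small-versus-large split, without VLAs or alloca, is a fixed-size stack array with a heap fallback. The threshold of 30 comes from the example in the text; the names kSmallLimit and sumOfSquares are hypothetical, and the computation is only a stand-in for real work on the scratch array.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Threshold from the answer's example; the constant name is hypothetical.
constexpr std::size_t kSmallLimit = 30;

// Computes 0*0 + 1*1 + ... + (n-1)*(n-1) using a scratch array that lives
// on the stack when n < kSmallLimit and on the heap otherwise.
long sumOfSquares(std::size_t n) {
    std::array<int, kSmallLimit> small;  // fixed-size stack storage
    std::vector<int> large;              // stays empty unless needed
    int* scratch = small.data();
    if (n >= kSmallLimit) {
        large.resize(n);                 // heap allocation for big sizes
        scratch = large.data();
    }
    for (std::size_t i = 0; i < n; ++i) scratch[i] = static_cast<int>(i * i);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += scratch[i];
    return total;
}
```

The stack array always occupies kSmallLimit ints in the frame, which is the price of avoiding a non-standard extension.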

If you insist on allocating on the call stack (using alloca, or VLAs, which are a commonly available extension to standard C++11), be wise and limit your call frame to a few kilobytes. The total call stack is limited to some implementation-specific size (often about a megabyte, or a few megabytes, on many laptops). In some OSes you can raise that limit (see also setrlimit(2) on Linux).
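On POSIX systems the current limit can be inspected with getrlimit(2), the companion of setrlimit(2). The sketch below is POSIX-only (it will not compile under MSVC); the helper names softStackLimitBytes and fitsInSmallFrame, and the 4096-byte budget, are assumptions made for illustration.

```cpp
#include <sys/resource.h>  // POSIX only: getrlimit, RLIMIT_STACK
#include <cstddef>

// Hypothetical helper: returns the soft stack limit in bytes,
// or 0 if it is unlimited or cannot be queried.
unsigned long long softStackLimitBytes() {
    rlimit lim;
    if (getrlimit(RLIMIT_STACK, &lim) != 0) return 0;
    if (lim.rlim_cur == RLIM_INFINITY) return 0;
    return static_cast<unsigned long long>(lim.rlim_cur);
}

// Hypothetical check: does an array of n ints stay within a small,
// assumed per-frame budget (4096 bytes by default)?
bool fitsInSmallFrame(std::size_t n, std::size_t budgetBytes = 4096) {
    return n <= budgetBytes / sizeof(int);
}
```

With the question's maximum of about 400 elements and a 4-byte int, the array needs roughly 1600 bytes, which stays inside such a budget.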

Be sure to benchmark before hand-tuning your code. Don't forget to enable compiler optimization (e.g. g++ -O2 -Wall with GCC) before benchmarking. Remember that cache misses are generally much more expensive than heap allocation. Don't forget that developer time also has a cost (which is often comparable to the cumulative hardware cost).
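A very rough timing harness for the heap case might look like the sketch below. The function name averageVectorNanos is hypothetical, n is assumed to be greater than zero, and the numbers it produces depend heavily on the compiler, the flags, and the allocator; an optimizer may also remove part of the work, so treat the result as an indication only.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical micro-benchmark: average nanoseconds needed to construct,
// touch, and destroy a std::vector<int> of n elements (n > 0 assumed).
double averageVectorNanos(std::size_t n, int rounds) {
    using clock = std::chrono::steady_clock;
    volatile long sink = 0;  // discourages the optimizer from dropping the loop
    auto start = clock::now();
    for (int r = 0; r < rounds; ++r) {
        std::vector<int> v(n, r);  // one heap allocation per round
        sink = sink + v[n / 2];    // read one element so the data is used
    }
    auto stop = clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / rounds;
}
```

Calling averageVectorNanos(400, 1000000) in a build compiled with -O2 gives a per-allocation figure to compare against the same loop written with a member vector that is allocated once.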

Notice that using static variables or data also has issues (it is not reentrant, not thread-safe, and not async-signal-safe; see signal-safety(7)) and is less readable and less robust.