优化C ++二维数组 [英] Optimising C++ 2-D arrays

查看:178
本文介绍了优化C ++二维数组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要一种方法来表示C ++中的双精度二维数组(密集矩阵),并具有绝对最小的访问开销。

我在各种linux / unix机器和gcc版本上做了一些计时。一个STL向量向量,声明为:

  vector< vector< double> >矩阵(N,矢量<双>(N)); 

并通过矩阵[i] [j]访问

  double * matrix = new double [n]的访问速度比访问时缓慢5%到100% * N]; 

通过内联索引函数 matrix [index(i,j)] ,其中 index(i,j)的计算结果为i + n * j。其他安排没有STL的2-D数组的方法 - n个指向每行开始的指针数组,或者定义堆栈上的整个事物为常量大小 matrix [n] [n] - 以几乎与索引函数方法相同的速度运行。



最近的GCC版本(> 4.0)似乎能够编译STL vector-of-vectors在开启优化时与非STL代码效率几乎相同,但这有点依赖于机器。



我想如果可能的话使用STL,但必须选择最快的解决方案。没有人有任何使用GCC优化STL的经验吗?

解决方案

如果你使用GCC,编译器可以分析你的矩阵访问和在某些情况下更改内存中的顺序。神奇的编译器标志被定义为:

  -fipa-matrix-reorg 




执行矩阵展平和
转置。矩阵展平试图用
来替换一个m维矩阵,其中
是其等价的n维矩阵,
其中n <米这降低了访问矩阵的
元素所需的
间接的级别。第二个
优化是矩阵转置
,试图改变
矩阵的维数顺序,以便
提高缓存局部性。两个
优化都需要编程
标志。转换仅在
分析信息可用的情况下启用。


请注意,此选项不是由-O2或-O3启用的。你必须自己传递它。


I need a way to represent a 2-D array (a dense matrix) of doubles in C++, with absolute minimum accessing overhead.

I've done some timing on various linux/unix machines and gcc versions. An STL vector of vectors, declared as:

vector<vector<double> > matrix(n,vector<double>(n));

and accessed through matrix[i][j] is between 5% and 100% slower to access than an array declared as:

double *matrix = new double[n*n];

accessed through an inlined index function matrix[index(i,j)], where index(i,j) evaluates to i+n*j. Other ways of arranging a 2-D array without STL - an array of n pointers to the start of each row, or defining the whole thing on the stack as a constant size matrix[n][n] - run at almost exactly the same speed as the index function method.

Recent GCC versions (> 4.0) seem to be able to compile the STL vector-of-vectors to nearly the same efficiency as the non-STL code when optimisations are turned on, but this is somewhat machine-dependent.

I'd like to use STL if possible, but will have to choose the fastest solution. Does anyone have any experience in optimising STL with GCC?

解决方案

If you're using GCC the compiler can analyze your matrix accesses and change the order in memory in certain cases. The magic compiler flag is defined as:

-fipa-matrix-reorg

Perform matrix flattening and transposing. Matrix flattening tries to replace a m-dimensional matrix with its equivalent n-dimensional matrix, where n < m. This reduces the level of indirection needed for accessing the elements of the matrix. The second optimization is matrix transposing that attemps to change the order of the matrix's dimensions in order to improve cache locality. Both optimizations need fwhole-program flag. Transposing is enabled only if profiling information is avaliable.

Note that this option is not enabled by -O2 or -O3. You have to pass it yourself.

这篇关于优化C ++二维数组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆