Performance bottleneck in Eigen program
Problem description
As part of a larger problem, I'm running into a performance bottleneck when dealing with sparse matrices in Eigen.
I need to subtract a floating-point number (x) from each element of a sparse matrix (G), including the positions where the coefficients are zero. So the zero elements should end up with the value -x.
The way I do this at the moment is as follows:
// calculate G
x = 0.01;
for (int i = 0; i < rows; i++) {
    for (int j = 0; j < cols; j++) {
        // coeffRef inserts a new entry whenever (i, j) is not stored yet
        G.coeffRef(i, j) -= x;
    }
}
When G is large, this simple calculation becomes a bottleneck.
I've also tried converting the sparse matrix G into a dense one and subtracting P (a matrix filled with the value x):
MatrixXd DenseG = MatrixXd(G);  // dense copy of G
x = 0.01;
for (int i = 0; i < rows; i++) {
    for (int j = 0; j < cols; j++) {
        DenseG(i, j) -= x;
    }
}
This method is much faster. However, I'm wondering whether there are other workarounds that do not involve converting G into a dense matrix, since that requires a lot of memory when the matrix is very large.
Recommended answer
Your "sparse" calculation is effectively a dense one, because you are subtracting from all n^2 elements. A major difference is that, instead of a single operation running over one swath of memory, the matrix has to make room for a new entry, which can mean allocating memory, pretty much every time you access a zero element. In general, sparse matrices are efficient only when they are actually sparse, and they incur a lot of overhead for most operations. That overhead is balanced out by having to store very few elements, so the operations are repeated only a few times.
Another possible option is to take advantage of Eigen's lazy evaluation, but that depends on your exact requirements, which you have not listed here.