Performance bottleneck in Eigen program


Question

As part of a larger problem, I'm running into a performance bottleneck when dealing with sparse matrices in Eigen.

I need to subtract a floating-point number (x) from every element of a sparse matrix (G), including the positions where the coefficient is zero, so the zero elements should end up with the value -x.

At the moment I do this as follows:

//calculate G
x=0.01;
for(int i=0;i<rows;i++){
   for (int j=0; j<cols; j++) {
       G.coeffRef(i, j) -= x;
   }
}

When G is large, this simple calculation becomes a bottleneck.

I've also tried converting the sparse matrix G into a dense one and subtracting P (a matrix filled with the value x), element by element:

MatrixXd DenseG=MatrixXd(G);
x=0.01;
for(int i=0;i<rows;i++){
   for (int j=0; j<cols; j++) {
       DenseG(i, j) -= x;
   }
}

This method is much faster. However, I'm wondering whether there are other approaches that do not involve converting G into a dense matrix, which requires a lot of memory when the matrix is very large.

Recommended answer

Your "sparse" calculation is effectively a dense one, since you are subtracting from all n^2 elements. A major difference is that, instead of a single operation running over a contiguous block of memory, you have to insert a new entry into the matrix (which may mean allocating memory or shifting stored data) pretty much every time you access a zero element. In general, sparse matrices are efficient when they are actually sparse, and they incur a lot of overhead for most operations. That overhead is offset by having to store only very few elements and therefore repeating each operation only a few times.

Another possible option is to take advantage of Eigen's lazy evaluation, but that depends on your exact requirements, which you have not listed here.
