Efficient matrix transpose matrix multiplication in Eigen


Question

I have access to a number of matrix libraries, but for this project I am using Eigen, because matrix dimensions can be defined at compile time and because it includes SVD.

I am currently performing the following operation:

Eigen::Matrix<double,M,N> A;     // populated in the code

Eigen::Matrix<double,N,N> B = A.transpose() * A;

As I understand it, this makes a copy of A to form the transpose, which is then multiplied by A. The matrices are relatively small (M = 20 to 30, N = 3), but the operation runs many millions of times per second, so it must be as fast as possible.

I have read that the following is faster:

B.noalias() = A.transpose() * A;

I could write my own subroutine that accepts A as input and fills B, but I was wondering whether there is an efficient existing implementation that uses the fewest possible cycles.
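For reference, the hand-written subroutine mentioned above could look roughly like the following. This is only a sketch in plain C++ without Eigen: the name `gram`, the flat row-major `std::array` storage, and the template parameters are assumptions made for illustration. It relies on the result being symmetric, so only the upper triangle is accumulated and then mirrored.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper, not part of Eigen: computes B = A^T * A for a
// row-major M x N matrix A stored in a flat array. Because the result is
// symmetric, only entries with j >= i are accumulated explicitly.
template <std::size_t M, std::size_t N>
std::array<double, N * N> gram(const std::array<double, M * N>& a) {
    std::array<double, N * N> b{};  // zero-initialized N x N result
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = i; j < N; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < M; ++k) {
                sum += a[k * N + i] * a[k * N + j];  // A(k,i) * A(k,j)
            }
            b[i * N + j] = sum;
            b[j * N + i] = sum;  // mirror into the lower triangle
        }
    }
    return b;
}
```

With M and N known at compile time, a compiler can usually unroll the two outer loops for N = 3. Whether this beats a library implementation has to be measured on the target machine.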

Recommended answer

First of all, since Eigen relies on expression templates, A.transpose() is not evaluated into a temporary.
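The idea of a transpose that is never materialized can be illustrated without Eigen. The sketch below is not Eigen's implementation: `Mat`, `TransposeView`, `transpose`, and `multiply` are hypothetical names invented for this example. The view only stores a reference and swaps indices on access, so no elements are copied.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal row-major matrix type, for illustration only.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> data;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Lightweight view: keeps a reference to the original matrix and swaps
// the indices on every access. No element of m is copied.
struct TransposeView {
    const Mat& m;
    std::size_t rows() const { return m.cols; }
    std::size_t cols() const { return m.rows; }
    double operator()(std::size_t i, std::size_t j) const { return m(j, i); }
};

inline TransposeView transpose(const Mat& m) { return TransposeView{m}; }

// Product that reads the left operand through the view.
inline Mat multiply(const TransposeView& lhs, const Mat& rhs) {
    Mat out(lhs.rows(), rhs.cols);
    for (std::size_t i = 0; i < lhs.rows(); ++i) {
        for (std::size_t j = 0; j < rhs.cols; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < lhs.cols(); ++k) {
                sum += lhs(i, k) * rhs(k, j);
            }
            out(i, j) = sum;
        }
    }
    return out;
}
```

Calling `multiply(transpose(a), a)` reads `a` directly for both operands. Eigen's expression types are considerably more general, but they rest on the same principle of deferring the work until the result is assigned.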

Second, in:

Matrix<double,N,N> B = A.transpose() * A;

Eigen knows that B cannot appear on the right-hand side of the expression (because here the compiler calls the constructor of B), and therefore no temporary is created at all. This is equivalent to:

Matrix<double,N,N> B;             // declare first
B.noalias() = A.transpose() * A;  // eval later

Finally, for such small matrices, I don't expect B.selfadjointView<Lower>().rankUpdate(A.transpose()), which adds the product into the lower triangle of B, to help (as suggested in kennytm's comment).

On the other hand, with N = 3, it might be worth trying the lazy implementation:

B = A.transpose().lazyProduct(A);

just to be sure. Eigen has built-in heuristics to choose the best product implementation, but since the heuristic has to be simple and fast to evaluate, it might not be 100% right.
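One way to check which variant is faster is to time them directly. The helper below is a minimal sketch in standard C++ with no Eigen dependency. The name `ns_per_call` and its signature are assumptions made for this example, not an existing API. The callable must return a double, which is accumulated so that the work is harder for the optimizer to discard.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: returns the average wall-clock time per call of f,
// in nanoseconds. The values returned by f are summed into `checksum`;
// printing or otherwise using that sum afterwards discourages the
// compiler from removing the timed work.
template <class F>
double ns_per_call(F&& f, std::size_t iterations, double& checksum) {
    using clock = std::chrono::steady_clock;
    double acc = 0.0;
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        acc += f();
    }
    const auto stop = clock::now();
    checksum = acc;
    if (iterations == 0) {
        return 0.0;
    }
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

With Eigen, one lambda could evaluate `B.noalias() = A.transpose() * A;` and return `B(0,0)`, while a second does the same with `A.transpose().lazyProduct(A)`. For inputs that never change, an optimizer may still hoist the product out of the loop, so it is safer to modify A slightly between calls and to compile with the same flags as the real project.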
