BLAS matrix by matrix transpose multiply
Question
I have to calculate some products of the form A'A, or more generally A'DA, where A is a general m x n matrix and D is a diagonal m x m matrix. Both have full rank, i.e. rank(A) = min(m,n).
I know that such symmetric products can save substantial time: since A'A is symmetric, you only have to compute the lower (or upper) triangular part of the product matrix, diagonal included. That amounts to n(n+1)/2 entries, which for large matrices is roughly half of the usual n^2.
This is a big saving that I want to exploit, and I know I could implement the matrix-matrix multiply in a for loop. However, so far I have been using BLAS, which is much faster than any for loop implementation I could write myself, since it optimizes cache and memory management.
Is there a way to efficiently compute A'A, or even A'DA, using BLAS?
Thanks!
Recommended answer
You are looking for the dsyrk subroutine of BLAS.
As stated in the documentation:
SUBROUTINE dsyrk(UPLO,TRANS,N,K,ALPHA,A,LDA,BETA,C,LDC)
DSYRK performs one of the symmetric rank k operations
   C := alpha*A*A**T + beta*C,
or
   C := alpha*A**T*A + beta*C,
where alpha and beta are scalars, C is an n by n symmetric matrix and A is an n by k matrix in the first case and a k by n matrix in the second case.
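For A'A, storing the result in the upper triangle of C, the call is:
CALL dsyrk('U', 'T', N, M, 1.0D0, A, M, 0.0D0, C, N)
Here A is M x N with leading dimension M, C is N x N, and only the upper triangle of C is written. The scalars are given as double precision constants because dsyrk expects DOUBLE PRECISION arguments for ALPHA and BETA.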
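To make the indexing concrete, here is a small reference model of what the 'U', 'T' variant computes. It is a plain Python sketch for checking results on tiny matrices, not BLAS itself and not a replacement for it; the function name syrk_upper_t and the sentinel value are made up for illustration. Arrays are flat lists in column-major order, as Fortran stores them.

```python
def syrk_upper_t(n, k, alpha, a, lda, beta, c, ldc):
    """Model of dsyrk('U', 'T', ...): C := alpha*A**T*A + beta*C.

    a holds a k x n matrix and c an n x n matrix, both column-major.
    Only entries with row <= column (the upper triangle) are written.
    """
    for j in range(n):              # column of C
        for i in range(j + 1):      # rows 0..j: upper triangle incl. diagonal
            # Dot product of columns i and j of A.
            dot = sum(a[l + i * lda] * a[l + j * lda] for l in range(k))
            # With beta == 0 the old content of C is not read.
            old = 0.0 if beta == 0.0 else beta * c[i + j * ldc]
            c[i + j * ldc] = alpha * dot + old
    return c


# A is 3 x 2 (m = 3, n = 2), stored column by column.
m, n = 3, 2
A = [1.0, 2.0, 3.0,    # column 1
     4.0, 5.0, 6.0]    # column 2
SENTINEL = -1.0        # marks entries the routine should leave alone
C = [SENTINEL] * (n * n)
syrk_upper_t(n, m, 1.0, A, m, 0.0, C, n)
print(C)               # C(2,1), the strictly lower entry, keeps the sentinel
```

The strictly lower triangle of C is left untouched, so if the full symmetric matrix is needed later, it has to be mirrored from the upper triangle by the caller.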
For A'DA there is no direct equivalent in BLAS. However, you can use dsyr in a for loop.
SUBROUTINE dsyr(UPLO,N,ALPHA,X,INCX,A,LDA)
DSYR performs the symmetric rank 1 operation
   A := alpha*x*x**T + A,
where alpha is a real scalar, x is an n element vector and A is an n by n symmetric matrix.
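The reason rank 1 updates are enough is that a diagonal D turns the product into a weighted sum over the rows of A. Written out, with a_i denoting row i of A as a column vector (notation introduced here, not taken from the BLAS documentation):

```latex
A^{T} D A \;=\; \sum_{i=1}^{m} d_{ii}\, a_i a_i^{T},
\qquad
\left(A^{T} D A\right)_{jk} \;=\; \sum_{i=1}^{m} d_{ii}\, A_{ij} A_{ik}.
```

Each term is one dsyr update with alpha = d_ii and x = a_i. Note that dsyr is a Level 2 routine, so a loop of m such updates will generally not reach the performance of the Level 3 dsyrk. If all d_ii happen to be nonnegative, one alternative is to scale row i of A by sqrt(d_ii) and call dsyrk once on the scaled matrix.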
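C = 0.0D0   ! dsyr accumulates into C, so start from zero
do i = 1, M
   ! row i of A starts at A(i,1); its elements are M (= LDA) apart in memory
   call dsyr('U', N, D(i,i), A(i,1), M, C, N)
end do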
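The row-wise accumulation can be checked on a small case with a reference model of dsyr. This is a plain Python sketch under stated assumptions, not BLAS: the function name syr_upper, the explicit start offset x0 and the values in d are all hypothetical, and arrays are flat column-major lists. It relies on A'DA being the sum over i of D(i,i) times the outer product of row i of A with itself, where row i starts at flat index i and has stride m.

```python
def syr_upper(n, alpha, x, x0, incx, a, lda):
    """Model of dsyr('U', ...): A := alpha*x*x**T + A, upper triangle only.

    The vector is x[x0], x[x0 + incx], ..., with n elements in total.
    """
    for j in range(n):
        for i in range(j + 1):
            a[i + j * lda] += alpha * x[x0 + i * incx] * x[x0 + j * incx]


m, n = 3, 2
A = [1.0, 2.0, 3.0,    # column 1 of the 3 x 2 matrix
     4.0, 5.0, 6.0]    # column 2
d = [2.0, 1.0, 0.5]    # diagonal of D (illustrative values)
C = [0.0] * (n * n)    # the updates accumulate, so C must start at zero

for i in range(m):
    # Row i of A: first element at flat index i, stride m between columns.
    syr_upper(n, d[i], A, i, m, C, n)

print(C)               # upper triangle of A'DA; the lower entry stays 0.0
```

Storing only the diagonal of D in a vector, as d does here, avoids allocating a full m x m matrix that is almost entirely zero.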