BLAS matrix by matrix transpose multiply

Question

I have to compute products of the form A'A or, more generally, A'DA, where A is a general m x n matrix and D is a diagonal m x m matrix. Both have full rank, i.e. rank(A) = min(m,n).

I know that such symmetric products allow a substantial saving: since A'A is symmetric, only the lower (or upper) triangular part of the product matrix has to be computed. That amounts to n(n+1)/2 entries, which is roughly half of the n^2 entries of the full product for large n.

This is a large saving that I want to exploit, and I know I could implement the matrix-matrix multiply myself with for loops. However, so far I have been using BLAS, which is much faster than any loop implementation I could write myself, since it optimizes cache and memory management.

Is there a way to efficiently compute A'A or even A'DA using BLAS? Thanks!

Recommended answer

You are looking for the dsyrk subroutine of BLAS.

As described in the documentation:

SUBROUTINE dsyrk(UPLO,TRANS,N,K,ALPHA,A,LDA,BETA,C,LDC)

DSYRK performs one of the symmetric rank k operations

C := alpha*A*A**T + beta*C

C := alpha*A**T*A + beta*C

where alpha and beta are scalars, C is an n by n symmetric matrix and A is an n by k matrix in the first case and a k by n matrix in the second case.

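For A'A, with A stored as an M by N array and only the upper triangle of the N by N result C computed, the call is as follows (ALPHA and BETA must be double precision, hence 1.0d0 and 0.0d0):

CALL dsyrk('U', 'T', N, M, 1.0d0, A, M, 0.0d0, C, N)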
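To check the argument order, here is a minimal self-contained sketch. The sizes m = 4 and n = 3, the random test data, and the comparison against the intrinsic matmul are illustrative assumptions and not part of the original answer. The program has to be linked against a BLAS library (for example with -lblas).

```fortran
program syrk_demo
    implicit none
    integer, parameter :: dp = kind(1.0d0)
    integer, parameter :: m = 4, n = 3   ! illustrative sizes
    real(dp) :: A(m,n), C(n,n), ref(n,n)
    integer :: i, j
    external :: dsyrk

    call random_number(A)
    C = 0.0_dp

    ! Upper triangle of C := 1.0 * A**T * A + 0.0 * C.
    ! A is m-by-n, so with TRANS = 'T' we pass N = n, K = m, LDA = m.
    call dsyrk('U', 'T', n, m, 1.0_dp, A, m, 0.0_dp, C, n)

    ! Reference result from the intrinsic, for comparison only
    ref = matmul(transpose(A), A)

    ! dsyrk does not reference the strictly lower triangle of C,
    ! so only entries with i <= j are compared.
    do j = 1, n
        do i = 1, j
            if (abs(C(i,j) - ref(i,j)) > 1.0e-12_dp) error stop 'mismatch'
        end do
    end do
    print *, 'upper triangle of C matches matmul(transpose(A), A)'
end program syrk_demo
```

Note that after the call only the upper triangle of C holds meaningful values; if the full symmetric matrix is needed later, the lower triangle has to be filled in by copying.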

For A'DA there is no direct equivalent in BLAS. However, A'DA is the sum over i of D(i,i) times the outer product of row i of A with itself, so you can accumulate it by calling dsyr in a for loop.

SUBROUTINE dsyr(UPLO,N,ALPHA,X,INCX,A,LDA)

DSYR performs the symmetric rank 1 operation

A := alpha*x*x**T + A

where alpha is a real scalar, x is an n element vector and A is an n by n symmetric matrix.

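! dsyr accumulates into C, so C must be zeroed first
C = 0.0d0
do i = 1, M
    ! x is row i of A: it starts at A(i,1) and its elements are M apart
    call dsyr('U', N, D(i,i), A(i,1), M, C, N)
end do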
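The dsyr loop uses a level 2 BLAS routine, which is typically slower on large matrices than a level 3 routine. One possible alternative, which is not from the original answer, is to form B = DA by scaling the rows of A and then call dsyr2k with alpha = 0.5, because 0.5*A'B + 0.5*B'A = A'DA when D is diagonal. The sketch below assumes D is stored as a vector d holding its diagonal entries; the sizes, the test data and the matmul check are illustrative, and the program needs to be linked against BLAS.

```fortran
program atda_demo
    implicit none
    integer, parameter :: dp = kind(1.0d0)
    integer, parameter :: m = 5, n = 3   ! illustrative sizes
    real(dp) :: A(m,n), B(m,n), d(m), C(n,n), ref(n,n)
    integer :: i, j
    external :: dsyr2k

    call random_number(A)
    call random_number(d)
    d = d - 0.5_dp            ! negative weights are allowed here

    ! B = D*A: row i of A is scaled by d(i)
    do j = 1, n
        B(:,j) = d * A(:,j)
    end do

    ! Upper triangle of C := 0.5*A**T*B + 0.5*B**T*A + 0.0*C = A**T*D*A
    C = 0.0_dp
    call dsyr2k('U', 'T', n, m, 0.5_dp, A, m, B, m, 0.0_dp, C, n)

    ! Reference result from the intrinsic, for comparison only
    ref = matmul(transpose(A), B)
    do j = 1, n
        do i = 1, j
            if (abs(C(i,j) - ref(i,j)) > 1.0e-12_dp) error stop 'mismatch'
        end do
    end do
    print *, 'upper triangle of C matches A**T * D * A'
end program atda_demo
```

This costs an extra m by n work array for B. If all diagonal entries of D are positive, another option is to scale row i of A by sqrt(d(i)) and pass the scaled matrix to dsyrk instead.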
