Transpose matrix multiplication in cuBLAS howto

Problem description

The problem is simple: I have two matrices, A and B, that are M by N, where M >> N. I want to first take the transpose of A, and then multiply that by B (A^T * B) to put that into C, which is N by N. I have everything set up for A and B, but how do I call cublasSgemm properly without it returning the wrong answer?

I understand that cuBLAS has a cublasOperation_t enum for transposing things beforehand, but somehow I'm not quite using it correctly. My matrices A and B are in row-major order, i.e. [ row1 ][ row2 ][ row3 ]..... in device memory. That means for A to be interpreted as A-transposed, BLAS needs to know my A is in column-major order. My current code looks like below:

float *A, *B, *C;
// initialize A, B, C as device arrays, fill them with values
// initialize m = num_row_A, n = num_row_B, and k = num_col_A;
// set lda = m, ldb = k, ldc = m;
// alpha = 1, beta = 0;
// set up cuBlas handle ...

cublasSgemm(handle, CUBLAS_OP_T, CUBLAS_OP_N, m, n, k, &alpha, A, lda, B, ldb, &beta, C, ldc);

My questions:

Am I setting up m, k, n correctly?

What about lda, ldb, ldc?

Thanks!

Solution

cuBLAS always assumes that matrices are stored in column-major order. You could either first transpose your matrices into column-major order with cublas<t>geam() (cublasSgeam() for float), or avoid the extra copy as follows.

You could treat your matrix A, stored in row-major order, as a new matrix AT stored in column-major order. The matrix AT is actually the transpose of A. Do the same for B to get BT. Since AT = A^T and BT^T = B, you could then calculate the matrix C = A^T * B, stored in column-major order, by C = AT * BT^T

float* AT = A;
float* BT = B;
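To see why the pointer aliasing above needs no data movement, here is a small CPU-only sketch in plain C. The helper names (row_major_at, col_major_at, view_is_transpose) are hypothetical and are not part of cuBLAS. It only illustrates that reading an M x N row-major buffer as an N x M column-major matrix with leading dimension N gives the transpose.

```c
#include <stddef.h>

/* Element (i, j) of a row-major matrix that has ncols columns. */
float row_major_at(const float *a, int ncols, int i, int j)
{
    return a[(size_t)i * ncols + j];
}

/* Element (i, j) of a column-major matrix with leading dimension ld. */
float col_major_at(const float *a, int ld, int i, int j)
{
    return a[(size_t)j * ld + i];
}

/* Returns 1 if the M x N row-major buffer a, viewed as an N x M
   column-major matrix with leading dimension N, equals the transpose
   of the row-major matrix. Both views address the same memory, so
   this holds for every buffer. */
int view_is_transpose(const float *a, int M, int N)
{
    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) {
            if (row_major_at(a, N, i, j) != col_major_at(a, N, j, i)) {
                return 0;
            }
        }
    }
    return 1;
}
```

Both accessors compute the same offset i * N + j for swapped index pairs, which is why AT = A and BT = B can simply share the original device pointers.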

The leading dimension is a parameter related to the storage layout, so it doesn't change whether or not you use the transpose flag CUBLAS_OP_T.

lda = num_col_A = num_row_AT = N;
ldb = num_col_B = num_row_BT = N;
ldc = num_row_C = N;

m and n in the cuBLAS GEMM routine are the #rows and #cols of the result matrix C,

m = num_row_C = num_row_AT = num_col_A = N;
n = num_col_C = num_row_BT = num_col_B = N;

k is the common dimension of A^T and B,

k = num_col_AT = num_row_B = M;

Then you could invoke the GEMM routine by

cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_T, m, n, k, &alpha, AT, lda, BT, ldb, &beta, C, ldc);
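The parameter choice above can be checked on the CPU without a GPU. The sketch below is an assumption-laden stand-in: ref_sgemm is a hypothetical reference routine that imitates the column-major semantics of cublasSgemm (it is not the cuBLAS implementation), and atb_colmajor is a hypothetical wrapper that passes the same m, n, k, lda, ldb, ldc and transpose flags as the call above.

```c
#include <stddef.h>

/* Hypothetical CPU stand-in for cublasSgemm, not part of cuBLAS.
   Computes C = alpha * op(A) * op(B) + beta * C with every matrix in
   column-major storage. ta / tb: 0 = no transpose, 1 = transpose.
   op(A) is m x k, op(B) is k x n, C is m x n. */
void ref_sgemm(int ta, int tb, int m, int n, int k,
               float alpha, const float *A, int lda,
               const float *B, int ldb,
               float beta, float *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                /* Column-major element (r, c) lives at index c * ld + r. */
                float a = ta ? A[(size_t)i * lda + p] : A[(size_t)p * lda + i];
                float b = tb ? B[(size_t)p * ldb + j] : B[(size_t)j * ldb + p];
                acc += a * b;
            }
            size_t idx = (size_t)j * ldc + i;
            /* As in BLAS, C is not read when beta is zero. */
            C[idx] = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * C[idx];
        }
    }
}

/* A and B are M x N in row-major order.
   Writes C = A^T * B (N x N) in column-major order. */
void atb_colmajor(const float *A, const float *B, float *C, int M, int N)
{
    const float *AT = A;  /* same memory viewed as N x M column-major */
    const float *BT = B;  /* same memory viewed as N x M column-major */
    int m = N, n = N, k = M;
    int lda = N, ldb = N, ldc = N;
    /* Mirrors the flags CUBLAS_OP_N, CUBLAS_OP_T: C = AT * BT^T. */
    ref_sgemm(0, 1, m, n, k, 1.0f, AT, lda, BT, ldb, 0.0f, C, ldc);
}
```

For example, with M = 3, N = 2, A = {1, 2, 3, 4, 5, 6} and B = {7, 8, 9, 10, 11, 12} in row-major order, the product A^T * B is [[89, 98], [116, 128]], so the column-major output buffer holds 89, 116, 98, 128.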

If you want the matrix C to be stored in row-major order, you could calculate CT, its transpose stored in column-major order, with the formula CT = BT * AT^T by

cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_T, n, m, k, &alpha, BT, ldb, AT, lda, &beta, CT, ldc);

Please note you don't have to swap m and n since C is a square matrix in this case.
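The formula for CT follows from the transpose rule for products. As a short derivation, using the same names as above (AT and BT are the column-major views of the row-major A and B, and CT is the column-major view of a row-major C):

```latex
\begin{aligned}
C  &= A^{\mathsf T} B = \mathit{AT} \cdot \mathit{BT}^{\mathsf T}, \\
\mathit{CT} &= C^{\mathsf T}
    = \left(\mathit{AT} \cdot \mathit{BT}^{\mathsf T}\right)^{\mathsf T}
    = \mathit{BT} \cdot \mathit{AT}^{\mathsf T}.
\end{aligned}
```

Here AT and BT are both N x M, so CT is N x N, and the shared dimension k is still M.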
