Transpose matrix multiplication in cuBLAS howto
Problem description
The problem is simple: I have two matrices, A and B, that are M by N, where M >> N. I want to first take the transpose of A, and then multiply that by B (A^T * B) to put that into C, which is N by N. I have everything set up for A and B, but how do I call cublasSgemm properly without it returning the wrong answer?
I understand that cuBLAS has a cublasOperation_t enum for transposing things beforehand, but somehow I'm not quite using it correctly. My matrices A and B are in row-major order, i.e. [ row1 ][ row2 ][ row3 ]..... in device memory. That means that for A to be interpreted as A-transposed, BLAS needs to know my A is in column-major order. My current code looks like this:
float *A, *B, *C;
// initialize A, B, C as device arrays, fill them with values
// initialize m = num_row_A, n = num_row_B, and k = num_col_A;
// set lda = m, ldb = k, ldc = m;
// alpha = 1, beta = 0;
// set up cuBlas handle ...
cublasSgemm(handle, CUBLAS_OP_T, CUBLAS_OP_N, m, n, k, &alpha, A, lda, B, ldb, &beta, C, ldc);
My questions:
Am I setting up m, k, n correctly?
What about lda, ldb, ldc?
Thanks!
Since cuBLAS always assumes that matrices are stored in column-major order, you could either transpose your matrices into column-major first by using cublasSgeam(), or
you could treat your matrix A, stored in row-major, as a new matrix AT stored in column-major. The matrix AT is actually the transpose of A. Do the same for B. Then you can calculate the matrix C, stored in column-major, by C = AT * BT^T:
float* AT = A;
float* BT = B;
The leading dimension is a parameter tied to the storage layout; it stays the same whether or not you use the transpose flag CUBLAS_OP_T.
lda = num_col_A = num_row_AT = N;
ldb = num_col_B = num_row_BT = N;
ldc = num_row_C = N;
m and n in the cuBLAS GEMM routine are the number of rows and columns of the result matrix C:
m = num_row_C = num_row_AT = num_col_A = N;
n = num_col_C = num_row_BT = num_col_B = N;
k is the common dimension of A^T and B:
k = num_col_AT = num_row_B = M;
Then you could invoke the GEMM routine by
cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_T, m, n, k, &alpha, AT, lda, BT, ldb, &beta, C, ldc);
If you want the matrix C to be stored in row-major, you can instead compute its transpose CT, stored in column-major, with the formula CT = BT * AT^T:
cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_T, n, m, k, &alpha, BT, ldb, AT, lda, &beta, CT, ldc);
Please note that you don't have to swap m and n, since C is a square matrix in this case.