Transpose matrix multiplication in cuBLAS: how-to
The problem is simple: I have two matrices, A and B, that are M by N, where M >> N. I want to first take the transpose of A, and then multiply that by B (A^T * B) to put that into C, which is N by N. I have everything set up for A and B, but how do I call cublasSgemm properly without it returning the wrong answer?
I understand that cuBlas has a cublasOperation_t enum for transposing things beforehand, but somehow I'm not quite using it correctly. My matrices A and B are in row-major order, i.e. [ row1 ][ row2 ][ row3 ]..... in device memory. That means for A to be interpreted as A-transposed, BLAS needs to know my A is in column-major order. My current code looks like this:
float *A, *B, *C;
// initialize A, B, C as device arrays, fill them with values
// initialize m = num_row_A, n = num_row_B, and k = num_col_A;
// set lda = m, ldb = k, ldc = m;
// alpha = 1, beta = 0;
// set up cuBlas handle ...
cublasSgemm(handle, CUBLAS_OP_T, CUBLAS_OP_N, m, n, k, &alpha, A, lda, B, ldb, &beta, C, ldc);
My questions:
Am I setting up m, k, n correctly?
What about lda, ldb, ldc?
Thanks!
cuBLAS always assumes that matrices are stored in column-major order. You could either transpose your matrices into column-major first using cublasSgeam(), or
you could treat your matrix A, stored in row-major, as a new matrix AT stored in column-major. The matrix AT is then exactly the transpose of A. Do the same for B. You can then compute the column-major matrix C via C = AT * BT^T:
float* AT = A;
float* BT = B;
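This reinterpretation trick can be checked on the CPU with plain index arithmetic. The sketch below is illustrative (the helper names and sizes are not from cuBLAS): it confirms that a row-major rows x cols buffer, read with column-major indexing and leading dimension cols, is the transpose of the original matrix.

```c
#include <stddef.h>

/* Row-major element (i, j) of a matrix with `cols` columns. */
static float row_major(const float *buf, int i, int j, int cols) {
    return buf[(size_t)i * cols + j];
}

/* Column-major element (i, j) of a matrix with leading dimension ld. */
static float col_major(const float *buf, int i, int j, int ld) {
    return buf[(size_t)j * ld + i];
}

/* Returns 1 iff the buffer of a row-major rows x cols matrix, read as a
   column-major cols x rows matrix with ld = cols, equals its transpose:
   AT(i, j) == A(j, i) for all i, j. */
int reinterpret_is_transpose(const float *buf, int rows, int cols) {
    for (int i = 0; i < cols; ++i)        /* rows of the reinterpreted AT */
        for (int j = 0; j < rows; ++j)    /* columns of AT */
            if (col_major(buf, i, j, cols) != row_major(buf, j, i, cols))
                return 0;
    return 1;
}
```

Both accessors resolve to the same offset, which is exactly why no data movement is needed: only the interpretation of the buffer changes.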
The leading dimension is a parameter tied to the storage; it does not change whether or not you pass the transpose flag CUBLAS_OP_T.
lda = num_col_A = num_row_AT = N;
ldb = num_col_B = num_row_BT = N;
ldc = num_row_C = N;
m and n in the cuBLAS GEMM routine are the number of rows and columns of the result matrix C:
m = num_row_C = num_row_AT = num_col_A = N;
n = num_col_C = num_row_BT = num_col_B = N;
k is the common dimension of A^T and B:
k = num_col_AT = num_row_B = M;
Then you could invoke the GEMM routine by
cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_T, m, n, k, &alpha, AT, lda, BT, ldb, &beta, C, ldc);
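To convince yourself that this parameter choice is right, the call can be mirrored by a CPU reference loop. `ref_gemm_nt` below is a hypothetical helper, not a cuBLAS API; it implements column-major C = op(A) * op(B) with opA = CUBLAS_OP_N and opB = CUBLAS_OP_T, the combination used in the call above.

```c
/* Column-major GEMM with opA = CUBLAS_OP_N, opB = CUBLAS_OP_T:
   C(i, j) = sum_p AT(i, p) * BT(j, p), for i < m, j < n, p < k.
   Element (r, c) of a column-major matrix X with leading dimension ld
   lives at X[c * ld + r]. */
void ref_gemm_nt(int m, int n, int k,
                 const float *AT, int lda,
                 const float *BT, int ldb,
                 float *C, int ldc) {
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i) {
            float s = 0.0f;
            for (int p = 0; p < k; ++p)
                s += AT[p * lda + i] * BT[p * ldb + j];
            C[j * ldc + i] = s;
        }
}
```

Feeding it the row-major buffers of A and B directly (as AT and BT) with m = n = N, k = M and lda = ldb = ldc = N produces exactly A^T * B in column-major storage.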
If you want the matrix C to be stored in row-major, you can instead compute CT, stored in column-major, with the formula CT = BT * AT^T:
cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_T, n, m, k, &alpha, BT, ldb, AT, lda, &beta, CT, ldc);
Please note you don't have to swap m and n, since C is a square matrix in this case.
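The row-major variant can be checked the same way. This sketch (again a hypothetical helper, not cuBLAS) mirrors the swapped call: opA = CUBLAS_OP_N on BT and opB = CUBLAS_OP_T on AT, so the output buffer CT, read back row-major, is A^T * B directly.

```c
/* Column-major CT = BT * AT^T, mirroring
   cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_T, n, m, k,
               &alpha, BT, ldb, AT, lda, &beta, CT, ldc).
   CT is n x m column-major; reading its buffer row-major yields C = A^T * B. */
void ref_gemm_rowmajor_out(int n, int m, int k,
                           const float *BT, int ldb,
                           const float *AT, int lda,
                           float *CT, int ldc) {
    for (int j = 0; j < m; ++j)          /* columns of CT */
        for (int i = 0; i < n; ++i) {    /* rows of CT */
            float s = 0.0f;
            for (int p = 0; p < k; ++p)
                s += BT[p * ldb + i] * AT[p * lda + j]; /* BT(i,p) * AT^T(p,j) */
            CT[j * ldc + i] = s;
        }
}
```

Because (A^T * B)^T = B^T * A = BT * AT^T, the column-major CT is the transpose of C, and transposing a column-major buffer is the same as reading it row-major, so no extra copy is needed.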