How to transpose a matrix in CUDA/cublas?
Question
Say I have a matrix of dimension A*B on the GPU, where B (the number of columns) is the leading dimension, assuming C style. Is there any method in CUDA (or cublas) to transpose this matrix to FORTRAN style, where A (the number of rows) becomes the leading dimension?
It would be even better if it could be transposed during the host->device transfer while keeping the original data unchanged.
Answer
The CUDA SDK includes a matrix transpose sample with example code showing how to implement one, ranging from a naive implementation to optimized versions.
For example, the naïve transpose:
__global__ void transposeNaive(float *odata, float *idata,
                               int width, int height, int nreps)
{
    // TILE_DIM and BLOCK_ROWS are compile-time constants from the SDK sample
    int xIndex = blockIdx.x * TILE_DIM + threadIdx.x;
    int yIndex = blockIdx.y * TILE_DIM + threadIdx.y;
    int index_in  = xIndex + width * yIndex;
    int index_out = yIndex + height * xIndex;

    // nreps only repeats the work for benchmarking purposes
    for (int r = 0; r < nreps; r++)
    {
        for (int i = 0; i < TILE_DIM; i += BLOCK_ROWS)
        {
            odata[index_out + i] = idata[index_in + i * width];
        }
    }
}
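The naive kernel above reads coalesced but writes strided. As a sketch of what the SDK's optimized versions do, the tile can be staged through shared memory so both reads and writes are coalesced; the kernel below assumes TILE_DIM and BLOCK_ROWS are compile-time constants (32 and 8 here) and that width and height are multiples of TILE_DIM (the benchmarking nreps loop is dropped):

```cuda
#define TILE_DIM   32
#define BLOCK_ROWS 8

__global__ void transposeCoalesced(float *odata, const float *idata,
                                   int width, int height)
{
    // +1 column of padding avoids shared-memory bank conflicts
    __shared__ float tile[TILE_DIM][TILE_DIM + 1];

    int xIndex = blockIdx.x * TILE_DIM + threadIdx.x;
    int yIndex = blockIdx.y * TILE_DIM + threadIdx.y;
    int index_in = xIndex + yIndex * width;

    // coalesced read of a tile into shared memory
    for (int i = 0; i < TILE_DIM; i += BLOCK_ROWS)
        tile[threadIdx.y + i][threadIdx.x] = idata[index_in + i * width];

    __syncthreads();

    // swap the block indices for the write, so the output access is
    // also coalesced; the transpose happens via the tile indexing
    xIndex = blockIdx.y * TILE_DIM + threadIdx.x;
    yIndex = blockIdx.x * TILE_DIM + threadIdx.y;
    int index_out = xIndex + yIndex * height;

    for (int i = 0; i < TILE_DIM; i += BLOCK_ROWS)
        odata[index_out + i * height] = tile[threadIdx.x][threadIdx.y + i];
}
```

A suitable launch configuration is a grid of (width/TILE_DIM, height/TILE_DIM) blocks of (TILE_DIM, BLOCK_ROWS) threads.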
As talonmies pointed out, in cublas matrix operations you can specify whether a matrix should be treated as transposed. For example, cublasDgemm() computes C = a * op(A) * op(B) + b * C; if you want to operate on A as transposed (A^T), you select this through the corresponding parameter ('N' for normal or 'T' for transposed).
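If you need an explicit transposed copy rather than an on-the-fly op() inside a multiply, cuBLAS also offers cublasSgeam (C = alpha*op(A) + beta*op(B)). A minimal sketch, assuming d_A is an m x n column-major matrix already on the device and a cuBLAS handle has been created; d_At and the helper name are illustrative:

```cuda
#include <cublas_v2.h>

// Write the transpose of the m x n matrix d_A into the n x m matrix d_At.
// With alpha = 1 and beta = 0, cublasSgeam reduces to C = A^T; the B
// operand is not read when beta is zero.
cublasStatus_t transpose_with_geam(cublasHandle_t handle,
                                   const float *d_A, float *d_At,
                                   int m, int n)
{
    const float alpha = 1.0f;
    const float beta  = 0.0f;
    // The result C is n x m, so geam's dimensions are (n, m) and ldc = n.
    return cublasSgeam(handle, CUBLAS_OP_T, CUBLAS_OP_N,
                       n, m,
                       &alpha, d_A, m,
                       &beta,  d_At, n,
                       d_At, n);
}
```

This performs the transpose on the device, which is usually much faster than copying the data back and transposing on the host.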