CUDA/CUBLAS: Accessing elements in an array


Problem description


As a follow-up to a previous question here, I am trying to implement the following loop, which is a matrix-vector multiplication where the vector is a column from the matrix Q, selected by the loop iterator K:

EDIT: Q cannot be populated beforehand; it is populated as the iterator K progresses.

for (unsigned K=0;K<N;K++){   // Number of iterations loop
    //... do some stuff
    for (unsigned i=0; i<N; i++){
        float sum = 0;
        for (unsigned j=0; j<N; j++){
            sum += A[j][i]*Q[j][K];
        }
        v[i] = sum;
    }
    //... do some stuff
    // populate next column of Q
}

Where the dimensions of the arrays are:

A [N x N]

Q [N x (0.5N + 1)]

These arrays have been flattened in order to use them with cublasSgemv(). My question is: is it possible to use cublasSgemv() by telling it where to start accessing d_Q, and what the increment between elements is (since it is row-major C++)?
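(For reference, a minimal sketch of the row-major indexing assumed here, with Niter standing for the number of columns of Q, i.e. 0.5*N + 1; the q_index helper name is just for illustration:)

// Row-major flattening: Q[j][K] lives at the 1-D offset j*Niter + K, so
// column K starts at d_Q + K and its consecutive elements are Niter apart.
unsigned q_index(unsigned j, unsigned K, unsigned Niter) { return j*Niter + K; }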

EDIT: multiplied the memory access increment by sizeof(float). Still doesn't work, as far as I can tell.

Niter = 0.5*N + 1;
for (unsigned K=0;K<N;K++){
    cublasSgemv(handle, CUBLAS_OP_T, N, N, &alpha, d_A, N, (d_Q + sizeof(float)*K*(Niter)), (Niter), &beta, d_v , 1);
}

I don't think it's possible to index d_Q like that, as I am not getting any results.

SOLVED: the solution by @RobertCrovella is what I was looking for. Thanks.

Solution

It is possible to index through your flattened Q matrix the way you propose. Your call to Sgemv should be as follows:

cublasSgemv(handle, CUBLAS_OP_T, N, N, &alpha, d_A, N, (d_Q + K), (Niter), &beta, (d_v+(K*Niter)) , 1);

The pointer to Q should point to the first element of the column in question; since your matrix is row-major, this is just d_Q + K (using pointer arithmetic, not byte arithmetic). Niter is the stride (in elements, not bytes) between successive elements of that column. Note that your code as written would overwrite the results of one matrix-vector multiply with the next, since you are not indexing into the output vector d_v, so I added some indexing on d_v.
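As a minimal sketch of how that strided access could be wired into the original loop: the snippet below assumes the cuBLAS handle and the device arrays d_A (N x N, row-major) and d_Q (N x Niter, row-major) are already set up as in the question, and it reuses a single length-N d_v each iteration the way the question's loop does (the (d_v + K*Niter) form above is what you would use to keep every result instead). The multiply_column_K helper name is made up for illustration, and error checking is omitted.

#include <cublas_v2.h>

// One pass of the loop body: v = op(A) * (column K of Q), using the strided
// access described above. d_A, d_Q and d_v already live on the device.
void multiply_column_K(cublasHandle_t handle,
                       const float *d_A, const float *d_Q, float *d_v,
                       int N, int Niter, int K)
{
    const float alpha = 1.0f;
    const float beta  = 0.0f;
    // Column K of the row-major Q starts at d_Q + K (pointer arithmetic, not
    // bytes); its consecutive elements are Niter apart, passed as incx.
    // CUBLAS_OP_T mirrors the call in this answer; whether OP_T or OP_N is the
    // right flag depends on how d_A was filled in the earlier question.
    cublasSgemv(handle, CUBLAS_OP_T, N, N,
                &alpha, d_A, N,
                d_Q + K, Niter,
                &beta, d_v, 1);
}

Called inside the K loop after column K of Q has been filled in, this reproduces the strided access without any sizeof(float) byte arithmetic.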

As @JackOLantern points out, it should also be possible to do this in a single step without your loop, by calling Sgemm:

cublasSgemm(handle, CUBLAS_OP_T, CUBLAS_OP_T, N, Niter, N, &alpha, d_A, N, d_Q, (Niter), &beta, d_v, N);

If your code is not working the way you expect, please provide a complete, compilable example.
