How can I accelerate a sparse matrix by dense vector product, currently implemented via scipy.sparse.csc_matrix.dot, using CUDA?

Views: 327 | Published: 2020/8/6 2:25:49 | Tags: python matrix cuda gpu sparse-matrix

Problem description

My ultimate goal is to accelerate the computation of a matrix-vector product in Python, potentially by using a CUDA-enabled GPU. The matrix A is about 15k x 15k and sparse (density ~ 0.05), the vector x is dense with 15k elements, and I am computing Ax. I have to perform this computation many times, so making it as fast as possible would be ideal.
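
To put those sizes in perspective, here is a rough back-of-the-envelope calculation. It assumes float64 values and int32 indices in a compressed sparse layout (CSC or CSR); neither dtype is stated in the question, so treat the byte counts as estimates only.

```python
# Rough size estimate for the matrix described in the question.
n = 15_000        # rows and columns ("about 15k x 15k")
density = 0.05    # fraction of nonzero entries ("density ~ 0.05")

nnz = round(n * n * density)  # number of stored nonzeros

# Assumed storage widths (not stated in the question).
value_bytes = 8   # float64
index_bytes = 4   # int32

# Compressed sparse storage: one value and one index per nonzero,
# plus an (n + 1)-entry pointer array.
sparse_bytes = nnz * (value_bytes + index_bytes) + (n + 1) * index_bytes

# The same matrix stored densely.
dense_bytes = n * n * value_bytes

# One multiply and one add per stored nonzero for a single product Ax.
flops_per_product = 2 * nnz

print(f"nonzeros:        {nnz:,}")
print(f"sparse storage:  {sparse_bytes / 1e6:.1f} MB")
print(f"dense storage:   {dense_bytes / 1e6:.1f} MB")
print(f"flops per Ax:    {flops_per_product:,}")
```

Under these assumptions the sparse form holds about 11.25 million nonzeros in roughly 135 MB, against 1.8 GB for a dense array, and each product reads every stored entry once.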

My current non-GPU "optimization" is to represent A as a scipy.sparse.csc_matrix object and then simply compute A.dot(x), but I was hoping to speed this up on a VM with a couple of NVIDIA GPUs attached, using only Python if possible (i.e. without writing out detailed kernel functions by hand). I’ve succeeded in accelerating dense matrix-vector products using the cudamat library, but not the sparse case. There are a handful of suggestions for the sparse case online, such as using pycuda, scikit-cuda, or Anaconda’s accelerate package, but there’s not a ton of information, so it’s hard to know where to begin.
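
For concreteness, the CPU baseline described above might look like the sketch below. The matrix is a random stand-in and is smaller than 15k x 15k so the example runs quickly. The `spmv_gpu` helper is hypothetical and relies on CuPy (`cupyx.scipy.sparse`), a library that is not among those named in the question; it is included only as one possible Python-only route, and it needs a CUDA-capable GPU with CuPy installed.

```python
import numpy as np
import scipy.sparse as sp

# Stand-in problem; set n = 15_000 to match the size in the question.
n, density = 2_000, 0.05

# Random sparse A in CSC format, as in the question, and a dense vector x.
A_csc = sp.random(n, n, density=density, format="csc",
                  dtype=np.float64, random_state=0)
x = np.random.default_rng(0).standard_normal(n)

# Current CPU baseline: CSC matrix times dense vector.
y_cpu = A_csc.dot(x)

# Row-compressed copy of the same matrix, converted once up front.
A_csr = A_csc.tocsr()


def spmv_gpu(A_host, x_host):
    """Hypothetical helper: compute A @ x on the GPU using CuPy.

    Assumes CuPy is installed and a CUDA device is available.
    """
    import cupy as cp
    import cupyx.scipy.sparse as cpsp

    A_dev = cpsp.csr_matrix(A_host)   # copy the sparse matrix to the GPU
    x_dev = cp.asarray(x_host)        # copy the dense vector to the GPU
    y_dev = A_dev.dot(x_dev)          # sparse matrix-vector product on the GPU
    return cp.asnumpy(y_dev)          # copy the result back to host memory
```

On a machine with a GPU, `np.allclose(spmv_gpu(A_csr, x), y_cpu)` is a quick correctness check. Because the product is repeated many times, it would likely pay to keep the matrix on the device between calls rather than re-copying it each time as this helper does; whether the GPU is actually faster at this size has to be measured.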

I don’t need greatly detailed instructions, but if anyone has solved this before and could provide a "big picture" roadmap for the simplest way of doing this, or has an idea of the sort of speed up a sparse GPU-based matrix-vector product would have over scipy’s sparse algorithms, that would be very helpful.
