Which is the best way to multiply a large and sparse matrix with its transpose?


Question

I currently want to multiply a large sparse matrix (~1M x 200k) with its transpose. The values of the resulting matrix would be floats.

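  • I tried loading the matrix into a scipy sparse matrix and multiplying each row of the first matrix with the second matrix. The multiplication took about 2 hours to complete.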

What is an efficient way to achieve this multiplication? I ask because I see a pattern in the computation:

  • The matrix is large and sparse.
  • The matrix is multiplied with its own transpose, so the resulting matrix will be symmetric.

I would like to know which libraries can perform this computation faster. They can be in Python, R, C, C++ or any other language.

Recommended answer

I suppose your main need is to save memory. First, since you multiply a matrix with its transpose, you do not need any memory for the transpose: all of its cells are directly accessible through the first matrix (tA[i,j] = A[j,i]). Almost 1/3 of the memory is saved.
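To make the index identity concrete, here is a minimal sketch in plain Python. The helper name `gram_entry` and the small 3 x 2 matrix are made up for illustration; the explicit transpose `tA` is built only to check the result and is not needed by the helper.

```python
def gram_entry(A, i, j):
    """Entry (i, j) of A * transpose(A), reading only A itself."""
    # transpose(A)[k][j] is A[j][k], so no transposed copy is needed.
    return sum(A[i][k] * A[j][k] for k in range(len(A[i])))


# Small hypothetical 3 x 2 matrix, used only for illustration.
A = [[1.0, 0.0],
     [2.0, 3.0],
     [0.0, 4.0]]

# Explicit transpose, built here only to compare against the helper.
tA = [list(col) for col in zip(*A)]
explicit = sum(A[1][k] * tA[k][2] for k in range(2))

print(gram_entry(A, 1, 2), explicit)  # both are 12.0
```

Each entry of the product is therefore a dot product of two rows of A.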

Computation time cannot be neglected either. Since the resulting matrix will be symmetric, you can compute only one half and directly store the other. That saves nearly half of the computation time.
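As a rough piece of arithmetic, assuming the ~1M rows from the question, the number of (i, j) pairs to visit drops from M * M to M * (M + 1) / 2 when only the upper triangle (diagonal included) is computed. These are counts of candidate entries, not of non-zero values.

```python
M = 1_000_000  # number of rows, taken from the question's ~1M x 200k matrix

full_entries = M * M              # every (i, j) pair
upper_entries = M * (M + 1) // 2  # pairs with j >= i, diagonal included

print(full_entries)   # 1000000000000
print(upper_entries)  # 500000500000
```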

And if you are sure that your initial matrix is really sparse, so that you can hope the resulting one will be too, you can store the result directly in a scipy sparse matrix in COO format: only three lists are needed to store the non-zero values.
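A minimal sketch of the three-list layout, with made-up values: one list of row indices, one of column indices, and one of values. The position (1, 2) is listed twice on purpose, because scipy adds duplicate COO entries together when the matrix is converted.

```python
from scipy import sparse

# Three parallel lists: row index, column index, value of each stored entry.
I = [0, 1, 1, 2]
J = [0, 2, 2, 1]
V = [5.0, 1.5, 0.5, 7.0]  # position (1, 2) appears twice on purpose

R = sparse.coo_matrix((V, (I, J)), shape=(3, 3))
dense = R.toarray()  # the two (1, 2) entries are summed: 1.5 + 0.5

print(dense)
```

Because duplicates are summed, any position appended twice to the lists ends up with twice its value.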

But ... I do not know of any library that does this, so you will have to code it yourself in your preferred language (probably Python, since you mentioned scipy).

Python code example (matrix = A[M][N], i.e. M rows and N columns):

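I = []
J = []
V = []
for i in range(M):
    for j in range(i, M):  # upper triangle only, since the result is symmetric
        X = 0.0
        for k in range(N):
            # transpose(A)[k][j] is A[j][k]
            X += A[i][k] * A[j][k]
        if X != 0.0:  # or abs(X) > epsilon if floating point accuracy is a concern
            I.append(i)
            J.append(j)
            V.append(X)
            if i != j:  # mirror entry; skip the diagonal so it is not stored twice
                I.append(j)
                J.append(i)
                V.append(X)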

I, J and V are exactly what is needed to build a scipy COO sparse matrix via:

from scipy import sparse
RESULT = sparse.coo_matrix((V, (I, J)), shape=(M, M))  # A * transpose(A) is M x M
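The sketch below is one possible variant of the same idea that visits only the stored entries of each row instead of all N columns, and then cross-checks the result against scipy's own sparse product `A @ A.T` on a small random matrix. The function name `gram_triplets` and the 30 x 20 test matrix are assumptions made for illustration. The sketch still loops over every row pair in Python, so it is a correctness check and not a claim about speed at the ~1M x 200k scale.

```python
import numpy as np
from scipy import sparse


def gram_triplets(A_csr):
    """Return (I, J, V) for A * transpose(A), visiting only stored entries.

    Assumes A_csr is a CSR matrix without duplicate entries.
    """
    M = A_csr.shape[0]
    # One {column: value} dict per row, built from the CSR arrays.
    rows = []
    for i in range(M):
        start, end = A_csr.indptr[i], A_csr.indptr[i + 1]
        cols = A_csr.indices[start:end].tolist()
        vals = A_csr.data[start:end].tolist()
        rows.append(dict(zip(cols, vals)))

    I, J, V = [], [], []
    for i in range(M):
        for j in range(i, M):  # upper triangle only
            # Walk the shorter row and look each column up in the longer one.
            a, b = rows[i], rows[j]
            if len(a) > len(b):
                a, b = b, a
            x = sum(v * b[k] for k, v in a.items() if k in b)
            if x != 0.0:
                I.append(i)
                J.append(j)
                V.append(x)
                if i != j:  # mirror entry, diagonal stored once
                    I.append(j)
                    J.append(i)
                    V.append(x)
    return I, J, V


# Small random test matrix (hypothetical sizes, chosen only for the check).
A = sparse.random(30, 20, density=0.1, format="csr", random_state=0)
I, J, V = gram_triplets(A)
M = A.shape[0]
by_hand = sparse.coo_matrix((V, (I, J)), shape=(M, M)).toarray()
built_in = (A @ A.T).toarray()
matches = np.allclose(by_hand, built_in)
print(matches)
```

The comparison uses `np.allclose` because the two computations may add the terms in a different order.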
