Parallel assembly of a sparse matrix in Python


Question

I'm trying to use mpi4py to assemble a very large sparse matrix in parallel. Each rank produces a sparse submatrix (in scipy's dok format) that needs to be put in place in the very large matrix. So far I have succeeded when each rank produces a numpy array containing the values and indices of the nonzero entries (mimicking the coo format). After the gather step I can assemble the large matrix from the numpy arrays. The final matrix is to be written to disk as an mtx (Matrix Market) file.
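
As a minimal sketch of the triplet layout described above, the following stores values and indices in one (n, 3) array, rebuilds a sparse matrix from it, and round-trips the result through a Matrix Market file. The helper names `to_triplets` and `from_triplets`, the tiny 4x4 matrix, and the temporary file name are illustrative assumptions, not part of the original code.

```python
import os
import tempfile

import numpy as np
import scipy.io as sio
import scipy.sparse as ss


def to_triplets(mat):
    """Return an (nnz, 3) float array with columns (value, row, col)."""
    coo = mat.tocoo()
    return np.column_stack([coo.data, coo.row, coo.col])


def from_triplets(trip, shape):
    """Rebuild a CSR matrix from an (nnz, 3) triplet array."""
    rows = trip[:, 1].astype(np.int64)  # indices were stored as floats
    cols = trip[:, 2].astype(np.int64)
    return ss.csr_matrix((trip[:, 0], (rows, cols)), shape=shape)


# small DOK matrix standing in for one rank's submatrix (hypothetical data)
sub = ss.dok_matrix((4, 4))
sub[0, 1] = 2.0
sub[3, 2] = -1.5

trip = to_triplets(sub)
rebuilt = from_triplets(trip, sub.shape)

# write to a temporary .mtx file and read it back
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.mtx")
    sio.mmwrite(path, rebuilt)
    reread = sio.mmread(path)
```

Packing indices into a float array is convenient for a single gather, but it doubles as a reminder that the indices must be cast back to integers before building the final matrix.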

What is the most efficient way of gathering the sparse submatrices? Perhaps by passing them directly as arguments to gather()? If so, how?

Here's a simplified example of what I do. It assembles a large diagonal matrix out of diagonal submatrices; in the real case the resulting large matrix is typically 500000x500000 in size and not diagonal.

from mpi4py import MPI
from numpy import *
import time
import scipy.sparse as ss
import scipy.io as sio

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nproc = comm.Get_size()

if rank == 0:
    tic = time.perf_counter()  # time.clock() was removed in Python 3.8

# each rank generates a sparse matrix with N entries on the diagonal
N = 10000
tmp = ss.eye(N, format='dok') * rank

# extract indices and values; converting to coo keeps row, col and data aligned
coo = tmp.tocoo()
i, j, val = coo.row, coo.col, coo.data

# create the output array of each rank
out = zeros((size(val), 3))

# fill the output numpy array, shifting the indices according to the rank
out[:, 0] = val
out[:, 1] = i + rank * N
out[:, 2] = j + rank * N

# gather all the arrays representing the submatrices
full_array = comm.gather(out, root=0)

if rank == 0:
    # stack the per-rank (n, 3) arrays into one (total, 3) array
    f = vstack(full_array)

    # this is the final result; the indices were stored as floats, so cast them back
    rows = f[:, 1].astype(int)
    cols = f[:, 2].astype(int)
    final_result = ss.csr_matrix((f[:, 0], (rows, cols)), shape=(nproc * N, nproc * N))
    sio.mmwrite('final.mtx', final_result)
    toc = time.perf_counter()
    print('Matrix assembled and written in', toc - tic, 'seconds')

Recommended answer

For what it's worth, using three-element lists works pretty well, as suggested by hpaulj. Here's a working example:

from mpi4py import MPI
from numpy import *
import scipy.sparse as ss
from timeit import default_timer as timer

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    tic = timer()      

# each rank generates a sparse matrix with N entries on the diagonal
N = 100000
block = ss.eye(N, format = 'coo')

# extract indices and values
out = [ block.data, block.row , block.col]
out[1] = out[1] + rank * N
out[2] = out[2] + rank * N

# gather all the arrays representing the submatrices
full_list = comm.gather(out,root=0)

if rank == 0:
    dat = concatenate([x[0] for x in full_list])
    row = concatenate([x[1] for x in full_list])
    col = concatenate([x[2] for x in full_list])
    final_result = ss.csr_matrix( ( dat, (row, col) ) )
    toc = timer()
    print('Matrix assembled in', toc - tic, 'seconds')
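
The gather-and-concatenate step above can be tried without an MPI launcher. The sketch below replaces `comm.gather` with a plain Python list of per-rank `[data, row, col]` lists, which is the structure the root rank receives, since mpi4py's lowercase `gather` pickles generic Python objects. The function names `local_triplets` and `assemble`, the small sizes, and the `rank + 1` scaling (used only to make the blocks distinguishable) are assumptions for illustration.

```python
import numpy as np
import scipy.sparse as ss


def local_triplets(rank, n):
    """Build one rank's block and shift its indices into the global matrix."""
    # scale by rank + 1 so every simulated rank contributes distinct nonzeros
    block = (ss.eye(n, format="coo") * (rank + 1)).tocoo()
    return [block.data, block.row + rank * n, block.col + rank * n]


def assemble(gathered, shape):
    """Concatenate per-rank [data, row, col] lists into one CSR matrix."""
    dat = np.concatenate([x[0] for x in gathered])
    row = np.concatenate([x[1] for x in gathered])
    col = np.concatenate([x[2] for x in gathered])
    return ss.csr_matrix((dat, (row, col)), shape=shape)


n_ranks, n = 4, 5  # hypothetical sizes, small enough to inspect by hand

# this list stands in for the result of comm.gather(out, root=0) on rank 0
gathered = [local_triplets(r, n) for r in range(n_ranks)]
result = assemble(gathered, (n_ranks * n, n_ranks * n))
```

Passing `shape` explicitly is a design choice in this sketch: it keeps the global dimensions correct even if the last rows or columns happen to contain no nonzero entries.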

The assembly is definitely much faster using coo matrices rather than dok.
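
One hedged way to check this claim on a given machine is to time both routes on a single block. The sketch below builds the same identity block in dok and in coo format, converts the dok block to coo to obtain aligned triplets, and confirms that both routes yield the same matrix. The block size `n` is an arbitrary assumption, and the measured times depend on the hardware and the scipy version, so no particular numbers are implied.

```python
from timeit import default_timer as timer

import numpy as np
import scipy.sparse as ss

n = 1000  # hypothetical block size, kept small so the dense check below is cheap

# DOK route: build the block, then convert it to extract aligned triplets
tic = timer()
dok = ss.eye(n, format="dok")
dok_coo = dok.tocoo()
dok_seconds = timer() - tic

# COO route: data, row and col are available directly
tic = timer()
coo = ss.eye(n, format="coo")
coo_seconds = timer() - tic

# both routes should describe the same matrix
same = np.array_equal(dok_coo.toarray(), coo.toarray())

print("dok route:", dok_seconds, "s, coo route:", coo_seconds, "s")
```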
