Adding a column of zeroes to a csr_matrix


Question

I have an MxN sparse csr_matrix, and I'd like to add a few all-zero columns to the right of the matrix. In principle, the arrays indptr, indices and data stay the same, so I only want to change the dimensions of the matrix. However, this does not seem to be implemented.

>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix(np.identity(5), dtype = int)
>>> A.toarray()
array([[1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1]])
>>> A.shape
(5, 5)
>>> A.shape = ((5,7))
NotImplementedError: Reshaping not implemented for csr_matrix.

Horizontally stacking a zero matrix does not seem to work either.

>>> B = csr_matrix(np.zeros([5,2]), dtype = int)
>>> B.toarray()
array([[0, 0],
       [0, 0],
       [0, 0],
       [0, 0],
       [0, 0]])
>>> np.hstack((A,B))
array([ <5x5 sparse matrix of type '<type 'numpy.int32'>'
    with 5 stored elements in Compressed Sparse Row format>,
       <5x2 sparse matrix of type '<type 'numpy.int32'>'
    with 0 stored elements in Compressed Sparse Row format>], dtype=object)

This is what I want to achieve eventually. Is there a quick way to reshape my csr_matrix without copying everything in it?

>>> C = csr_matrix(np.hstack((A.toarray(), B.toarray())))
>>> C.toarray()
array([[1, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0]])
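
For comparison, the target matrix C above can also be built without going through dense arrays. np.hstack treats sparse matrices as opaque objects, whereas scipy.sparse ships its own hstack that understands them. A minimal sketch follows, assuming a SciPy version that provides scipy.sparse.hstack with a format argument; note that it builds new arrays, so it does not satisfy the "no copy" requirement.

```python
import numpy as np
import scipy.sparse as sps

# Same inputs as in the question, built directly as sparse matrices.
A = sps.csr_matrix(np.identity(5, dtype=int))
B = sps.csr_matrix((5, 2), dtype=int)  # empty 5x2 matrix, no stored elements

# Sparse-aware horizontal stack; the result is a new CSR matrix.
C = sps.hstack([A, B], format="csr")

print(C.shape)      # dimensions after stacking
print(C.nnz)        # number of stored elements
print(C.toarray())  # dense view, for inspection only
```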

Recommended answer

What you want to do isn't really what numpy or scipy understand as a reshape. But for your particular case, you can create a new CSR matrix that reuses the data, indices and indptr arrays from your original one, without copying them:

import scipy.sparse as sps

a = sps.rand(10000, 10000, density=0.01, format='csr')

In [19]: %timeit sps.csr_matrix((a.data, a.indices, a.indptr),
...                             shape=(10000, 10020), copy=True)
100 loops, best of 3: 6.26 ms per loop

In [20]: %timeit sps.csr_matrix((a.data, a.indices, a.indptr),
...                             shape=(10000, 10020), copy=False)
10000 loops, best of 3: 47.3 us per loop

In [21]: %timeit sps.csr_matrix((a.data, a.indices, a.indptr),
...                             shape=(10000, 10020))
10000 loops, best of 3: 48.2 us per loop

So if you no longer need your original matrix a, since the default is copy=False, simply do:

a = sps.csr_matrix((a.data, a.indices, a.indptr), shape=(10000, 10020))
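
The same idea can be wrapped in a small function. This is only a sketch: add_zero_columns is a hypothetical helper name, not part of SciPy, and it assumes the input is already in CSR format. Because the arrays are reused rather than copied, in-place changes to the stored values of one matrix are visible in the other.

```python
import numpy as np
import scipy.sparse as sps


def add_zero_columns(m, extra):
    """Return a CSR matrix with `extra` all-zero columns appended on the right.

    Hypothetical helper (not a SciPy function). Assumes `m` is a CSR matrix.
    The data, indices and indptr arrays are passed through with copy=False,
    so they are reused rather than duplicated where possible.
    """
    rows, cols = m.shape
    return sps.csr_matrix((m.data, m.indices, m.indptr),
                          shape=(rows, cols + extra), copy=False)


a = sps.csr_matrix(np.identity(5, dtype=int))
b = add_zero_columns(a, 2)

print(b.shape)
# Check whether the value array is shared between the two matrices.
print(np.shares_memory(a.data, b.data))
```

If the original matrix must remain independent of the widened one, passing copy=True instead trades the speed-up shown in the timings above for separate arrays.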
