Putting a column into an empty sparse matrix


Question

I want to copy a column from one sparse column-format (CSC) matrix into another, empty, sparse CSC matrix. Toy code:

import numpy as np
import scipy.sparse
row = np.array([0, 2, 0, 1, 2])
col = np.array([0, 0, 2, 2, 2])
data = np.array([1, 2, 4, 5, 6])
M=scipy.sparse.csc_matrix((data, (row, col)), shape=(3, 3))
E=scipy.sparse.csc_matrix((3, 3)) #empty 3x3 sparse matrix

E[:,1]=M[:,0]

But I get a warning:

SparseEfficiencyWarning: Changing the sparsity structure of a csc_matrix is expensive. lil_matrix is more efficient.

This warning makes me worry that, in the process, the matrix is converted to another format and then back to csc, which would be inefficient. Can anyone confirm this, and is there a solution?

Recommended answer

The warning is telling you that setting new values in a csc (or csr) format matrix is complicated. Those formats aren't designed for incremental changes like this. The lil format is designed to make that kind of change quick and easy, especially changes within a single row.
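To see what the assignment actually does from the outside, here is a minimal sketch that repeats the toy example and records any warnings instead of printing them. It only inspects the result (format, stored values); it does not show SciPy's internals, and the variable names `caught` and `warning_names` are my own.

```python
import warnings

import numpy as np
import scipy.sparse as sparse

row = np.array([0, 2, 0, 1, 2])
col = np.array([0, 0, 2, 2, 2])
data = np.array([1, 2, 4, 5, 6])
M = sparse.csc_matrix((data, (row, col)), shape=(3, 3))
E = sparse.csc_matrix((3, 3), dtype=M.dtype)  # empty 3x3 csc matrix

# Record warnings raised by the assignment rather than letting them print.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    E[:, 1] = M[:, 0]

warning_names = [w.category.__name__ for w in caught]
print(warning_names)    # which warning classes were raised, if any
print(E.format, E.nnz)  # storage format and number of stored entries
print(E.toarray())
```

The result is still a csc matrix with the two nonzero values of the source column in column 1, so the assignment does what was asked; the warning is about cost, not correctness.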

Note that the coo format doesn't even implement this kind of indexing.

It isn't converting to lil and back, but doing so might actually be faster. We'd have to run some timing tests.

In [679]: %%timeit E=sparse.csr_matrix((3,3))
     ...: E[:,1] = M[:,0]
     ...: 
/usr/lib/python3/dist-packages/scipy/sparse/compressed.py:730: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  SparseEfficiencyWarning)
1000 loops, best of 3: 845 µs per loop
In [680]: %%timeit E=sparse.csr_matrix((3,3))
     ...: E1=E.tolil()
     ...: E1[:,1] = M[:,0]
     ...: E=E1.tocsc()
     ...: 
The slowest run took 4.22 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.42 ms per loop

In [682]: %%timeit E=sparse.lil_matrix((3,3))
     ...: E[:,1] = M[:,0]
     ...: 
1000 loops, best of 3: 804 µs per loop
In [683]: %%timeit E=sparse.lil_matrix((3,3));M1=M.tolil()
     ...: E[:,1] = M1[:,0]
     ...: 
     ...: 
1000 loops, best of 3: 470 µs per loop

In [688]: timeit M1=M.tolil()
The slowest run took 4.10 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 248 µs per loop

Notice that doing the assignment with lil on both sides is about 2x faster than doing it with a compressed format (470 µs vs. 845 µs above). But converting to and from lil takes time as well.

Warning or not, what you are doing is the fastest option for a one-time operation. But if you need to do this repeatedly, try to find a better way.
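One way to keep the conversion cost from dominating is to convert once, do all the assignments in lil, and convert back once at the end. The sketch below assumes that several columns need to be copied; the list `column_map` of (destination, source) pairs is a hypothetical example, not part of the original question.

```python
import numpy as np
import scipy.sparse as sparse

row = np.array([0, 2, 0, 1, 2])
col = np.array([0, 0, 2, 2, 2])
data = np.array([1, 2, 4, 5, 6])
M = sparse.csc_matrix((data, (row, col)), shape=(3, 3))

# Convert the source once and build the target directly in lil format.
M_lil = M.tolil()
E_lil = sparse.lil_matrix((3, 3), dtype=M.dtype)

# Hypothetical mapping: (destination column, source column).
column_map = [(1, 0), (0, 2)]
for dst, src in column_map:
    E_lil[:, dst] = M_lil[:, src]

# Convert back to csc once, after all assignments are done.
E = E_lil.tocsc()
print(E.toarray())
```

Whether this beats direct csc assignment depends on matrix size and on how many columns are copied, so it is worth timing on realistic data.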
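The answer does not say what a better way would look like. One possibility, offered here as a sketch and not as the answer's own recommendation, is to avoid item assignment altogether and assemble the result from column blocks with `scipy.sparse.hstack`. The name `empty_col` is illustrative.

```python
import numpy as np
import scipy.sparse as sparse

row = np.array([0, 2, 0, 1, 2])
col = np.array([0, 0, 2, 2, 2])
data = np.array([1, 2, 4, 5, 6])
M = sparse.csc_matrix((data, (row, col)), shape=(3, 3))

# A 3x1 block with no stored entries, used for the columns that stay empty.
empty_col = sparse.csc_matrix((3, 1), dtype=M.dtype)

# Build the result column by column: empty, M[:, 0], empty.
E = sparse.hstack([empty_col, M[:, 0], empty_col], format="csc")
print(E.toarray())
```

Because the matrix is built in one step, no sparsity structure is changed after construction and no SparseEfficiencyWarning should be triggered by the assembly itself.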

=================

Setting rows vs. columns doesn't make much difference.

In [835]: %%timeit E=sparse.csc_matrix((3,3))
     ...: E[:,1]=M[:,0]
  SparseEfficiencyWarning)
1000 loops, best of 3: 1.89 ms per loop

In [836]: %%timeit E=sparse.csc_matrix((3,3))
     ...: E[1,:]=M[0,:]    
  SparseEfficiencyWarning)
1000 loops, best of 3: 1.91 ms per loop
