scipy sparse matrix: remove the rows whose elements are all zero
Question
I have a sparse matrix produced by sklearn's TfidfVectorizer. I believe that some of its rows are all-zero rows, and I want to remove them. However, as far as I know, the existing built-in functions, e.g. nonzero() and eliminate_zeros(), deal with individual zero entries rather than whole rows.
Is there an easy way to remove all-zero rows from a sparse matrix?
Example: what I have now (actually in sparse format):
[ [0, 0, 0]
[1, 0, 2]
[0, 0, 1] ]
What I want to get:
[ [1, 0, 2]
[0, 0, 1] ]
Recommended answer
Slicing + getnnz() does the trick:
M = M[M.getnnz(1)>0]
This works directly on a csr_matrix.
You can also remove all-zero columns without changing formats:
M = M[:,M.getnnz(0)>0]
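As a minimal sketch of the two one-liners above, here they are applied to the example matrix from the question. It assumes the matrix is a scipy.sparse.csr_matrix; the variable names rows_kept and cols_kept are only illustrative. Note that getnnz() counts stored entries, so if the matrix might hold explicitly stored zeros, calling M.eliminate_zeros() first is probably a good idea.

```python
import numpy as np
from scipy.sparse import csr_matrix

# The example matrix from the question, stored in CSR format.
M = csr_matrix(np.array([[0, 0, 0],
                         [1, 0, 2],
                         [0, 0, 1]]))

# getnnz(axis=1) counts stored entries per row;
# the boolean mask keeps rows that have at least one.
rows_kept = M[M.getnnz(axis=1) > 0]

# getnnz(axis=0) counts stored entries per column;
# the boolean mask keeps columns that have at least one.
cols_kept = M[:, M.getnnz(axis=0) > 0]

print(rows_kept.toarray())  # all-zero first row is gone
print(cols_kept.toarray())  # all-zero middle column is gone
```

Both results stay sparse; toarray() is used here only to display them.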
However, if you want to remove both, you need
M = M[M.getnnz(1)>0][:,M.getnnz(0)>0] #GOOD
I am not sure why
M = M[M.getnnz(1)>0, M.getnnz(0)>0] #BAD
does not work.
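A likely explanation, assuming scipy follows NumPy's indexing rules here: when two index arrays are passed together, they are paired element by element rather than combined into a row/column cross product, so the two masks have to be applied one axis at a time. The sketch below wraps the chained form in a helper; the name drop_empty_rows_and_cols is hypothetical and not part of scipy, and a csr_matrix input is assumed.

```python
import numpy as np
from scipy.sparse import csr_matrix


def drop_empty_rows_and_cols(M):
    """Hypothetical helper: drop all-zero rows and columns of a csr_matrix."""
    # Compute both masks from the original matrix. Dropping empty rows
    # cannot change any column's count, so the column mask stays valid.
    row_mask = M.getnnz(axis=1) > 0
    col_mask = M.getnnz(axis=0) > 0
    # Apply the masks one axis at a time (rows first, then columns).
    return M[row_mask][:, col_mask]


# The example matrix from the question.
M = csr_matrix(np.array([[0, 0, 0],
                         [1, 0, 2],
                         [0, 0, 1]]))

trimmed = drop_empty_rows_and_cols(M)
print(trimmed.toarray())
```

On the question's example this leaves a 2x2 matrix, since one row and one column are empty.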