Binarize a sparse matrix in Python in a different way
Question
Assume I have a matrix like:
4 0 3 5
0 2 6 0
7 0 1 0
I want to binarize it into:
0 0 0 0
0 1 0 0
0 0 1 0
That is, with the threshold set to 2, any element greater than the threshold is set to 0, and any element less than or equal to the threshold (except 0) is set to 1.
Can this be done on a SciPy csr_matrix or any other sparse matrix in Python?
I know scikit-learn offers Binarizer, which replaces values below or equal to the threshold with 0 and values above it with 1.
Recommended answer
When dealing with a sparse matrix s, avoid inequalities that are true at zero. A sparse matrix (if you're using it appropriately) has a great many zeros, so a mask marking every zero location would be huge. Avoid s <= 2, for example. Use inequalities that select away from zero instead.
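To make the cost concrete, the sketch below compares how many entries each kind of boolean mask has to store. It is only an illustration: the 200 x 200 diagonal matrix and the names away_from_zero and includes_zero are assumptions made for the example, not part of the original answer.

```python
import warnings
from scipy import sparse

# Illustrative input (an assumption): a 200 x 200 matrix with 3.0 on the diagonal.
s = sparse.eye(200, format="csr") * 3

# This comparison is False at every zero, so the mask stays as sparse as s.
away_from_zero = s > 2

# This comparison is True at every zero, so the mask is almost completely filled.
# SciPy emits a SparseEfficiencyWarning here; it is silenced only for the demo.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", sparse.SparseEfficiencyWarning)
    includes_zero = s <= 2

print("stored entries in s:     ", s.nnz)
print("stored entries in s > 2: ", away_from_zero.nnz)
print("stored entries in s <= 2:", includes_zero.nnz)
```

The second mask stores one entry for nearly every cell of the matrix, which is exactly what a sparse format is meant to avoid. The solution for the matrix in the question applies the same idea: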
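import numpy as np
from scipy import sparse

s = sparse.csr_matrix(np.array([[4, 0, 3, 5],
                                [0, 2, 6, 0],
                                [7, 0, 1, 0]]))
print(repr(s))
# <3x4 sparse matrix of type '<class 'numpy.int64'>'
#     with 7 stored elements in Compressed Sparse Row format>
# (the exact wording depends on the SciPy version and platform)

s[s > 2] = 0         # entries above the threshold become 0
s[s != 0] = 1        # every remaining nonzero entry becomes 1
s.eliminate_zeros()  # optional: drop the zeros still stored explicitly
print(s.todense())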
which yields:
matrix([[0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0]])
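An alternative that avoids sparse boolean indexing altogether is to work on the array of stored values directly. The sketch below is one way to do that, not the method from the answer above; the function name binarize_small_nonzeros is hypothetical, and it assumes the input can be converted to CSR format.

```python
import numpy as np
from scipy import sparse

def binarize_small_nonzeros(matrix, threshold):
    """Return a CSR copy with 1 where 0 != value <= threshold, and 0 elsewhere.

    Hypothetical helper for illustration; only the stored values are touched.
    """
    out = sparse.csr_matrix(matrix, copy=True)
    # Explicitly stored zeros must stay zero, hence the != 0 check.
    keep = (out.data != 0) & (out.data <= threshold)
    out.data = keep.astype(out.dtype)
    # Remove the entries that were just set to zero from the sparse structure.
    out.eliminate_zeros()
    return out

s = sparse.csr_matrix(np.array([[4, 0, 3, 5],
                                [0, 2, 6, 0],
                                [7, 0, 1, 0]]))
result = binarize_small_nonzeros(s, threshold=2)
print(result.toarray())
```

Because only the stored values are inspected, the work is proportional to the number of stored entries rather than to the full shape of the matrix. The input matrix is left unchanged, since the function operates on a copy.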