One-hot encode a binary value in NumPy
Question
I have a numpy array that looks like the following:
array([[0],[1],[1]])
And I want it to be represented as the one hot encoded equivalent:
array([[1,0],[0,1],[0,1]])
Does anybody have any ideas? I tried using sklearn.preprocessing.LabelBinarizer, but this just reproduces the input.
Thanks.
Edit
As requested, here is the code using LabelBinarizer:
import numpy as np
from sklearn.preprocessing import LabelBinarizer
train_y = np.array([[0],[1],[1]])
lb = LabelBinarizer()
lb.fit(train_y)
label_vecs = lb.transform(train_y)
Output:
array([[0],[1],[1]])
Note that the documentation does state: 'Binary targets transform to a column vector'.
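As a side note on the behavior quoted above: when there are exactly two classes, the single 0/1 column can be widened to two columns by stacking its complement next to it. The sketch below is a minimal illustration, not part of the original post. It uses a hard-coded array as an assumed stand-in for the result of lb.transform(train_y), and the name two_col is hypothetical.

```python
import numpy as np

# Assumed stand-in for the (n, 1) column that lb.transform(train_y)
# returns for a two-class problem
label_vecs = np.array([[0], [1], [1]])

# Column 0 marks class 0 (the complement), column 1 marks class 1
two_col = np.hstack((1 - label_vecs, label_vecs))
print(two_col)
```

This only applies to the two-class case; with three or more classes the column-vector shortcut does not occur.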
Recommended answer
To use sklearn, it seems we could use OneHotEncoder, like so:
import numpy as np
from sklearn.preprocessing import OneHotEncoder
train_y = np.array([[0],[1],[1]]) # Input
enc = OneHotEncoder()
enc.fit(train_y)
# transform() returns a sparse matrix; toarray() makes it dense
out = enc.transform(train_y).toarray()
Sample input and output:
In [314]: train_y
Out[314]:
array([[0],
       [1],
       [1]])
In [315]: out
Out[315]:
array([[ 1.,  0.],
       [ 0.,  1.],
       [ 0.,  1.]])
In [320]: train_y
Out[320]:
array([[9],
       [4],
       [1],
       [6],
       [2]])
In [321]: out
Out[321]:
array([[ 0.,  0.,  0.,  0.,  1.],
       [ 0.,  0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.],
       [ 0.,  1.,  0.,  0.,  0.]])
Using initialization:
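def initialization_based(A): # A is the input array
    # Index of each value among the sorted unique values, flattened to 1D
    a = np.unique(A, return_inverse=True)[1].ravel()
    out = np.zeros((a.shape[0], a.max()+1), dtype=int)
    # Set one 1 per row, in the column given by that index
    out[np.arange(out.shape[0]), a] = 1
    return out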
Another approach, using broadcasting:
def broadcasting_based(A): # A is the input array
    a = np.unique(A, return_inverse=True)[1].ravel()
    # Compare each index against every column index, then cast booleans to int
    return (a[:,None] == np.arange(a.max()+1)).astype(int)
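Both NumPy approaches rely on np.unique to map each value to its position among the sorted unique values, which is why the columns for the input 9, 4, 1, 6, 2 follow the order 1, 2, 4, 6, 9. The sketch below makes that mapping explicit by also returning the labels. It is a variant written for illustration, not part of the original answer: the helper name one_hot_with_labels is hypothetical, and it indexes an identity matrix rather than using either method above.

```python
import numpy as np

def one_hot_with_labels(A):
    """Hypothetical helper: return (sorted unique labels, one-hot matrix)."""
    labels, inverse = np.unique(A, return_inverse=True)
    # Flatten so the result does not depend on the shape of the inverse array
    inverse = inverse.ravel()
    # Row i of the identity matrix is the one-hot vector for index i
    out = np.eye(labels.size, dtype=int)[inverse]
    return labels, out

labels, out = one_hot_with_labels(np.array([[9], [4], [1], [6], [2]]))
print(labels)  # column j of `out` corresponds to labels[j]
print(out)
```

Keeping the labels array around makes it possible to decode a one-hot row back to its original value with labels[row.argmax()].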