Scikit-learn's LabelBinarizer vs. OneHotEncoder

Question

What is the difference between the two? Both seem to create new columns, as many as there are unique categories in the feature, and then assign 0 or 1 to each data point depending on which category it belongs to.

Recommended answer

Below is a simple example that encodes an array using LabelEncoder, OneHotEncoder, and LabelBinarizer.

As far as I can see, OneHotEncoder needs the data in integer-encoded form before it can convert it to a one-hot encoding, whereas LabelBinarizer does not need that step. This held for scikit-learn releases before 0.20; from 0.20 on, OneHotEncoder also accepts string categories directly.

from numpy import array
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelBinarizer

# define example
data = ['cold', 'cold', 'warm', 'cold', 'hot', 'hot', 'warm', 'cold',
        'warm', 'hot']
values = array(data)
print("Data:", values)

# integer encode: each string label becomes an integer class index
label_encoder = LabelEncoder()
integer_encoded = label_encoder.fit_transform(values)
print("Label Encoder:", integer_encoded)

# one-hot encode: OneHotEncoder expects a 2-D array, so reshape to one column
onehot_encoder = OneHotEncoder()
integer_encoded = integer_encoded.reshape(len(integer_encoded), 1)
# the default output is a sparse matrix; toarray() makes it dense
# (this avoids the sparse= argument, which newer releases renamed)
onehot_encoded = onehot_encoder.fit_transform(integer_encoded).toarray()
print("OneHot Encoder:", onehot_encoded)

# binary encode: LabelBinarizer works directly on the string labels
lb = LabelBinarizer()
print("Label Binarizer:", lb.fit_transform(values))
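
On scikit-learn 0.20 or later, the integer-encoding step appears to be optional, because OneHotEncoder can take the string categories directly. The following is a minimal sketch under that version assumption, not part of the original answer; it reuses the same data and checks that the two encoders produce the same matrix for this three-category example.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer, OneHotEncoder

data = np.array(['cold', 'cold', 'warm', 'cold', 'hot',
                 'hot', 'warm', 'cold', 'warm', 'hot'])

# OneHotEncoder expects a 2-D array of shape (n_samples, n_features),
# so the single feature is reshaped into one column.
# Assumption: scikit-learn >= 0.20, where string categories are accepted.
encoder = OneHotEncoder()
onehot = encoder.fit_transform(data.reshape(-1, 1))

# The default output is a SciPy sparse matrix; toarray() makes it dense.
onehot_dense = onehot.toarray()

# LabelBinarizer takes the 1-D array as-is and returns a dense array.
binarized = LabelBinarizer().fit_transform(data)

print(encoder.categories_[0])                    # categories in sorted order
print(np.array_equal(onehot_dense, binarized))   # True for this data
```

Both encoders order the columns by sorted category (cold, hot, warm), which is why the outputs match here.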

Another good link that explains OneHotEncoder: Explain onehotencoder using python

There may be other valid differences between the two that someone with more expertise can explain.
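
Two such differences can be observed directly. The sketch below is not from the original answer and assumes scikit-learn 0.20 or later; the `features` array and its humidity column are made up for illustration. It shows that with exactly two classes LabelBinarizer returns a single 0/1 column while OneHotEncoder still returns one column per category, and that OneHotEncoder can encode several feature columns in one call, whereas LabelBinarizer is documented as a transformer for target labels.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer, OneHotEncoder

# Difference 1: with exactly two classes, LabelBinarizer returns a single
# 0/1 column, while OneHotEncoder returns one column per category.
two_class = np.array(['cold', 'hot', 'hot', 'cold'])
lb_out = LabelBinarizer().fit_transform(two_class)
ohe_out = OneHotEncoder().fit_transform(two_class.reshape(-1, 1)).toarray()
print(lb_out.shape)   # (4, 1)
print(ohe_out.shape)  # (4, 2)

# Difference 2: OneHotEncoder encodes every column of a 2-D feature matrix.
# The second (humidity) column is hypothetical example data.
features = np.array([['cold', 'dry'],
                     ['hot', 'wet'],
                     ['warm', 'dry']])
multi = OneHotEncoder().fit_transform(features).toarray()
print(multi.shape)    # (3, 5): 3 temperature + 2 humidity categories
```

In practice this suggests using OneHotEncoder for input features and LabelBinarizer for the target column, though which one fits depends on the pipeline.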