How to go back from one-hot-encoded labels to a single column using sklearn?


Question

I have predicted some data using a model and am getting results like this:

[[0 0 0 ... 0 0 1]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 1]
 [0 0 0 ... 0 0 0]]

These are basically the one-hot-encoded labels of the target column. Now I want to get back to a single column of the original values. I used the lines below to do the encoding. How can I go back to a single column?

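from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Map the string labels in df.Candidate to integer codes
le_candidate = LabelEncoder()
df['candidate_encoded'] = le_candidate.fit_transform(df.Candidate)
# One-hot encode the integer codes and convert the sparse result to a dense array
candidate_ohe = OneHotEncoder()
Y = candidate_ohe.fit_transform(df.candidate_encoded.values.reshape(-1, 1)).toarray()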

Recommended answer

Use the inverse_transform method of LabelEncoder and OneHotEncoder:

import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

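s = pd.Series(['a', 'b', 'c'])
le = LabelEncoder()
# sparse_output replaced the sparse argument in scikit-learn 1.2; use sparse=False on older releases
ohe = OneHotEncoder(sparse_output=False)
s1 = le.fit_transform(s)  # integer codes
s2 = ohe.fit_transform(s.to_numpy().reshape(-1, 1))  # dense one-hot matrix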

What you have:

# s1 from LabelEncoder
array([0, 1, 2])

# s2 from OneHotEncoder
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

What you should do:

inv_s1 = le.inverse_transform(s1)
inv_s2 = ohe.inverse_transform(s2).ravel()

Output:

# inv_s1 == inv_s2 == s
array(['a', 'b', 'c'], dtype=object)
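
The answer above one-hot encodes the raw strings directly, whereas the question chains LabelEncoder and then OneHotEncoder on the integer codes. For that two-step pipeline, the inverse presumably has to be applied in reverse order. The following is a minimal sketch under that assumption; the candidates array is a hypothetical stand-in for df.Candidate, and the variable names codes_back, recovered, and recovered_argmax are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical stand-in for df.Candidate from the question
candidates = np.array(['alice', 'bob', 'carol', 'bob', 'alice'])

# Same two-step encoding as in the question
le_candidate = LabelEncoder()
codes = le_candidate.fit_transform(candidates)
candidate_ohe = OneHotEncoder()
Y = candidate_ohe.fit_transform(codes.reshape(-1, 1)).toarray()

# Step 1: one-hot rows -> integer codes (shape (n, 1), so flatten it)
codes_back = candidate_ohe.inverse_transform(Y).ravel().astype(int)
# Step 2: integer codes -> original string labels
recovered = le_candidate.inverse_transform(codes_back)

# Alternative: the column index of the 1 in each row equals the integer code,
# because LabelEncoder produces the codes 0..n_classes-1
recovered_argmax = le_candidate.inverse_transform(Y.argmax(axis=1))

print(recovered)
print(recovered_argmax)
```

Both routes assume every row contains exactly one 1. If a predicted row is all zeros, argmax returns 0 for it and would silently map it to the first class, so such rows likely need to be detected and handled separately (for example with Y.sum(axis=1) == 0).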
