Transform labels back to original encoding
Problem description
I have a table like this:
   exterior_color         interior_color  ...  isTheftRecovered    price
0  Night Black            Unknown         ...                 0  16995.0
1  Orca Black Metallic    Unknown         ...                 0  17995.0
2  Brilliant Black        Unknown         ...                 0   9995.0
3  Azores Green Metallic  Nougat Brown    ...                 0  24495.0
4  Black                  Brown           ...                 0  16990.0
Code:
from sklearn.preprocessing import LabelEncoder
from sklearn import tree
import pandas as pd
import numpy as np
data_frame = pd.read_csv()
le = LabelEncoder()
cols = ['exterior_color', 'interior_color', 'location', 'make', 'model', 'mileage', 'style', 'year', 'engine', 'accidentCount',
'accidentCount', 'ownerCount', 'isCleanTitle', 'isFrameDamaged', 'isLemon', 'isSalvage', 'isTheftRecovered', 'price']
data_frame[cols] = data_frame[cols].apply(LabelEncoder().fit_transform)
exclude_price = data_frame[data_frame.columns.difference(['price'])]
clf = tree.DecisionTreeClassifier()
clf = clf.fit(exclude_price, data_frame.price)
my_data = ['Night Black', 'Unknown', 'Patchogue, NY', 'Audi', 'Q7', '5000', 'S-line 3.0T quattro',
'2015', '2.0L Inline-4 Gas Turbocharged', '0', '5.0', '1', '1', '0', '0', '1']
new_data = le.fit_transform(my_data)
answer = clf.predict([new_data])
print(f"Car's price has been predicted ${answer[0]}")
This code label-encodes the DataFrame and then predicts the price for the given data, but I cannot transform the labels back to the original encoding, i.e. use inverse_transform to show the actual price.
Solution
By encoding as apply(LabelEncoder().fit_transform), you lose access to the encoder objects. Instead, you can save them in an encoder dictionary keyed by column name:
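from collections import defaultdict
from sklearn.preprocessing import LabelEncoder

# One LabelEncoder per column, created on first access and kept for later decoding.
# df is the DataFrame from the question (named data_frame there).
encoder = defaultdict(LabelEncoder)
df[cols] = df[cols].apply(lambda x: encoder[x.name].fit_transform(x))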
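To see what the dictionary buys you, here is a minimal round-trip sketch on a made-up toy DataFrame (the values and the names toy, encoded and restored are illustrative, not from the question's CSV). Each column gets its own fitted encoder, so each column can be mapped back to its original values independently:

```python
from collections import defaultdict

import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy data, made up for illustration only
toy = pd.DataFrame({
    "exterior_color": ["Night Black", "Black", "Night Black"],
    "price": [16995.0, 16990.0, 9995.0],
})
cols = ["exterior_color", "price"]

# One fitted encoder per column, stored under the column name
encoder = defaultdict(LabelEncoder)
encoded = toy[cols].apply(lambda x: encoder[x.name].fit_transform(x))

# Each encoder remembers the sorted unique values it saw
print(encoder["price"].classes_)

# Map every column back to its original values with its own encoder
restored = encoded.apply(lambda x: encoder[x.name].inverse_transform(x))
print(restored)
```

Because the encoders live in the dictionary rather than being discarded inside apply, inverse_transform is available for any column afterwards.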
And then decode the final price via encoder['price']:
decoded = encoder['price'].inverse_transform(answer)[0]
print(f"Car's price has been predicted as ${decoded:.2f}")
# Car's price has been predicted as $16995.00
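One more thing worth checking in the question's code: le.fit_transform(my_data) fits a fresh encoder on the new row alone, so the resulting codes are unrelated to the codes the classifier was trained on. A possible way around this, sketched below on made-up toy data, is to reuse the stored per-column encoders with transform instead. This assumes every value in the new row was seen during fitting, since LabelEncoder.transform raises a ValueError for unseen labels. The names feature_cols, encoded and new_row are illustrative.

```python
from collections import defaultdict

import pandas as pd
from sklearn import tree
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for the question's CSV (values are made up)
df = pd.DataFrame({
    "exterior_color": ["Night Black", "Black", "Brilliant Black", "Black"],
    "make": ["Audi", "BMW", "Audi", "BMW"],
    "price": [16995.0, 16990.0, 9995.0, 16990.0],
})
cols = list(df.columns)
feature_cols = [c for c in cols if c != "price"]

# Fit one encoder per column and keep them for later
encoder = defaultdict(LabelEncoder)
encoded = df[cols].apply(lambda x: encoder[x.name].fit_transform(x))

clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(encoded[feature_cols], encoded["price"])

# New sample as a one-row DataFrame so values stay tied to column names
new_row = pd.DataFrame([{"exterior_color": "Night Black", "make": "Audi"}])

# transform (not fit_transform) reuses the codes learned during training
new_encoded = new_row[feature_cols].apply(lambda x: encoder[x.name].transform(x))

answer = clf.predict(new_encoded)
decoded = encoder["price"].inverse_transform(answer)[0]
print(f"Car's price has been predicted as ${decoded:.2f}")
```

Keeping the new sample in a DataFrame with named columns also guards against ordering mistakes: columns.difference(['price']) in the question returns the feature columns in sorted order, which is not the order in which my_data lists its values.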