Transform labels back to original encoding


Problem description


I have a table like this:

           exterior_color interior_color  ... isTheftRecovered    price
0            Night Black        Unknown  ...                0  16995.0
1    Orca Black Metallic        Unknown  ...                0  17995.0
2        Brilliant Black        Unknown  ...                0   9995.0
3  Azores Green Metallic   Nougat Brown  ...                0  24495.0
4                  Black          Brown  ...                0  16990.0

Code:

from sklearn.preprocessing import LabelEncoder
from sklearn import tree
import pandas as pd
import numpy as np


data_frame = pd.read_csv()


le = LabelEncoder()


cols = ['exterior_color', 'interior_color', 'location', 'make', 'model', 'mileage', 'style', 'year', 'engine', 'accidentCount',
        'accidentCount', 'ownerCount', 'isCleanTitle', 'isFrameDamaged', 'isLemon', 'isSalvage', 'isTheftRecovered', 'price']


data_frame[cols] = data_frame[cols].apply(LabelEncoder().fit_transform)


exclude_price = data_frame[data_frame.columns.difference(['price'])]


clf = tree.DecisionTreeClassifier()
clf = clf.fit(exclude_price, data_frame.price)

my_data = ['Night Black', 'Unknown', 'Patchogue, NY', 'Audi', 'Q7', '5000', 'S-line 3.0T quattro',
           '2015', '2.0L Inline-4 Gas Turbocharged', '0', '5.0', '1', '1', '0', '0', '1']

new_data = le.fit_transform(my_data)

answer = clf.predict([new_data])

print(f"Car's price has been predicted ${answer[0]}")

This code label-encodes the DataFrame and then predicts the price for the given data, but the prediction comes back as an encoded label. I cannot transform that label back to its original value with inverse_transform to show the actual price.

Solution

By encoding as apply(LabelEncoder().fit_transform), you lose access to the encoder objects. Instead you can save them in an encoder dictionary keyed by column name:

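from collections import defaultdict
from sklearn.preprocessing import LabelEncoder

# One LabelEncoder per column, created on first lookup and kept for decoding later
encoder = defaultdict(LabelEncoder)

# df is the DataFrame from the question (data_frame there)
df[cols] = df[cols].apply(lambda x: encoder[x.name].fit_transform(x))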
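To see what the dictionary keeps around, here is a minimal sketch on a small made-up frame (the values are illustrative only, not the asker's CSV). It assumes nothing beyond pandas and scikit-learn: every column ends up with its own fitted encoder, whose classes_ can be inspected and whose inverse_transform restores the original values.

```python
from collections import defaultdict

import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy data for illustration only; not the asker's CSV.
df = pd.DataFrame({
    "make": ["Audi", "BMW", "Audi"],
    "price": [16995.0, 17995.0, 9995.0],
})
cols = ["make", "price"]

# defaultdict builds a fresh LabelEncoder the first time a column name is looked up.
encoder = defaultdict(LabelEncoder)
encoded = df[cols].apply(lambda x: encoder[x.name].fit_transform(x))

# Each fitted encoder remembers its own sorted classes.
make_classes = list(encoder["make"].classes_)

# Round trip: the integer codes decode back to the original values.
restored = encoded.apply(lambda x: encoder[x.name].inverse_transform(x))

print(make_classes)
print(restored)
```

Because the encoders are stored by column name, any column can be decoded later, not just price.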

And then decode the final price via encoder['price']:

decoded = encoder['price'].inverse_transform(answer)[0]
print(f"Car's price has been predicted as ${decoded:.2f}")

# Car's price has been predicted as $16995.00
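One more thing to watch in the question's code: le.fit_transform(my_data) fits a brand-new encoder on the sample itself, so its codes have no relation to the codes the classifier was trained on. A rough end-to-end sketch, assuming a toy frame with made-up rows and only two feature columns (both are illustration choices, not the asker's data), would encode the sample with each column's already-fitted encoder via transform:

```python
from collections import defaultdict

import pandas as pd
from sklearn import tree
from sklearn.preprocessing import LabelEncoder

# Toy data for illustration only; the real frame comes from the asker's CSV.
df = pd.DataFrame({
    "exterior_color": ["Night Black", "Brilliant Black", "Black"],
    "make": ["Audi", "Audi", "BMW"],
    "price": [16995.0, 9995.0, 16990.0],
})
cols = list(df.columns)
features = [c for c in cols if c != "price"]

# Fit one encoder per column and keep them all by column name.
encoder = defaultdict(LabelEncoder)
encoded = df[cols].apply(lambda x: encoder[x.name].fit_transform(x))

clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(encoded[features], encoded["price"])

# Encode the new sample column by column with the fitted encoders.
# transform (not fit_transform) reuses the codes learned during training.
sample = {"exterior_color": "Night Black", "make": "Audi"}
row = pd.DataFrame({c: encoder[c].transform([sample[c]]) for c in features})

answer = clf.predict(row)
decoded = encoder["price"].inverse_transform(answer)[0]
print(f"Car's price has been predicted as ${decoded:.2f}")
```

Note that LabelEncoder.transform raises a ValueError for a value it did not see during fitting, so every value in the sample has to appear in the training data for this to work.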
