Scikit-learn LabelEncoder: IndexError: arrays used as indices must be of integer (or boolean) type

Question

I am trying to preprocess the Adult dataset for a classification task. I handle the categorical attributes with scikit-learn.

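from sklearn.preprocessing import LabelEncoder

# X is the object-dtype NumPy array that holds the dataset's features
labelencoder = LabelEncoder()
# Encode the first column in place; the integer codes are stored as objects
X[:,0] = labelencoder.fit_transform(X[:,0])
labelencoder.classes_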

Output:

array(['Federal-gov', 'Local-gov', 'Private', 'Self-emp-inc',
       'Self-emp-not-inc', 'State-gov', 'Without-pay'], dtype=object)

The encoded array:

X[:3]
array([[5, 'Bachelors', 'Under-Graduate', 'Never-married',
        'Adm-clerical', 'Not-in-family', 'White', 'Male',
        'United-States', 39.0, 77516.0, 13.0, 2174.0, 0.0, 40.0],
       [4, 'Bachelors', 'Under-Graduate', 'Married-civ-spouse',
        'Exec-managerial', 'Husband', 'White', 'Male', 'United-States',
        50.0, 83311.0, 13.0, 0.0, 0.0, 13.0],
       [2, 'HS-grad', 'HS-grad', 'Divorced', 'Handlers-cleaners',
        'Not-in-family', 'White', 'Male', 'United-States', 38.0,
        215646.0, 9.0, 0.0, 0.0, 40.0]], dtype=object)

Everything works up to this point. However, I need to see the original attribute values again, so I tried to recover them with the following:

original = labelencoder.inverse_transform(X[:,0])

I get this error:

IndexError                                Traceback (most recent call last)
<ipython-input-78-f8cf404b255a> in <module>
----> 1 original = labelencoder.inverse_transform(X[:,0])

D:\Anaconda\lib\site-packages\sklearn\preprocessing\label.py in inverse_transform(self, y)
    281                     "y contains previously unseen labels: %s" % str(diff))
    282         y = np.asarray(y)
--> 283         return self.classes_[y]
    284 
    285 

IndexError: arrays used as indices must be of integer (or boolean) type


Recommended answer

The error occurs because your array has dtype object. Even when you extract the first column, its dtype is still object (check X[:,0].dtype), whereas inverse_transform needs integer labels that it can use as indices into classes_. So to use inverse_transform, cast the column to int first:

original = labelencoder.inverse_transform(X[:,0].astype(int))

Output:

array(['a', 'b', 'c'], dtype=object)
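
To make the explanation above concrete, here is a minimal, self-contained sketch. The three-row array is a made-up stand-in for the question's data, not the real Adult dataset. The direct indexing into classes_ mirrors line 283 of the traceback. The exact exception that inverse_transform itself raises may differ between scikit-learn versions, so the sketch only triggers the failure through plain NumPy indexing.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data: mixing strings and floats forces dtype=object,
# just like the array shown in the question.
X = np.array(
    [
        ["State-gov", "Bachelors", 39.0],
        ["Self-emp-not-inc", "Bachelors", 50.0],
        ["Private", "HS-grad", 38.0],
    ],
    dtype=object,
)

labelencoder = LabelEncoder()
X[:, 0] = labelencoder.fit_transform(X[:, 0])

# The stored codes are integers, but the column keeps the parent array's dtype.
column_dtype = X[:, 0].dtype

# NumPy refuses to use an object-dtype array as an index, which is the
# operation the traceback points at (self.classes_[y]).
try:
    labelencoder.classes_[X[:, 0]]
    indexing_failed = False
except IndexError:
    indexing_failed = True

# Casting to int first makes the codes usable as indices again.
original = labelencoder.inverse_transform(X[:, 0].astype(int))

print(column_dtype)
print(indexing_failed)
print(original)
```

An alternative design is to keep the encoded column in a separate integer array instead of writing it back into the object array, so that no cast is needed later.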
