LabelEncoder: TypeError: '>' not supported between instances of 'float' and 'str'


Problem description

I'm getting this error for multiple variables, even after treating missing values. For example:

le = preprocessing.LabelEncoder()
categorical = list(df.select_dtypes(include=['object']).columns.values)
for cat in categorical:
    print(cat)
    df[cat].fillna('UNK', inplace=True)
    df[cat] = le.fit_transform(df[cat])
#     print(le.classes_)
#     print(le.transform(le.classes_))


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-24-424a0952f9d0> in <module>()
      4     print(cat)
      5     df[cat].fillna('UNK', inplace=True)
----> 6     df[cat] = le.fit_transform(df[cat].fillna('UNK'))
      7 #     print(le.classes_)
      8 #     print(le.transform(le.classes_))

C:\Users\paula.ceccon.ribeiro\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py in fit_transform(self, y)
    129         y = column_or_1d(y, warn=True)
    130         _check_numpy_unicode_bug(y)
--> 131         self.classes_, y = np.unique(y, return_inverse=True)
    132         return y
    133 

C:\Users\paula.ceccon.ribeiro\AppData\Local\Continuum\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py in unique(ar, return_index, return_inverse, return_counts)
    209 
    210     if optional_indices:
--> 211         perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')
    212         aux = ar[perm]
    213     else:

TypeError: '>' not supported between instances of 'float' and 'str'

Checking one of the variables that led to the error gives:

df['CRM do Médico'].isnull().sum()
0

Besides NaN values, what could be causing this error?

Recommended answer

This happens because the series df[cat] contains elements of different data types, e.g. both strings and floats. LabelEncoder sorts the values to find the unique classes, and Python cannot compare a float with a str. The mix could come from the way the data is read (numbers are read as floats and text as strings), or the column was float and became mixed after the fillna operation.

In other words, the pandas dtype 'object' only means the column holds arbitrary Python objects, possibly of mixed types. It does not guarantee that every element is a str.
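To confirm that a column really mixes types, one option is to count the Python types of its elements. A zero null count does not rule this out, since a float such as 456.0 is not a missing value. The sketch below is only an illustration: the Series `s` and its values are made up and stand in for df[cat].

```python
import numpy as np
import pandas as pd

# Hypothetical column standing in for df[cat]: text and numbers side by side.
s = pd.Series(["A123", 456.0, "B789", 12.5], name="code")

print(s.dtype)           # object, even though not every element is a str
print(s.isnull().sum())  # 0, so missing values are not the problem here

# Count how many elements there are of each Python type.
type_counts = s.map(type).value_counts()
print(type_counts)

# More than one distinct type means the column is mixed.
mixed = s.map(type).nunique() > 1

# np.unique has to sort the values, which fails for float vs. str.
try:
    np.unique(s.to_numpy(), return_inverse=True)
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```

If `mixed` is true for a column, that column is a likely source of the error.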

So using the following line:

df[cat] = le.fit_transform(df[cat].astype(str))


should help.
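Put together, the loop from the question could look roughly like the sketch below. It is not a definitive implementation: the DataFrame and its column names are invented for illustration, and keeping one encoder per column in a dict (`encoders`) is an assumption added here so that the labels can be decoded later.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data: "code" mixes strings, a float and a missing value.
df = pd.DataFrame({
    "code": ["A123", 456.0, None, "A123"],
    "age": [30, 41, 25, 37],
})

encoders = {}  # one fitted encoder per column (an addition, not in the question)
for cat in df.select_dtypes(include=["object"]).columns:
    # Fill missing values first, then cast every element to str so that
    # the values can be sorted without comparing float and str.
    values = df[cat].fillna("UNK").astype(str)
    le = LabelEncoder()
    df[cat] = le.fit_transform(values)
    encoders[cat] = le

print(df)
print(encoders["code"].classes_)
```

Note that the cast turns the float 456.0 into the string '456.0', so a number and its textual form in the raw data may end up as different classes. Filling missing values before the cast matters too, because astype(str) would otherwise turn NaN into the string 'nan'.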
