Sklearn trying to convert string list to floats

Problem description

I am trying to get sklearn.svm.SVC(kernel="linear") to work. My X is an array built with [misc.imread(each).flatten() for each in filenames], and my y2 is a slice of a list of strings such as ["A","1","4","F"..].

When I call clf.fit(X,y2), sklearn tries to convert my list of strings to floats and fails with ValueError: could not convert string to float. How can I solve this?

Upgrading sklearn to 0.15 solved the problem.
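
For reference, here is a minimal sketch of what works after the upgrade: SVC is fitted on string labels directly, with no manual conversion. The tiny two-cluster X below is a hypothetical stand-in for the flattened image arrays, and the labels are shortened to two classes for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical stand-in for the flattened images: two well-separated clusters.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])
y2 = ["A", "A", "A", "F", "F", "F"]  # string labels, passed as-is

clf = SVC(kernel="linear")
clf.fit(X, y2)  # no float conversion of the labels is attempted

# Predictions come back as the original string labels.
pred = clf.predict(np.array([[0.05, 0.05], [5.05, 5.05]]))
print(pred)
```

The fitted classifier keeps the distinct labels in clf.classes_, so predict returns strings rather than integer codes.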

Recommended answer

scikit-learn has a helper class that handles this nicely, sklearn.preprocessing.LabelEncoder. It maps each distinct label to an integer code:

from sklearn.preprocessing import LabelEncoder
y2 = ["A","1","4","F","A","1","4","F"]
lb = LabelEncoder()
y = lb.fit_transform(y2)
# y is now: array([2, 0, 1, 3, 2, 0, 1, 3])

To get back to your original labels (e.g. after classifying unseen data with SVC), use LabelEncoder's inverse_transform to restore the string labels:

lb.inverse_transform(y)
# => array(['A', '1', '4', 'F', 'A', '1', '4', 'F'], dtype='|S1')
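
Putting the two steps together, the round trip might look like the sketch below: encode the string labels, train SVC on the integer codes, then decode the predictions. The feature arrays X and X_new are made-up placeholders for the flattened images, not data from the question.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

# Hypothetical features: three well-separated clusters, one per label.
X = np.array([[0.0, 0.0], [0.2, 0.1],
              [5.0, 5.0], [5.2, 5.1],
              [10.0, 0.0], [10.2, 0.1]])
y2 = ["A", "A", "1", "1", "F", "F"]

lb = LabelEncoder()
y = lb.fit_transform(y2)  # string labels -> integer codes

clf = SVC(kernel="linear")
clf.fit(X, y)

# Hypothetical unseen samples, one near each cluster.
X_new = np.array([[0.1, 0.1], [5.1, 5.0], [10.1, 0.0]])
pred_codes = clf.predict(X_new)                  # integer codes
pred_labels = lb.inverse_transform(pred_codes)   # back to strings
print(pred_labels)
```

Keep the same fitted lb object around for decoding; a freshly fitted encoder is only guaranteed to produce the same mapping if it sees the same set of labels.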
