raise ValueError("Unknown label type: %s" % repr(ys)) ValueError: Unknown label type: (array


Problem description

I'm trying to apply a machine learning approach, but I'm running into some problems. This is my code:

import sys
import scipy
import numpy
import matplotlib
import pandas
import sklearn

from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt
from sklearn import model_selection
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

dataset = pandas.read_csv('Libro111.csv')
array = numpy.asarray(dataset,dtype=numpy.float64) #all values are float64

X = array[:,1:49]
Y = array[:,0]
validation_size = 0.2
seed = 7
X_train, X_validation, Y_train, Y_validation = model_selection.train_test_split(X, Y, test_size=validation_size, random_state=seed)

scoring = 'accuracy'
models = []
models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC()))
results = []
names = []
for name, model in models:
    kfold = model_selection.KFold(n_splits=10, random_state=seed)
    cv_results = model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)

Then I get two different errors.

For LogisticRegression:

File "C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\multiclass.py", line 172, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)

ValueError: Unknown label type: 'continuous'

I found someone who had the same problem, but I haven't been able to sort it out yet.

And (most importantly):

File "C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\multiclass.py", line 97, in unique_labels
    raise ValueError("Unknown label type: %s" % repr(ys))

ValueError: Unknown label type: (array([ 0.5,  0. ,  1. ,  1. ,  0.5,  0.5,  1. ,  0.5,  0. ,  0.5,  1. ,
        0. ,  0. ,  0. ,  1. ,  1......

In both cases the error occurs when I execute the "cv_results" line. I hope you can help me.

Recommended answer

"ValueError: Unknown label type: 'continuous'" means that your Y values are not class labels. A classifier expects a small set of discrete labels, typically integers, where many rows share the same label and each label stands for one class. scikit-learn treats a float array that contains any non-integer value (such as the 0.5 in your traceback) as a continuous target, so "DecisionTreeClassifier", "KNeighborsClassifier", "LogisticRegression" (do not be fooled by its name: LogisticRegression is a classification method) and every other classifier reject it. With Y left as it is, you can only use regression methods (e.g. "RandomForestRegressor").

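To see how scikit-learn classifies a target array, you can call the same helper the classifiers rely on, type_of_target. This is a minimal sketch; the label arrays are made-up values that only resemble the ones shown in the traceback, not the asker's actual data.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical labels resembling the values in the traceback (0, 0.5, 1).
y_float = np.array([0.5, 0.0, 1.0, 1.0, 0.5, 0.5, 1.0, 0.0])
kind_float = type_of_target(y_float)
print(kind_float)   # 'continuous': a float array with a non-integer value

# Floats that are all whole numbers are accepted as class labels.
y_whole = np.array([0.0, 1.0, 2.0, 1.0])
kind_whole = type_of_target(y_whole)
print(kind_whole)   # 'multiclass'

# Integer-coded version of the first array (0.5 steps become 0, 1, 2).
y_int = (y_float * 2).astype(int)
kind_int = type_of_target(y_int)
print(kind_int)     # 'multiclass'
```

Only the first array would trigger the "Unknown label type" error when passed to a classifier.
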
There are two solutions:

a) Group the Y values into bins (classes), then apply classification models to your data.

b) If you want your predictions to be numeric values (floats), use regression methods to predict Y.
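Both options can be sketched roughly as follows. The data here is synthetic (the real X and Y come from the CSV in the question), and the sketch assumes Y takes only a few distinct values such as 0, 0.5 and 1, so LabelEncoder is enough to turn them into classes. For a truly continuous Y you would need explicit bin edits instead, for example with numpy.digitize. The decision tree models are just one possible choice.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Synthetic stand-in data, not the asker's dataset.
rng = np.random.default_rng(7)
X = rng.normal(size=(60, 4))
Y = np.tile([0.0, 0.5, 1.0], 20)

# a) Map each distinct Y value to an integer class, then classify.
encoder = LabelEncoder()
Y_classes = encoder.fit_transform(Y)   # 0.0 -> 0, 0.5 -> 1, 1.0 -> 2
clf = DecisionTreeClassifier(random_state=7).fit(X, Y_classes)
# Convert predicted classes back to the original Y values.
Y_pred_a = encoder.inverse_transform(clf.predict(X))

# b) Keep Y as floats and fit a regressor instead.
reg = DecisionTreeRegressor(random_state=7).fit(X, Y)
Y_pred_b = reg.predict(X)
```

Option a) always predicts one of the original values, while option b) can in general predict any float.
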

By the way, the "scoring = 'accuracy'" evaluation metric applies only to classification models. For regression, use a regression metric such as 'r2' or 'neg_mean_squared_error'.
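As a rough illustration of matching the metric to the model type, the sketch below runs cross_val_score once with a regressor and once with a classifier. The data is synthetic and the choice of LinearRegression, five folds and 'neg_mean_squared_error' is an assumption made for the example. KFold is created with shuffle=True because recent scikit-learn versions reject a random_state when shuffling is off, and random_state is given as an integer.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data, not the asker's dataset.
rng = np.random.default_rng(7)
X = rng.normal(size=(90, 4))
Y = np.tile([0.0, 0.5, 1.0], 30)

kfold = KFold(n_splits=5, shuffle=True, random_state=7)

# Regression: continuous Y with a regression metric.
reg_scores = cross_val_score(
    LinearRegression(), X, Y, cv=kfold, scoring="neg_mean_squared_error"
)

# Classification: integer class labels with accuracy.
Y_classes = (Y * 2).astype(int)   # 0.0 -> 0, 0.5 -> 1, 1.0 -> 2
clf_scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, Y_classes, cv=kfold, scoring="accuracy"
)

print("regression MSE: %f" % -reg_scores.mean())
print("classification accuracy: %f" % clf_scores.mean())
```

The 'neg_mean_squared_error' scores are negated errors, so values closer to zero are better.
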
