分类指标无法处理连续多输出和多标签指标目标的混合情况 [英] classification metrics can't handle a mix of continuous-multioutput and multi-label-indicator targets

查看:229
本文介绍了分类指标无法处理连续多输出和多标签指标目标的混合情况的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我创建了一个具有数字输入和单个类别输出的ANN,该类别被热编码为19个类别之一.我将输出层设置为19个单位.我现在不知道如何执行混淆矩阵,也不知道如何根据而不是单个二进制输出来分类器(classifier.predict()).我不断收到错误消息,说分类指标无法处理连续多输出和多标签指标目标的混合情况.不确定如何继续.

I have created an ANN with numerical inputs and a single categorical output which is one hot encoded to be 1 of 19 categories. I set my output layer to have 19 units. I don't know how to perform the confusion matrix now nor how to classifier.predict() in light of this rather than a single binary output. I keep getting an error saying classification metrics can't handle a mix of continuous-multioutput and multi-label-indicator targets. Not sure how to proceed.

#Importing Datasets
dataset=pd.read_csv('Data.csv')
x = dataset.iloc[:,1:36].values # lower bound independent variable to upper bound in a matrix (in this case only 1 column 'NC')
y = dataset.iloc[:,36:].values # dependent variable vector
print(x.shape)
print(y.shape)

#One Hot Encoding fuel rail column
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_y= LabelEncoder()
y[:,0]=labelencoder_y.fit_transform(y[:,0])
onehotencoder= OneHotEncoder(categorical_features=[0])
y = onehotencoder.fit_transform(y).toarray()
print(y[:,0:])

print(x.shape)
print (y.shape)


#splitting data into Training and Test Data
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x,y,test_size=0.1,random_state=0)

#Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
#x_train = sc.fit_transform(x_train)
#x_test=sc.transform(x_test)
y_train = sc.fit_transform(y_train)
y_test=sc.transform(y_test)

# PART2 - Making ANN, deep neural network

#Importing the Keras libraries and packages
import keras
from keras.models import Sequential
from keras.layers import Dense


#Initialising ANN
classifier = Sequential()
#Adding the input layer and first hidden layer
classifier.add(Dense(activation= 'relu', input_dim =35, units=2, kernel_initializer="uniform"))#rectifier activation function, include all input with one hot encoding
#Adding second hidden layer
classifier.add(Dense(activation= 'relu', units=2, kernel_initializer="uniform")) #rectifier activation function
#Adding the Output Layer
classifier.add(Dense(activation='softmax', units=19, kernel_initializer="uniform")) 
#Compiling ANN - stochastic gradient descent
classifier.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])#stochastic gradient descent

#Fit ANN to training set

#PART 3 - Making predictions and evaluating the model
#Fitting classifier to the training set
classifier.fit(x_train, y_train, batch_size=10, epochs=100)#original batch is 10 and epoch is 100

#Predicting the Test set rules
y_pred = classifier.predict(x_test)
y_pred = (y_pred > 0.5) #greater than 0.50 on scale 0 to 1
print(y_pred)

#Making confusion matrix that checks accuracy of the model
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

推荐答案

y_pred = (y_pred > 0.5) 

输出一个布尔矩阵.问题在于它具有与以前相同的形状,但是当您评估准确性时,您需要一个标签向量.

Outputs a boolean matrix. The problem is that it has the same shape as it had before, but when you evaluate accuracy you need a vector of labels.

为此,请使用np.argmax(y_pred, axis=1)代替,以输出正确的标签.

To do this take np.argmax(y_pred, axis=1) instead to output correct labels.

这篇关于分类指标无法处理连续多输出和多标签指标目标的混合情况的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆