Keras evaluate_generator accuracy and scikit-learn accuracy_score inconsistent
Problem description
I am using the Keras ImageDataGenerator class to load, train and predict. I tried the solutions here, but still have the issue. I am not sure whether I have the same issue as the one mentioned here. I guess my y_pred and y_test are not correctly mapped to each other.
validation_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation',
    shuffle='False')

validation_generator2 = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation',
    shuffle='False')

loss, acc = model.evaluate_generator(validation_generator,
                                     steps=math.ceil(validation_generator.samples / batch_size),
                                     verbose=0,
                                     workers=1)

y_pred = model.predict_generator(validation_generator2,
                                 steps=math.ceil(validation_generator2.samples / batch_size),
                                 verbose=0,
                                 workers=1)

y_pred = np.argmax(y_pred, axis=-1)
y_test = validation_generator2.classes[validation_generator2.index_array]

print('loss: ', loss, 'accuracy: ', acc) # loss: 0.47286026436090467 accuracy: 0.864
print('accuracy_score: ', accuracy_score(y_test, y_pred)) # accuracy_score: 0.095
The evaluate_generator from Keras and accuracy_score from scikit-learn give different accuracies. And of course this gives me a wrong confusion matrix when I use confusion_matrix(y_test, y_pred) from scikit-learn. What mistake am I making? (By y_test I mean y_true.)
Update:
To show that y_test and y_pred are inconsistent, I am printing the accuracy of each class.
cm = confusion_matrix(y_test, y_pred)
cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
cm.diagonal()
acc_each_class = cm.diagonal()

print('accuracy of each class: \n')
for i in range(len(labels)):
    print(labels[i], ' : ', acc_each_class[i])
print('\n')

'''
accuracy of each class:
cannoli : 0.085
dumplings : 0.065
edamame : 0.1
falafel : 0.125
french_fries : 0.12
grilled_cheese_sandwich : 0.13
hot_dog : 0.075
seaweed_salad : 0.085
tacos : 0.105
takoyaki : 0.135
'''
As can be seen, the accuracy of each class is too low.
Update 2: How I trained the model, in case it helps.
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='training')

validation_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation',
    shuffle='False')

validation_generator2 = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation',
    shuffle='False')

loss = CategoricalCrossentropy()

model.compile(optimizer=SGD(lr=lr, momentum=momentum),
              loss=loss,
              metrics=['accuracy'])

history = model.fit_generator(train_generator,
                              steps_per_epoch=train_generator.samples // batch_size,
                              validation_data=validation_generator,
                              validation_steps=validation_generator.samples // batch_size,
                              epochs=epochs,
                              verbose=1,
                              callbacks=[csv_logger, checkpointer],
                              workers=12)
Recommended answer
First of all, you should be using the same generator for both evaluate_generator and predict_generator, as stated by San.
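One plausible reason the generator order matters here: the question passes shuffle='False' as a string, and any non-empty string is truthy in Python, so a flag tested with a plain `if shuffle:` stays switched on. The sketch below is a pure-Python simulation, not Keras code. ToyGenerator, simulate and the idea that a new pass reshuffles the index array before the labels are read are all assumptions made for illustration. It shows that even a perfect model scores near chance (about 1 in 10 for ten balanced classes) once predictions and labels come from two different shuffles.

```python
import random

# Any non-empty string is truthy, so the string 'False' acts like True
# wherever the flag is checked with a plain `if shuffle:`.
assert bool('False') is True


class ToyGenerator:
    """Hypothetical stand-in for a directory iterator (not the Keras class)."""

    def __init__(self, labels, shuffle, seed):
        self.classes = list(labels)
        self.shuffle = shuffle
        self._rng = random.Random(seed)
        self.index_array = list(range(len(self.classes)))

    def reset(self):
        # Start a new pass; reshuffle the serving order if the flag is truthy.
        self.index_array = list(range(len(self.classes)))
        if self.shuffle:
            self._rng.shuffle(self.index_array)

    def labels_in_served_order(self):
        return [self.classes[i] for i in self.index_array]


def accuracy(y_true, y_pred):
    """Fraction of positions where the two label lists agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def simulate(shuffle):
    labels = [i % 10 for i in range(2000)]  # ten balanced classes
    gen = ToyGenerator(labels, shuffle, seed=0)
    gen.reset()
    # A perfect "model": it predicts the true label of every served sample.
    y_pred = gen.labels_in_served_order()
    # Assumption: a new pass starts before the labels are looked up.
    gen.reset()
    y_true = gen.labels_in_served_order()
    return accuracy(y_true, y_pred)


acc_string_flag = simulate(shuffle='False')  # string flag: order is shuffled
acc_bool_flag = simulate(shuffle=False)      # boolean flag: order is stable
print('string flag:', acc_string_flag, 'boolean flag:', acc_bool_flag)
```

If the same effect is at work in the question's code, the likely remedy is to pass the boolean shuffle=False, run predict_generator on a single generator (calling its reset() method first), and take y_true directly from that generator's classes attribute.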
Secondly, I think the accuracies reported by sklearn and Keras are not exactly the same: as stated in the sklearn documentation, accuracy_score in the multiclass case is really the Jaccard score.
This link shows the difference: https://stats.stackexchange.com/questions/255465/accuracy-vs-jaccard-for-multiclass-problem
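To gauge how far a Jaccard-style score can drift from plain accuracy, here is a minimal sketch. It assumes single-label predictions and a per-sample Jaccard average, which may not be the variant the linked discussion has in mind. Both helper functions are hypothetical and written for illustration; they are not sklearn's implementation.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def samplewise_jaccard(y_true, y_pred):
    """Mean over samples of intersection size divided by union size,
    treating each single label as a one-element set."""
    scores = []
    for t, p in zip(y_true, y_pred):
        a, b = {t}, {p}
        scores.append(len(a & b) / len(a | b))
    return sum(scores) / len(scores)


# Toy labels for a four-class problem (made up for this example).
y_true = [0, 1, 2, 2, 1, 0, 3, 3]
y_pred = [0, 2, 2, 2, 1, 3, 3, 0]

acc = accuracy(y_true, y_pred)
jac = samplewise_jaccard(y_true, y_pred)
print('accuracy:', acc, 'sample-wise Jaccard:', jac)
```

Under these assumptions each sample contributes either 1 (labels match) or 0 (they differ) to both scores, so the two numbers coincide. That suggests the choice of metric alone would not account for a gap as large as 0.864 versus 0.095, and that the ordering of y_pred and y_test deserves the first look.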