Why does shuffling my validation set in Keras change my model's performance?
Question
What's confusing me:
If I test my model on examples [A, B, C], it will obtain a certain accuracy. If I test the same model on examples [C, B, A], it should obtain the same accuracy. In other words, shuffling the examples shouldn't change my model's accuracy. But that's what seems to be happening below:
Step by step:
Here is where I train the model:
model.fit_generator(batches, batches.nb_sample, nb_epoch=1, verbose=2,
                    validation_data=val_batches,
                    nb_val_samples=val_batches.nb_sample)
Here is where I test the model, without shuffling the validation set:
gen = ImageDataGenerator()
results = []
for _ in range(3):
    val_batches = gen.flow_from_directory(path+"valid", batch_size=batch_size*2,
                                          target_size=target_size, shuffle=False)
    result = model.evaluate_generator(val_batches, val_batches.nb_sample)
    results.append(result)
Here are the results (val_loss, val_acc):
[2.8174608421325682, 0.17300000002980231]
[2.8174608421325682, 0.17300000002980231]
[2.8174608421325682, 0.17300000002980231]
Notice that the validation accuracies are the same.
Here is where I test the model, with a shuffled validation set:
results = []
for _ in range(3):
    val_batches = gen.flow_from_directory(path+"valid", batch_size=batch_size*2,
                                          target_size=target_size, shuffle=True)
    result = model.evaluate_generator(val_batches, val_batches.nb_sample)
    results.append(result)
Here are the results (val_loss, val_acc):
[2.8174608802795409, 0.17299999999999999]
[2.8174608554840086, 0.1730000001192093]
[2.8174608268737793, 0.17300000059604645]
Notice that the validation accuracies are inconsistent, despite an unchanged validation set and an unchanged model. What's going on?
Note:
I'm evaluating on the entire validation set each time. model.evaluate_generator returns after evaluating the model on val_batches.nb_sample examples, which is the number of examples in the validation set.
Answer
This is a really interesting problem. The answer is that neural networks use the float32 format, which is not as precise as float64; fluctuations like these are simply rounding noise at float32 precision.
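To see why the evaluation order matters at all: floating-point addition is not associative, so averaging the same per-example losses in a different order can round differently at each step. A minimal sketch of this effect (plain NumPy standing in for Keras's internal accumulation; the random "losses" are made up for illustration):

```python
import numpy as np

# Floating-point addition is not associative: grouping changes the result.
print((0.1 + 0.2) + 0.3)   # 0.6000000000000001
print(0.1 + (0.2 + 0.3))   # 0.6

# The same per-example "losses" accumulated in two different orders:
rng = np.random.default_rng(0)
losses = rng.random(10_000).astype(np.float32)

total_a = np.float32(0.0)
for x in losses:
    total_a += x

shuffled = losses.copy()
rng.shuffle(shuffled)
total_b = np.float32(0.0)
for x in shuffled:
    total_b += x

# The two sums agree to roughly float32 precision, but typically not
# bit-for-bit, because each shuffle produces a different rounding sequence.
print(total_a, total_b, abs(float(total_a) - float(total_b)))
```

With shuffle=False the batches arrive in the same order every run, so the rounding sequence is identical and the loss is bit-for-bit reproducible; with shuffle=True the order changes and the last digits wobble.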
In the case of your loss, you may notice that the differences occur after the 7th decimal digit of the fractional part, which is exactly the precision of the float32 format. So, basically, you may assume that all the numbers shown in your example are equal in terms of their float32 representation.
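You can check this claim directly: casting the reported losses to float32 collapses them to a single value. A small sketch using NumPy:

```python
import numpy as np

# float32 machine epsilon is ~1.19e-07, i.e. roughly 7 significant
# decimal digits of precision.
print(np.finfo(np.float32).eps)

# The three "different" losses from the shuffled runs, plus the repeated
# loss from the unshuffled runs:
losses = [
    2.8174608802795409,
    2.8174608554840086,
    2.8174608268737793,
    2.8174608421325682,
]

# Rounding each to the nearest float32 yields one and the same number.
as_f32 = [np.float32(x) for x in losses]
print(as_f32)
assert len(set(as_f32)) == 1
```

So the shuffled runs are not really producing different losses; they differ only below the resolution of the float32 values the model computes with.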