Why do different batch sizes give different accuracy in Keras?


Question

I was using Keras' CNN to classify the MNIST dataset. I found that using different batch sizes gave different accuracies. Why is that?

Using batch size 1000 (Acc = 0.97600)

Using batch size 10 (Acc = 0.97599)

Although the difference is very small, why is there a difference at all? EDIT - I have found that the difference is only due to floating-point precision; the two accuracies are in fact equal.
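For reference, below is a minimal sketch of the kind of comparison described above. It is not the asker's actual code; the architecture, optimizer, and epoch count are assumptions chosen only to make the batch-size comparison runnable.

```python
# A minimal sketch (assumed architecture/optimizer, not the asker's code):
# train the same small CNN on MNIST with two batch sizes and compare accuracy.
import tensorflow as tf
from tensorflow import keras

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0
x_test = x_test[..., None].astype("float32") / 255.0

def build_model():
    return keras.Sequential([
        keras.Input(shape=(28, 28, 1)),
        keras.layers.Conv2D(32, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation="softmax"),
    ])

for batch_size in (1000, 10):
    model = build_model()
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, batch_size=batch_size, epochs=3, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"batch_size={batch_size}: test accuracy = {acc:.5f}")
```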

Answer

That is because of the mini-batch gradient descent effect during training. You can find a good explanation here; I quote some notes from that link below:

Batch size is a slider on the learning process.

  1. Small values give a learning process that converges quickly, at the cost of noise in the training process.
  2. Large values give a learning process that converges slowly, with accurate estimates of the error gradient.

Another important note from that link:

The presented results confirm that using small batch sizes achieves the best training stability and generalization performance, for a given computational cost, across a wide range of experiments. In all cases the best results have been obtained with batch sizes m = 32 or smaller.

This is the result of this paper.
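The trade-off quoted above can be made concrete with a small numerical illustration (my own sketch, not part of the original answer): for a fixed model and dataset, gradients estimated from batches of 10 samples scatter much more widely around the full-batch gradient than gradients estimated from batches of 1000 samples.

```python
# A small NumPy illustration (not from the answer): mini-batch gradients are
# noisier estimates of the full-batch gradient when the batch is small.
# A linear least-squares model is used purely for simplicity.
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

w = np.zeros(d)  # evaluate all gradients at the same arbitrary point

def gradient(Xb, yb, w):
    # Gradient of 0.5 * mean((Xb @ w - yb) ** 2) with respect to w
    return Xb.T @ (Xb @ w - yb) / len(yb)

full_grad = gradient(X, y, w)

for batch_size in (10, 1000):
    errors = []
    for _ in range(200):
        idx = rng.choice(n, size=batch_size, replace=False)
        errors.append(np.linalg.norm(gradient(X[idx], y[idx], w) - full_grad))
    print(f"batch_size={batch_size}: mean gradient-estimate error = {np.mean(errors):.4f}")
```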

EDIT

I should mention two more points here:

  1. Because of the inherent randomness in machine learning algorithms, you should generally not expect machine learning algorithms (such as deep learning algorithms) to produce identical results across different runs. You can find more details here. (A minimal seed-fixing sketch follows this list.)
  2. On the other hand, your two results are so close that they are effectively equal. So in your case, based on the reported results, we can say that batch size has no effect on your network's results.
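On the reproducibility point in item 1, here is a minimal seed-fixing sketch. It is a common pattern rather than a guaranteed recipe; which seeds matter depends on your TensorFlow/Keras version and hardware.

```python
# A minimal sketch (a common pattern, not a guaranteed recipe) of pinning the
# random seeds that usually cause run-to-run variation in Keras experiments.
import os
import random

import numpy as np
import tensorflow as tf

SEED = 42
os.environ["PYTHONHASHSEED"] = str(SEED)  # Python hash randomization
random.seed(SEED)                          # Python's built-in RNG
np.random.seed(SEED)                       # NumPy (e.g. data shuffling)
tf.random.set_seed(SEED)                   # TensorFlow weight init, dropout, etc.

# Even with fixed seeds some GPU ops stay non-deterministic; newer TensorFlow
# versions also offer tf.config.experimental.enable_op_determinism().
```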

