What is a batch in TensorFlow?

Question

The introductory documentation I am reading (TOC here) uses the term batch (for instance here) without defining it.

Answer

Let's say you want to do digit recognition (MNIST) and you have defined the architecture of your network (a CNN). Now you can start feeding the images from the training data one by one into the network, get the prediction (up to this step it's called doing inference), compute the loss, compute the gradient, update the parameters of your network (i.e. the weights and biases), and then proceed with the next image... This way of training the model is sometimes called online learning.
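The online-learning loop above can be sketched in plain numpy. A toy linear model on synthetic data stands in for the CNN here; all names and sizes are illustrative, not TensorFlow API:

```python
# Online learning: update the parameters after every single example.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))              # 200 "images", 4 features each
true_w = np.array([1.0, -2.0, 0.5, 3.0])   # ground truth we hope to recover
y = X @ true_w + rng.normal(scale=0.1, size=200)

w = np.zeros(4)                            # the network's parameters
lr = 0.01                                  # learning rate
for xi, yi in zip(X, y):                   # one example at a time
    pred = xi @ w                          # inference
    grad = 2 * (pred - yi) * xi            # gradient of the squared error
    w -= lr * grad                         # update immediately, then move on

final_loss = np.mean((X @ w - y) ** 2)
```

Each example triggers a full inference-loss-gradient-update cycle, which is exactly why this style is slow and the gradients are noisy.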

But you want the training to be faster and the gradients to be less noisy, and you also want to take advantage of the power of GPUs, which are efficient at array operations (nD-arrays to be specific). So what you do instead is feed in, say, 100 images at a time (the choice of this size is up to you, i.e. it's a hyperparameter, and depends on your problem too). For instance, take a look at the picture below (author: Martin Gorner):

Here, since you're feeding in 100 images (28x28) at a time (instead of 1, as in the online-training case), the batch size is 100. Often this is called the mini-batch size, or simply a mini-batch.
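Concretely, a mini-batch is just one extra leading dimension on the tensor. A numpy sketch of the shapes (in real TensorFlow code the batch would typically come from something like tf.data.Dataset.batch(100)):

```python
import numpy as np

image = np.zeros((28, 28))            # one MNIST-sized image
batch = np.stack([image] * 100)       # a mini-batch of 100 images
print(batch.shape)                    # (100, 28, 28)

flat = batch.reshape(100, 28 * 28)    # flattened, as fed to a dense layer
print(flat.shape)                     # (100, 784)
```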

Also see the picture below (author: Martin Gorner):

Now the matrix multiplication works out perfectly fine, you take advantage of the highly optimized array operations, and hence you achieve a faster training time.
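The "matrix multiplication just works" point is visible directly in the shapes: the same weight matrix multiplies the whole batch in a single matmul. A numpy sketch (the layer sizes are illustrative, matching a 784-input, 10-class output layer):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 784))   # a mini-batch of 100 flattened images
W = rng.normal(size=(784, 10))    # weights of a 10-class output layer
b = np.zeros(10)                  # biases, broadcast over the batch

logits = X @ W + b                # one matmul for the entire batch
print(logits.shape)               # (100, 10): one prediction row per image
```

Nothing about W or b depends on the batch size; only the leading dimension of X and of the output changes.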

If you look at the picture above, it doesn't matter much whether you feed in 100, 256, 2048, or 10000 images (the batch size), as long as the batch fits in the memory of your (GPU) hardware. You'll simply get that many predictions.

But keep in mind that this batch size influences the training time, the error you achieve, the gradient noise, etc. There is no general rule of thumb for which batch size works best. Just try a few sizes and pick the one that works best for you. But try not to use large batch sizes, since they tend to overfit the data. People commonly use mini-batch sizes of 32, 64, 128, 256, 512, 1024, or 2048.
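One concrete way the batch size affects training time and the error curve is through the number of parameter updates per epoch (illustrative numbers, using the MNIST training-set size of 60000):

```python
import math

n_train = 60000                        # MNIST training-set size
for bs in (32, 128, 1024):
    steps = math.ceil(n_train / bs)    # gradient updates in one epoch
    print(f"batch size {bs}: {steps} updates per epoch")
```

A smaller batch means many more (noisier) updates per epoch; a larger batch means fewer, smoother updates, each processing more data, which is part of the trade-off you tune for.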

Bonus: to get a good grasp of how far you can push the batch size, give this paper a read: One weird trick for parallelizing convolutional neural networks.
