What is the relationship between steps and epochs in TensorFlow?

Question

I am going through the TensorFlow getting-started tutorial. In the tf.contrib.learn example, there are these two lines of code:

input_fn = tf.contrib.learn.io.numpy_input_fn({"x":x}, y, batch_size=4, num_epochs=1000)
estimator.fit(input_fn=input_fn, steps=1000)

I am wondering what the difference is between the steps argument in the call to fit and the num_epochs argument in the call to numpy_input_fn. Shouldn't there be just one argument? How are they connected?

I have found that, in the tutorial's toy example, the code somehow takes the minimum of these two as the number of steps.

At the very least, one of the two parameters, num_epochs or steps, has to be redundant, since we can calculate one from the other. Is there a way to find out how many steps (the number of times the parameters get updated) my algorithm actually took?

I am curious which one takes precedence, and whether that depends on any other parameters.

Recommended answer

TL;DR: An epoch is one pass of your model through your whole training data. A step is one round of training on a single batch (or on a single sample, if you send samples one by one). Training for 5 epochs on 1000 samples with 10 samples per batch takes 500 steps.
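
The arithmetic behind that figure can be sketched in a few lines of plain Python. This is not a TensorFlow API; the function names are made up for illustration, and the sketch assumes that a final, smaller batch still counts as one step, which in practice depends on how the input pipeline is configured.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    # Assumption: a final partial batch still counts as one step,
    # hence the ceiling.
    return math.ceil(num_samples / batch_size)

def total_steps(num_samples, batch_size, num_epochs):
    # Every epoch is one full pass, so it costs steps_per_epoch steps.
    return steps_per_epoch(num_samples, batch_size) * num_epochs

print(total_steps(num_samples=1000, batch_size=10, num_epochs=5))  # 500
```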

The contrib.learn.io module is not documented very well, but it seems that the numpy_input_fn() function takes some numpy arrays and batches them together as input for a classifier. So the number of epochs probably means "how many times to go through the input data I have before stopping". In this case, they feed two arrays of length 4 in 4-element batches, so each batch is one full pass over the data, and the input function will produce at most 1000 batches before raising an "out of data" exception. The steps argument of the estimator's fit() function is how many times the estimator should run the training loop. This particular example is somewhat perverse, so let me make up another one to make things a bit clearer (hopefully).

Let's say you have two numpy arrays (samples and labels) that you want to train on, with 100 elements each. You want your training to take batches of 10 samples. After 10 batches you will have gone through all of your training data. That is one epoch. If you set your input generator to 10 epochs, it will go through your training set 10 times before stopping, that is, it will generate at most 100 batches.
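
A minimal sketch of such an input generator, in plain Python rather than the contrib.learn.io API, looks like this. The name batch_generator and the made-up labels are assumptions for illustration only; plain lists stand in for the numpy arrays.

```python
def batch_generator(samples, labels, batch_size, num_epochs):
    """Yield (sample_batch, label_batch) pairs for num_epochs passes."""
    for _ in range(num_epochs):
        # One epoch: walk over the data once, batch_size items at a time.
        for start in range(0, len(samples), batch_size):
            end = start + batch_size
            yield samples[start:end], labels[start:end]

samples = list(range(100))
labels = [2 * s for s in samples]  # made-up labels, illustration only

batches = list(batch_generator(samples, labels, batch_size=10, num_epochs=10))
print(len(batches))        # 100 batches in total
print(len(batches[0][0]))  # 10 samples in each batch
```

Once the generator is exhausted there is nothing left to train on, which is the "out of data" situation, no matter how many steps were requested.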

Again, the io module is not documented, but considering how other input-related APIs in TensorFlow work, it should be possible to make it generate data for an unlimited number of epochs, so that the only thing controlling the length of training is the number of steps. This gives you some extra flexibility in how you want your training to progress: you can go a number of epochs at a time, a number of steps at a time, or both.
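
How the two limits interact can be sketched with a toy training loop, again in plain Python and not the estimator's actual implementation. The names endless_batches and train are hypothetical. Training stops at whichever comes first, the requested number of steps or the end of the input, and the loop returns the number of steps it really took.

```python
import itertools

def endless_batches(samples, batch_size):
    """Yield batches forever, cycling over the data (no epoch limit)."""
    starts = range(0, len(samples), batch_size)
    for start in itertools.cycle(starts):
        yield samples[start:start + batch_size]

def train(batches, steps):
    """Consume at most `steps` batches; return how many were used."""
    steps_taken = 0
    for _batch in itertools.islice(batches, steps):
        steps_taken += 1  # a real loop would update the parameters here
    return steps_taken

data = list(range(100))

# Unlimited input: the steps argument is the only limit.
print(train(endless_batches(data, batch_size=10), steps=250))  # 250

# Finite input: 4 samples, batch size 4, 1000 epochs gives 1000 batches,
# so asking for 5000 steps still stops after 1000.
finite = (data[:4] for _ in range(1000))
print(train(finite, steps=5000))  # 1000
```

This matches the behaviour observed in the question: the effective number of steps is the minimum of the steps argument and the number of batches the input function can supply.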
