TensorFlow: nr. of epochs vs. nr. of training steps

Question

I have recently experimented with Google's seq2seq to set up a small NMT-system. I managed to get everything working, but I am still wondering about the exact difference between the number of epochs and the number of training steps of a model.

If I am not mistaken, one epoch consists of multiple training steps and is complete once the whole training set has been processed once. However, when I look at the documentation in Google's own NMT tutorial, I do not understand the difference between the two. Note the last line of the following snippet.

export DATA_PATH=

export VOCAB_SOURCE=${DATA_PATH}/vocab.bpe.32000
export VOCAB_TARGET=${DATA_PATH}/vocab.bpe.32000
export TRAIN_SOURCES=${DATA_PATH}/train.tok.clean.bpe.32000.en
export TRAIN_TARGETS=${DATA_PATH}/train.tok.clean.bpe.32000.de
export DEV_SOURCES=${DATA_PATH}/newstest2013.tok.bpe.32000.en
export DEV_TARGETS=${DATA_PATH}/newstest2013.tok.bpe.32000.de

export DEV_TARGETS_REF=${DATA_PATH}/newstest2013.tok.de
export TRAIN_STEPS=1000000

It seems to me that there is only a way to define the number of training steps, not the number of epochs of the model. Is it possible that the terminology overlaps and that it is therefore not necessary to define a number of epochs?

Recommended answer

An epoch consists of going through all your training samples once. One step (or iteration) refers to training on a single minibatch. So if you have 1,000,000 training samples and use a batch size of 100, one epoch is equivalent to 10,000 steps, with 100 samples per step.
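
The arithmetic above can be sketched in a few lines of Python. The function name `steps_per_epoch` is made up for this illustration and is not part of TensorFlow or seq2seq. The sketch also assumes that a final, smaller batch still counts as one step, which depends on how the input pipeline is configured.

```python
def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Return how many minibatch steps are needed to see every sample once.

    Hypothetical helper for illustration only; not a TensorFlow API.
    """
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")
    # Integer ceiling division: a trailing partial batch counts as one step
    # (assumption; some pipelines drop the remainder instead).
    return (num_samples + batch_size - 1) // batch_size


# Numbers taken from the answer: 1,000,000 samples, batch size 100.
print(steps_per_epoch(1_000_000, 100))  # 10000
```

When the sample count is not a multiple of the batch size, the result is rounded up, so 1,050 samples with a batch size of 100 would give 11 steps under this assumption.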

A high-level neural network framework may let you set either the number of epochs or the total number of training steps. You cannot set both independently, though, because for a fixed dataset size and batch size one directly determines the other.
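
To make that dependency concrete, here is a minimal sketch that converts between the two quantities. The helper names are hypothetical and not part of any framework. Pairing `TRAIN_STEPS=1000000` from the question with the answer's 1,000,000 samples and batch size of 100 is only an illustration, since the tutorial's actual dataset size and batch size are not given here.

```python
def epochs_from_steps(train_steps: int, num_samples: int, batch_size: int) -> float:
    """Return how many passes over the data a given step budget amounts to.

    Hypothetical helper for illustration only.
    """
    # Each step consumes batch_size samples; one epoch consumes num_samples.
    return train_steps * batch_size / num_samples


def steps_from_epochs(epochs: int, num_samples: int, batch_size: int) -> int:
    """Return the step budget for a whole number of epochs.

    Assumes a trailing partial batch counts as a full step.
    """
    per_epoch = (num_samples + batch_size - 1) // batch_size
    return epochs * per_epoch


# Illustrative values only: TRAIN_STEPS from the question,
# sample count and batch size from the answer.
print(epochs_from_steps(1_000_000, 1_000_000, 100))  # 100.0
print(steps_from_epochs(3, 1_000_000, 100))  # 30000
```

Under these assumed numbers, a budget of 1,000,000 steps would correspond to 100 epochs, which is why a configuration that exposes only a step count does not also need an epoch count.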
