Difference between tf.data.Dataset.repeat() and iterator.initializer
Question
TensorFlow has tf.data.Dataset.repeat(x), which iterates through the data x times. It also has iterator.initializer, which can be used to restart the iteration once iterator.get_next() is exhausted. My question: is there any difference between using tf.data.Dataset.repeat(x) and using iterator.initializer?
Recommended answer
As we know, each epoch of model training passes over the whole dataset, broken into batches. Suppose we have a dataset with 100 samples. On every epoch, the 100 samples are split into 5 batches (of 20 each) and fed to the model. If I want to train the model for, say, 5 epochs, I need to repeat the dataset 5 times, so the repeated dataset contains 500 samples in total (100 samples multiplied by 5).
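The arithmetic above can be checked with a small pure-Python sketch. It does not use TensorFlow; it only imitates "repeat, then batch" with itertools, and the helper name batched is made up for this illustration.

```python
from itertools import chain, islice, repeat


def batched(iterable, batch_size):
    """Hypothetical helper: yield lists of up to batch_size items."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


samples = list(range(100))  # a stand-in dataset with 100 samples
num_epochs = 5
batch_size = 20

# "Repeat" the dataset num_epochs times, then cut the stream into batches.
repeated = chain.from_iterable(repeat(samples, num_epochs))
batches = list(batched(repeated, batch_size))

total_samples = sum(len(b) for b in batches)
print(len(batches), total_samples)  # 25 500
```

Five epochs of five batches each give 25 batches and 500 samples in total.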
This job is done by the tf.data.Dataset.repeat() method. Usually we pass the number of epochs (num_epochs) to it; the parameter itself is named count.
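As a minimal sketch of this, assuming TensorFlow 2.x with eager execution and a toy dataset built with tf.data.Dataset.range (both are assumptions, not part of the original answer), repeating and batching could look like this:

```python
import tensorflow as tf

num_epochs = 5   # assumed number of epochs
batch_size = 20  # assumed batch size

# Toy dataset of 100 samples (the integers 0..99).
dataset = tf.data.Dataset.range(100)

# Repeat the samples num_epochs times, then group them into batches.
repeated = dataset.repeat(num_epochs).batch(batch_size)

# In eager mode a dataset can be iterated like any Python iterable.
num_batches = sum(1 for _ in repeated)
print(num_batches)  # 25
```

The repeated dataset is one continuous stream of 25 batches, with no end-of-data signal between epochs.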
iterator.get_next() is just a way of getting the next element (here, the next batch) from the tf.data.Dataset. You iterate over the dataset batch by batch.
That's the difference: tf.data.Dataset.repeat() repeats the samples in the dataset, whereas iterator.get_next() fetches the data one batch at a time.
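For comparison with the question, here is a hedged sketch of the iterator.initializer approach. It assumes TensorFlow 2.x and uses the TF1-style graph API through tf.compat.v1; the toy dataset, batch size, and epoch count are illustrative choices. Instead of repeating the dataset, the iterator is re-initialized at the start of each epoch, and the end of an epoch shows up as tf.errors.OutOfRangeError.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style graph and session API

num_epochs = 5  # assumed number of epochs
batches_seen = 0

with tf.Graph().as_default():
    # 100 samples in batches of 20, with no repeat().
    dataset = tf.data.Dataset.range(100).batch(20)
    iterator = tf1.data.make_initializable_iterator(dataset)
    next_batch = iterator.get_next()

    with tf1.Session() as sess:
        for epoch in range(num_epochs):
            # Restart the iteration at the beginning of every epoch.
            sess.run(iterator.initializer)
            while True:
                try:
                    sess.run(next_batch)
                    batches_seen += 1
                except tf.errors.OutOfRangeError:
                    # The dataset is exhausted: this epoch is over.
                    break

print(batches_seen)  # 25
```

Both approaches deliver the same 25 batches. With the initializer, the training loop sees an explicit boundary after every epoch, which can be a convenient place for per-epoch work such as evaluation.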