Difference between tf.data.Dataset.repeat() and iterator.initializer

Question

TensorFlow has tf.data.Dataset.repeat(x), which iterates through the data x times. It also has iterator.initializer: once iterator.get_next() is exhausted, running iterator.initializer restarts the iteration. Is there any difference between using tf.data.Dataset.repeat(x) and using iterator.initializer?

Recommended answer

During training, each epoch takes in the whole dataset and breaks it into batches. Suppose we have a dataset of 100 samples. In every epoch, the 100 samples are split into 5 batches of 20 each and fed to the model. To train the model for, say, 5 epochs, the dataset has to be repeated 5 times, so the repeated dataset contains 500 samples in total (100 samples multiplied by 5).
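The arithmetic above can be checked with a small plain-Python sketch. It does not use TensorFlow: the helper `batched` and the constant names are hypothetical, chosen only for illustration, and chaining copies of the sample list merely mimics what repeating a dataset does.

```python
from itertools import chain, repeat

NUM_SAMPLES = 100   # samples in the dataset (numbers from the example above)
BATCH_SIZE = 20     # samples per batch
NUM_EPOCHS = 5      # passes over the dataset


def batched(samples, batch_size):
    """Yield consecutive lists of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


samples = list(range(NUM_SAMPLES))

# One epoch: the whole dataset split into batches.
batches_per_epoch = list(batched(samples, BATCH_SIZE))

# Repeating the dataset NUM_EPOCHS times: the same samples, back to back.
repeated = list(chain.from_iterable(repeat(samples, NUM_EPOCHS)))
all_batches = list(batched(repeated, BATCH_SIZE))

print(len(batches_per_epoch))  # 5 batches per epoch
print(len(repeated))           # 500 samples after repeating
print(len(all_batches))        # 25 batches over 5 epochs
```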

This is the job of the tf.data.Dataset.repeat() method. Usually the number of epochs (num_epochs here) is passed as its count argument.

iterator.get_next() is just a way of getting the next batch of data from the tf.data.Dataset. You iterate over the dataset batch by batch.

That's the difference: tf.data.Dataset.repeat() repeats the samples in the dataset, whereas iterator.get_next() fetches the data one batch at a time.
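To relate this back to the question, here is a minimal sketch that runs the same batched dataset both ways. It assumes TensorFlow 2.x is installed and reaches the TF1-style iterator through tf.compat.v1. The helper `count_batches` and all variable names are hypothetical, and the dataset sizes are the ones from the example in this answer.

```python
import tensorflow as tf

NUM_SAMPLES, BATCH_SIZE, NUM_EPOCHS = 100, 20, 5


def count_batches(sess, next_batch):
    """Run the get_next() tensor until the iterator is exhausted; return the count."""
    n = 0
    while True:
        try:
            sess.run(next_batch)
            n += 1
        except tf.errors.OutOfRangeError:
            return n


graph = tf.Graph()
with graph.as_default():
    base = tf.data.Dataset.range(NUM_SAMPLES).batch(BATCH_SIZE)

    # Option A: repeat inside the pipeline, initialize the iterator once.
    it_repeat = tf.compat.v1.data.make_initializable_iterator(
        base.repeat(NUM_EPOCHS))
    next_repeat = it_repeat.get_next()

    # Option B: no repeat; re-run the initializer at the start of each epoch.
    it_init = tf.compat.v1.data.make_initializable_iterator(base)
    next_init = it_init.get_next()

with tf.compat.v1.Session(graph=graph) as sess:
    sess.run(it_repeat.initializer)
    batches_with_repeat = count_batches(sess, next_repeat)

    batches_per_epoch = []
    for _ in range(NUM_EPOCHS):
        sess.run(it_init.initializer)  # restart from the first element
        batches_per_epoch.append(count_batches(sess, next_init))

print(batches_with_repeat)
print(batches_per_epoch)
```

In this sketch both options deliver the same total number of batches. With repeat() the epoch boundary is hidden inside the pipeline, while with the initializer each epoch ends with an OutOfRangeError, which gives the training loop a natural place for per-epoch work such as validation.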
