What do the TensorFlow Dataset's functions cache() and prefetch() do?
Question
I am following TensorFlow's Image Segmentation tutorial. It contains the following lines:
train_dataset = train.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()
train_dataset = train_dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
- What does the cache() function do? The official documentation is pretty obscure and self-referential:
Caches the elements in this dataset.
- What does the prefetch() function do? The official documentation is again pretty obscure:
Creates a Dataset that prefetches elements from this dataset.
Answer
The tf.data.Dataset.cache transformation can cache a dataset, either in memory or on local storage. This saves some operations (such as file opening and data reading) from being executed during each epoch; subsequent epochs reuse the data cached by the cache transformation.
You can find more about cache in TensorFlow here.
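A minimal sketch of the idea (assuming TensorFlow 2.x is installed): after cache(), the elements produced by the upstream pipeline are stored on the first pass, and later passes read from the cache instead of re-running the earlier transformations.

```python
import tensorflow as tf

# Toy pipeline; in the tutorial this would be the preprocessed training set,
# where re-running map() each epoch (file reads, decoding) is the costly part.
ds = tf.data.Dataset.range(5).map(lambda x: x * 2)

# cache() stores the elements the first time they are produced; subsequent
# iterations are served from the cache instead of re-executing range()/map().
cached = ds.cache()

# Two passes over the dataset, as in two training epochs.
first_epoch = list(cached.as_numpy_iterator())
second_epoch = list(cached.as_numpy_iterator())
assert first_epoch == second_epoch == [0, 2, 4, 6, 8]
```

Note that cache() with no argument caches in memory; passing a filename (e.g. `ds.cache("/tmp/my_cache")`) caches on local storage instead, which is useful when the dataset does not fit in RAM.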
Prefetch overlaps the preprocessing and model execution of a training step. While the model is executing training step s, the input pipeline is reading the data for step s+1. Doing so reduces the step time to the maximum (as opposed to the sum) of the training time and the time it takes to extract the data.
You can find more about prefetch in TensorFlow here.
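A minimal sketch (again assuming TensorFlow 2.x): prefetch() decouples the producer (the input pipeline) from the consumer (the training loop) via a background buffer, changing when elements are prepared but not what they are.

```python
import tensorflow as tf

ds = tf.data.Dataset.range(4).map(lambda x: x + 1)

# prefetch() keeps up to buffer_size elements ready in a background thread
# while the consumer processes the current one; AUTOTUNE lets tf.data pick
# the buffer size dynamically at runtime.
ds = ds.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)

values = list(ds.as_numpy_iterator())
# prefetch changes *when* elements are produced, not *what* is produced.
assert values == [1, 2, 3, 4]
```

In the tutorial's pipeline, placing prefetch() last means the fully batched training examples are what get buffered, so the GPU never waits on batching or augmentation for the next step.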
Hope this answers your question. Happy Learning.