What does BUFFER_SIZE do in TensorFlow Dataset shuffling?
Question
I've been playing around with the code from https://www.tensorflow.org/tutorials/generative/dcgan and have a fairly good idea of how it works. However, I can't work out what the BUFFER_SIZE variable is for. I suspect it is used to create a subset of the dataset of size BUFFER_SIZE from which the batches are then taken, but I don't see the point of that, and I can't find anyone explaining it.
So, if someone could explain to me what BUFFER_SIZE does, I would be thankful ❤
Recommended answer
It's used as the buffer_size argument in tf.data.Dataset.shuffle. Have you read the docs?
This dataset fills a buffer with buffer_size elements, then randomly samples elements from this buffer, replacing the selected elements with new elements. For perfect shuffling, a buffer size greater than or equal to the full size of the dataset is required.
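To make the mechanism concrete, here is a minimal pure-Python sketch of the buffer behaviour the docs describe. It is not TensorFlow's implementation: the function name shuffle_with_buffer and its seed parameter are made up for illustration, and it only mimics the fill, sample, and replace steps.

```python
import random
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def shuffle_with_buffer(
    items: Iterable[T], buffer_size: int, seed: Optional[int] = None
) -> Iterator[T]:
    """Hypothetical helper that imitates a fixed-size shuffle buffer."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and put the
        # incoming element into the slot it occupied.
        idx = rng.randrange(len(buffer))
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain what is left in random order.
    while buffer:
        idx = rng.randrange(len(buffer))
        buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
        yield buffer.pop()


# A one-slot buffer cannot reorder anything.
print(list(shuffle_with_buffer(range(10), buffer_size=1)))
# A buffer at least as large as the input can produce any ordering.
print(list(shuffle_with_buffer(range(10), buffer_size=10, seed=0)))
```

With a buffer smaller than the input, an element can only move a limited distance towards the front, which is why the docs ask for a buffer at least as large as the dataset for a perfect shuffle.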
For instance, if your dataset contains 10,000 elements but buffer_size is set to 1,000, then shuffle will initially select a random element from only the first 1,000 elements, since those are the ones in the buffer. Once an element is selected, its space in the buffer is filled by the next (i.e. 1,001st) element, maintaining the 1,000-element buffer.
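The same numbers can be tried directly with tf.data. This is only a sketch, assuming TensorFlow 2.x with eager execution; the constant names DATASET_SIZE and BUFFER_SIZE and the seed value are chosen here for illustration and are not taken from the tutorial.

```python
import tensorflow as tf

DATASET_SIZE = 10_000  # illustrative size, matching the example above
BUFFER_SIZE = 1_000

# Elements 0..9999 in order, shuffled through a 1,000-element buffer.
dataset = tf.data.Dataset.range(DATASET_SIZE).shuffle(BUFFER_SIZE, seed=0)

# The first element emitted can only come from the first 1,000 elements,
# because those are the only ones in the buffer at that point.
first = int(next(iter(dataset)).numpy())
print(first, first < BUFFER_SIZE)

# One full pass still yields every element exactly once.
values = [int(x.numpy()) for x in dataset]
print(len(values), len(set(values)))
```

Setting BUFFER_SIZE to DATASET_SIZE or more removes the restriction on the first element, at the cost of holding the whole dataset in the buffer.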