What does BUFFER_SIZE do in Tensorflow Dataset shuffling?


Problem description

So I've been playing around with this code: https://www.tensorflow.org/tutorials/generative/dcgan and have developed a fairly good idea of how it works. However, I can't quite figure out what the BUFFER_SIZE variable is for. I suspect it may be used to create a subset of the dataset of size BUFFER_SIZE, with batches then taken from that subset, but I don't see the point of that and can't find anyone explaining it.

So, if someone could explain to me what BUFFER_SIZE does, I would be thankful ❤

Answer

It's used as the buffer_size argument in tf.data.Dataset.shuffle. Have you read the docs?
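For context, here is a minimal sketch of how BUFFER_SIZE feeds into the input pipeline in that style of tutorial. The random train_images tensor is just a stand-in for the preprocessed MNIST images, and the constant values are assumptions, not taken verbatim from the tutorial:

```python
import tensorflow as tf

BUFFER_SIZE = 60000   # assumption: sized to the MNIST training set
BATCH_SIZE = 256      # assumption: a typical batch size

# Stand-in for the preprocessed training images.
train_images = tf.random.normal([BUFFER_SIZE, 28, 28, 1])

# Shuffle with a buffer that covers the whole dataset, then batch.
train_dataset = (tf.data.Dataset.from_tensor_slices(train_images)
                 .shuffle(BUFFER_SIZE)
                 .batch(BATCH_SIZE))
```

Because the shuffle buffer here is as large as the dataset, every epoch is a full, uniform shuffle.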

This dataset fills a buffer with buffer_size elements, then randomly samples elements from this buffer, replacing the selected elements with new elements. For perfect shuffling, a buffer size greater than or equal to the full size of the dataset is required.

For instance, if your dataset contains 10,000 elements but buffer_size is set to 1,000, then shuffle will initially select a random element from only the first 1,000 elements in the buffer. Once an element is selected, its space in the buffer is replaced by the next (i.e. 1,001-st) element, maintaining the 1,000 element buffer.
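A small, illustrative sketch of that behaviour (the datasets and buffer sizes are hypothetical, chosen only to make the effect visible):

```python
import tensorflow as tf

ds = tf.data.Dataset.range(10)

# buffer_size=1: the buffer holds a single element, so nothing is shuffled.
print(list(ds.shuffle(1).as_numpy_iterator()))   # [0, 1, ..., 9]

# buffer_size=3: the first element is drawn from {0, 1, 2} only,
# so early elements tend to stay near the front.
print(list(ds.shuffle(3).as_numpy_iterator()))

# buffer_size >= dataset size: a uniform ("perfect") shuffle.
print(list(ds.shuffle(10).as_numpy_iterator()))
```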
