steps_per_epoch and validation_steps for infinite Dataset in Keras Model


Question

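I have a huge dataset of CSV files, about 200 GB in total, and I don't know the total number of records in it. I'm using make_csv_dataset to create a PrefetchDataset generator.

I run into a problem when TensorFlow complains that steps_per_epoch and validation_steps must be specified for an infinite dataset.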

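  1. How can I specify steps_per_epoch and validation_steps?

  2. Can I pass these parameters as a percentage of the total dataset size?

  3. Can I somehow avoid these parameters, since I want the whole dataset to be iterated in every epoch?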

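I think this SO thread answers the case where the total number of data records is known in advance.

Here is a screenshot from the documentation, but I don't fully understand it.

What does the last line mean?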

Solution

I see no other option than iterating through your entire dataset once and counting the batches.

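import tensorflow as tf

# num_epochs=1 makes the dataset finite, so the loop below terminates.
ds = tf.data.experimental.make_csv_dataset('myfile.csv', batch_size=16, num_epochs=1)

# Count the batches by walking through the whole dataset once.
ix = 0  # stays 0 if the dataset is empty
for ix, _ in enumerate(ds, 1):
    pass

print('The total number of steps is', ix)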

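Don't forget the num_epochs argument. With its default value (None), make_csv_dataset repeats the data forever, so the counting loop would never finish.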
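Once the total number of batches is known, turning it into steps_per_epoch and validation_steps is plain arithmetic. Here is a minimal sketch of that step, not part of the original answer: the helper name split_steps and the 20% validation share are assumptions chosen for illustration, and the sketch uses no TensorFlow at all.

```python
import math


def split_steps(total_steps, val_fraction=0.2):
    """Split a known number of batches into training and validation steps.

    total_steps:  number of batches counted by iterating the dataset once.
    val_fraction: share of batches reserved for validation (assumed value).
    """
    if total_steps < 2:
        raise ValueError("need at least two batches to split")
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    # Round the validation share down, but keep at least one batch on each side.
    val_steps = min(max(1, math.floor(total_steps * val_fraction)), total_steps - 1)
    train_steps = total_steps - val_steps
    return train_steps, val_steps


# Example: 1000 batches were counted and 20% of them are held out.
steps_per_epoch, validation_steps = split_steps(1000, val_fraction=0.2)
print(steps_per_epoch, validation_steps)
```

The two numbers would then be passed to model.fit as steps_per_epoch and validation_steps. This also covers the percentage question: Keras expects integer step counts, so a percentage has to be converted to steps first, which requires knowing the total. As for avoiding the parameters altogether, my understanding is that Keras runs each epoch until the dataset is exhausted when it is given a finite dataset (for example one built with num_epochs=1) and steps_per_epoch is left unset, but I have not verified this on a 200 GB input.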

