steps_per_epoch and validation_steps for infinite Dataset in Keras Model
Problem description
I have a huge dataset of CSV files, around 200 GB in total. I don't know the total number of records in the dataset. I'm using make_csv_dataset to create a PrefetchDataset generator.
I'm running into a problem where TensorFlow complains that steps_per_epoch and validation_steps must be specified for an infinite dataset.
- How can I specify steps_per_epoch and validation_steps?
- Can I pass these parameters as a percentage of the total dataset size?
- Can I somehow avoid these parameters, since I want the whole dataset to be iterated in each epoch?
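For context, steps_per_epoch is just the number of batches drawn per epoch, so if the record count were known it would be a simple ceiling division. A minimal sketch with made-up numbers (num_records is an assumption, not a value from the actual dataset):

```python
import math

# Hypothetical sizes -- with a 200 GB CSV dataset the real record
# count is unknown, which is exactly the problem in the question.
num_records = 1_000_000   # assumed total number of records
batch_size = 16

# One step == one batch; a final partial batch still counts as a step.
steps_per_epoch = math.ceil(num_records / batch_size)
print(steps_per_epoch)  # 62500
```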
I think this SO thread answers the case where the total number of data records is known in advance.
Here is a screenshot from the documentation, but I don't fully understand it. What does the last line mean?
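As I understand it, the dataset is "infinite" because make_csv_dataset repeats the file indefinitely when num_epochs is left at its default of None, so TensorFlow cannot know where an epoch ends. A plain-Python analogy of the difference (itertools.cycle stands in for the repeating dataset, the list for a single pass over the file):

```python
from itertools import cycle, islice

records = ['row1', 'row2', 'row3']   # stand-in for one pass over the CSV

finite = iter(records)               # like num_epochs=1: exhausts after one pass
print(sum(1 for _ in finite))        # 3

infinite = cycle(records)            # like num_epochs=None: never exhausts
first_ten = list(islice(infinite, 10))
print(len(first_ten))                # 10 -- we must cap it ourselves, as steps_per_epoch does
```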
I see no other option than iterating through your entire dataset.
ds = tf.data.experimental.make_csv_dataset('myfile.csv', batch_size=16, num_epochs=1)
for ix, _ in enumerate(ds, 1):
    pass
print('The total number of steps is', ix)
Don't forget the num_epochs argument.
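Once the total step count is known, the percentage idea from the question can be applied to it directly. A minimal sketch (the total_steps value and the 80/20 split are assumptions for illustration):

```python
# Suppose the counting loop above reported this many batches.
total_steps = 12500          # assumed result of the counting pass
val_fraction = 0.2           # assumed 20% of steps for validation

validation_steps = int(total_steps * val_fraction)
steps_per_epoch = total_steps - validation_steps
print(steps_per_epoch, validation_steps)  # 10000 2500
```

These two values would then be handed to model.fit via its steps_per_epoch and validation_steps arguments.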