How to split a TensorFlow dataset into train, test and validation sets in a Python script?


Question

In a Jupyter notebook with TensorFlow 2.0.0, I performed an 80-10-10 train-validation-test split in this way:

import tensorflow_datasets as tfds
from os import getcwd
splits = tfds.Split.ALL.subsplit(weighted=(80, 10, 10))

filePath = f"{getcwd()}/../tmp2/"
splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True, split=splits, data_dir=filePath)

However, when I run the same code locally, I get this error:

AttributeError: type object 'Split' has no attribute 'ALL'

I have seen that I can create two sets in this way:

splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True, split=['train[:80]','test[80:90]'], data_dir=filePath)

but I do not know how to add a third set.

Recommended answer

tfds.Split.ALL.subsplit and tfds.Split.TRAIN.subsplit have apparently been deprecated and are no longer supported.

Some datasets are already split into train and test. In that case, I found the following solution (using the Fashion MNIST dataset as an example):

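# Percent slices ('[:80%]') are needed here: a bare 'train+test[:80]' would
# select all of train plus only the first 80 test examples.
splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True,
                         split=['train[:80%]+test[:80%]',
                                'train[80%:90%]+test[80%:90%]',
                                'train[90%:]+test[90%:]'],
                         data_dir=filePath)
(train_examples, validation_examples, test_examples) = splits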
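
If you need other ratios, or a dataset with different predefined splits, the split strings can be generated rather than typed by hand. The following is only a sketch: percent_split_specs is a hypothetical helper name, not part of tensorflow_datasets. It relies on the TFDS split-string syntax, in which 'name[a%:b%]' takes a percentage slice and '+' merges slices.

```python
def percent_split_specs(split_names, cut_points):
    """Build split strings that slice every named split at the same percentages.

    split_names: predefined splits to combine, e.g. ["train", "test"]
    cut_points:  strictly increasing percentages between 0 and 100, e.g. [80, 90]
    (Hypothetical helper for illustration; not a tensorflow_datasets function.)
    """
    cut_points = list(cut_points)
    if cut_points != sorted(set(cut_points)) or not all(0 < c < 100 for c in cut_points):
        raise ValueError("cut_points must be strictly increasing values between 0 and 100")

    # None marks an open slice end, so the first piece starts at the beginning
    # of each split and the last piece runs to its end.
    bounds = [None, *cut_points, None]
    specs = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        lo = "" if start is None else f"{start}%"
        hi = "" if stop is None else f"{stop}%"
        # Take the same percentage range from every split and merge with '+'.
        specs.append("+".join(f"{name}[{lo}:{hi}]" for name in split_names))
    return specs


specs = percent_split_specs(["train", "test"], [80, 90])
print(specs)
```

The resulting list can be passed as the split argument of tfds.load, which should return one dataset per entry in the same order.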
