How do you alter the size of a PyTorch Dataset?

Question

Say I am loading MNIST from torchvision.datasets.MNIST, but I only want to load 10,000 images in total. How would I slice the data to limit it to a given number of data points? I understand that the DataLoader is a generator that yields data in batches of the specified batch size, but how do you slice datasets?

from torch.utils.data import DataLoader
from torchvision import datasets

# transform, args and kwargs are defined elsewhere in the asker's script.
tr = datasets.MNIST('../data', train=True, download=True, transform=transform)
te = datasets.MNIST('../data', train=False, transform=transform)
train_loader = DataLoader(tr, batch_size=args.batch_size, shuffle=True, num_workers=4, **kwargs)
test_loader = DataLoader(te, batch_size=args.batch_size, shuffle=True, num_workers=4, **kwargs)

Recommended answer

It is important to note that when you create the DataLoader object, it doesn't immediately load all of your data (that would be impractical for large datasets). It gives you an iterator that you can use to access the data one batch at a time.

Unfortunately, DataLoader has no argument that directly caps the number of samples you extract. You will have to use the typical ways of slicing iterators.

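The simplest thing to do (without any libraries) is to stop once the required number of samples has been reached. Each iteration yields a whole batch, so count samples rather than iterations.

nsamples = 10000
seen = 0  # samples processed so far; the loader yields batches, not single samples
for image, label in train_loader:
    if seen >= nsamples:
        break
    seen += image.size(0)

    # Your training code here.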
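The stop-early idea can also be wrapped in a small helper that trims the final batch so the total is exact. This is only a sketch and not part of the original answer: `take_samples` is a hypothetical name, and the stand-in loader below is a plain list of `(inputs, targets)` batches used so the example runs without PyTorch. It relies only on `len()` and slicing, which should behave the same way on tensors from a real DataLoader.

```python
def take_samples(loader, nsamples):
    """Yield (inputs, targets) batches until nsamples samples have been yielded.

    Hypothetical helper: the last batch is sliced so the total is exactly
    nsamples (or fewer, if the loader runs out first).
    """
    remaining = nsamples
    for inputs, targets in loader:
        if remaining <= 0:
            break
        if len(inputs) > remaining:
            # Trim the final batch instead of overshooting the limit.
            inputs, targets = inputs[:remaining], targets[:remaining]
        remaining -= len(inputs)
        yield inputs, targets


# Stand-in for a DataLoader: 5 batches of 4 samples each (assumed data).
fake_loader = [(list(range(i, i + 4)), [0] * 4) for i in range(0, 20, 4)]

batch_sizes = [len(inputs) for inputs, targets in take_samples(fake_loader, 10)]
```

With a batch size of 4 and a limit of 10, the helper yields two full batches and one trimmed batch of 2.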

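Alternatively, you could use itertools.islice to stop after a fixed number of batches. islice takes its stop value positionally (not as a keyword), and it counts batches, so divide the 10k samples by the batch size. Floor division keeps the total at or below 10,000; round up instead if you need at least that many.

import itertools

nbatches = 10000 // args.batch_size  # islice counts batches, not samples
for image, label in itertools.islice(train_loader, nbatches):

    # your training code here.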
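A different route, not covered in the answer above, is to shrink the dataset itself before handing it to the DataLoader, using `torch.utils.data.Subset`. The sketch below assumes a small synthetic `TensorDataset` as a stand-in for MNIST so that it runs without downloading anything; the sizes (100 samples, keep 25, batch size 10) are arbitrary illustration values.

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

# Stand-in for MNIST (assumed data): 100 blank 1x28x28 images with labels 0-9.
full = TensorDataset(torch.zeros(100, 1, 28, 28), torch.arange(100) % 10)

# Keep only the first 25 samples. Subset wraps the dataset and an index list;
# it does not copy the underlying data.
small = Subset(full, range(25))

loader = DataLoader(small, batch_size=10, shuffle=True)

# Every epoch over this loader now sees exactly len(small) samples.
batch_sizes = [images.size(0) for images, labels in loader]
total = sum(batch_sizes)
```

For the question's setup, the same pattern would presumably be `Subset(tr, range(10000))` passed to the DataLoader in place of `tr`. Taking the first N indices is the simplest choice; a random selection of indices could be used instead if the subset should be a random sample.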
