How to use different data augmentation for Subsets in PyTorch
Problem description
How to use different data augmentation (transforms) for different Subsets in PyTorch?
For instance:
train, test = torch.utils.data.random_split(dataset, [80000, 2000])
train and test will have the same transforms as dataset. How to use custom transforms for these subsets?
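The reason the two subsets share their transforms can be sketched as follows. This is a minimal illustration, assuming a hypothetical ToyDataset class (not part of the original question) that stands in for an image dataset with a transform attribute. It shows that both objects returned by random_split keep a reference to the same parent dataset.

```python
import torch
from torch.utils.data import Dataset, random_split


class ToyDataset(Dataset):
    """Hypothetical stand-in for an image dataset with a `transform` attribute."""

    def __init__(self, n, transform=None):
        self.data = torch.arange(n, dtype=torch.float32)
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        x = self.data[idx]
        return self.transform(x) if self.transform is not None else x


dataset = ToyDataset(10)
train, test = random_split(dataset, [8, 2])

# Both subsets hold a reference to the very same parent object.
shares_parent = train.dataset is test.dataset is dataset

# Setting a transform through one subset therefore changes the other as well.
train.dataset.transform = lambda x: x * 2
test_sees_train_transform = test.dataset.transform is train.dataset.transform
```

Since a Subset only stores the parent dataset and a list of indices, any per-split transform has to live somewhere other than the shared parent.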
Recommended answer
My current solution is not very elegant, but works:
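from copy import copy

from torch.utils.data import random_split
from torchvision import transforms

# full_dataset, train_size, test_size and img_resolution are defined elsewhere.
train_dataset, test_dataset = random_split(full_dataset, [train_size, test_size])

# Give the training split its own shallow copy of the dataset, so that the two
# splits no longer share a single transform attribute.
train_dataset.dataset = copy(full_dataset)

# Deterministic preprocessing for the test split.
test_dataset.dataset.transform = transforms.Compose([
    transforms.Resize(img_resolution),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

# Random augmentation for the training split.
train_dataset.dataset.transform = transforms.Compose([
    transforms.RandomResizedCrop(img_resolution[0]),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])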
Basically, I'm defining a new dataset (which is a copy of the original dataset) for one of the splits, and then I define a custom transform for each split.
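Why a plain copy is enough here can be illustrated with a small pure-Python sketch. The ToyDataset class below is hypothetical and only mimics a dataset object that stores its samples and a transform as attributes. The point is that copy makes a shallow copy: the sample storage is shared, while reassigning transform on the copy leaves the original untouched.

```python
from copy import copy


class ToyDataset:
    """Hypothetical dataset-like object with `samples` and `transform` attributes."""

    def __init__(self, samples, transform=None):
        self.samples = samples
        self.transform = transform


original = ToyDataset(samples=[1, 2, 3])
duplicate = copy(original)

# Rebinding the attribute on the copy does not affect the original object.
duplicate.transform = str

# The (potentially large) sample list is shared, not duplicated.
shares_samples = duplicate.samples is original.samples
original_untouched = original.transform is None
```

So the copy costs very little memory, but any in-place mutation of the shared attributes would still be visible through both objects.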
Note: train_dataset.dataset.transform works since I'm using an ImageFolder dataset, which uses the .transform attribute to perform the transforms.
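For datasets that do not expose a .transform attribute, one possible alternative (not the answer's method) is to wrap each split in a small dataset that applies its own transform. The sketch below assumes a hypothetical TransformedSubset class and uses a TensorDataset with simple lambdas as stand-ins for real images and real augmentation pipelines. It also assumes the base dataset is created without a transform of its own, so that raw samples reach the wrapper.

```python
import torch
from torch.utils.data import Dataset, TensorDataset, random_split


class TransformedSubset(Dataset):
    """Hypothetical wrapper that applies its own transform to items of a split."""

    def __init__(self, subset, transform=None):
        self.subset = subset
        self.transform = transform

    def __len__(self):
        return len(self.subset)

    def __getitem__(self, idx):
        x, y = self.subset[idx]
        if self.transform is not None:
            x = self.transform(x)
        return x, y


# Stand-in base dataset: 10 one-element "images" with integer labels.
base = TensorDataset(
    torch.arange(10, dtype=torch.float32).unsqueeze(1),
    torch.arange(10),
)
train_split, test_split = random_split(base, [8, 2])

# Stand-ins for an augmentation pipeline and a deterministic pipeline.
train_set = TransformedSubset(train_split, transform=lambda x: x + 100.0)
test_set = TransformedSubset(test_split, transform=lambda x: x)
```

With this design the parent dataset is never copied or modified, and each wrapper can be passed to a DataLoader like any other dataset.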
If anybody knows a better solution, please share with us!