Data Augmentation in PyTorch


Problem Description



I am a little bit confused about the data augmentation performed in PyTorch. Now, as far as I know, when we are performing data augmentation, we are KEEPING our original dataset, and then adding other versions of it (Flipping, Cropping...etc). But that doesn't seem like happening in PyTorch. As far as I understood from the references, when we use data.transforms in PyTorch, then it applies them one by one. So for example:

data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

Here, for the training, we are first randomly cropping the image and resizing it to shape (224,224). Then we are taking these (224,224) images and horizontally flipping them. Therefore, our dataset now contains ONLY the horizontally flipped images, so our original images are lost in this case.

Am I right? Is this understanding correct? If not, then where do we tell PyTorch in this code above (taken from Official Documentation) to keep the original images and resize them to the expected shape (224,224)?

Thanks

Solution

The transforms operations are applied to your original images at every batch generation. So your dataset is left unchanged, only the batch images are copied and transformed every iteration.
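This on-the-fly behavior can be sketched in plain Python, without torchvision. The `FlipDataset` class and its attribute names below are illustrative only, not a real PyTorch API; the point is that the stored data is never modified, and the random transform is decided anew on every access:

```python
import random

class FlipDataset:
    """Minimal sketch of a map-style dataset that applies a random
    horizontal flip each time an item is fetched, mimicking how
    torchvision transforms run at batch generation time."""

    def __init__(self, images):
        self.images = images  # the stored data is never modified

    def __getitem__(self, idx):
        row = self.images[idx]
        # 50/50 chance to flip, decided anew on every access
        if random.random() < 0.5:
            return row[::-1]
        return row

data = [[1, 2, 3], [4, 5, 6]]
ds = FlipDataset(data)

# Simulate fetching the same sample over 100 "epochs":
samples = [ds[0] for _ in range(100)]

# The original is untouched; each fetched copy is either flipped or not.
assert ds.images == [[1, 2, 3], [4, 5, 6]]
assert all(s in ([1, 2, 3], [3, 2, 1]) for s in samples)
```

Each epoch, the model therefore sees a freshly transformed copy of every image, which is how augmentation enlarges the effective dataset without ever duplicating it on disk.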

The confusion may come from the fact that often, like in your example, transforms are used both for data preparation (resizing/cropping to expected dimensions, normalizing values, etc.) and for data augmentation (randomizing the resizing/cropping, randomly flipping the images, etc.).


What your data_transforms['train'] does is:

  • Randomly resize the provided image and randomly crop it to obtain a (224, 224) patch
  • Apply or not a random horizontal flip to this patch, with a 50/50 chance
  • Convert it to a Tensor
  • Normalize the resulting Tensor, given the mean and deviation values you provided
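For the Normalize step in particular, each channel value x is mapped to (x - mean) / std. A quick pure-Python check using the ImageNet statistics from the snippet above (the `normalize_pixel` helper is illustrative, not a torchvision function):

```python
# ImageNet per-channel statistics, as passed to transforms.Normalize above.
means = [0.485, 0.456, 0.406]
stds = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Apply per-channel normalization (x - mean) / std to one RGB pixel,
    with values already scaled to [0, 1] as ToTensor produces."""
    return [(x - m) / s for x, m, s in zip(rgb, means, stds)]

# A mid-gray pixel after ToTensor:
print([round(v, 4) for v in normalize_pixel([0.5, 0.5, 0.5])])
# → [0.0655, 0.1964, 0.4178]
```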

What your data_transforms['val'] does is:

  • Resize your image to (256, 256)
  • Center crop the resized image to obtain a (224, 224) patch
  • Convert it to a Tensor
  • Normalize the resulting Tensor, given the mean and deviation values you provided
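The coordinates of that deterministic center crop are simple arithmetic. A small sketch (the `center_crop_box` helper is illustrative; torchvision computes this internally):

```python
def center_crop_box(height, width, crop):
    """Return the (top, left) corner of a centered crop window,
    as CenterCrop effectively computes for even-sized inputs."""
    top = (height - crop) // 2
    left = (width - crop) // 2
    return top, left

# For the 'val' pipeline: resize to (256, 256), then center-crop 224.
print(center_crop_box(256, 256, 224))  # → (16, 16)
```

So the validation crop always discards the same 16-pixel border on each side, which is why every epoch sees an identical validation image.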

(i.e. the random resizing/cropping for the training data is replaced by a fixed operation for the validation one, to have reliable validation results)


If you don't want your training images to be horizontally flipped with a 50/50 chance, just remove the transforms.RandomHorizontalFlip() line.

Similarly, if you want your images to always be center-cropped, replace transforms.RandomResizedCrop by transforms.Resize and transforms.CenterCrop, as done for data_transforms['val'].
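Putting both changes together, a fully deterministic training pipeline would look like the following configuration fragment (a sketch assuming the standard torchvision import; it simply mirrors data_transforms['val'] from the question and is not executed here):

```python
from torchvision import transforms

# Train pipeline with augmentation removed: fixed resize + center crop
# instead of RandomResizedCrop, and no RandomHorizontalFlip.
deterministic_train = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
```

With this pipeline, training and validation images go through identical preprocessing, and every epoch sees exactly the same tensors.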
