Normalizing images passed to torch.transforms.Compose function

Problem Description

How to find the values to pass to the transforms.Normalize function in PyTorch? Also, where exactly in my code should I do the transforms.Normalize?

Since normalizing the dataset is a pretty well-known task, I was hoping there would be some sort of script for doing that automatically. At least I couldn't find it on the PyTorch forum.

transformed_dataset = MothLandmarksDataset(csv_file='moth_gt.csv',
                                           root_dir='.',
                                           transform=transforms.Compose([
                                               Rescale(256),
                                               RandomCrop(224),
                                               transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                                                    std=[0.229, 0.224, 0.225]),
                                               ToTensor()
                                           ]))

for i in range(len(transformed_dataset)):
    sample = transformed_dataset[i]
    print(i, sample['image'].size(), sample['landmarks'].size())
    if i == 3:
       break

I know these current values pertain to ImageNet rather than to my dataset, but using them I actually get an error:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-81-eb8dc46e0284> in <module>
         10 
         11 for i in range(len(transformed_dataset)):
    ---> 12     sample = transformed_dataset[i]
         13 
         14     print(i, sample['image'].size(), sample['landmarks'].size())
    
    <ipython-input-48-9d04158922fb> in __getitem__(self, idx)
         30 
         31         if self.transform:
    ---> 32             sample = self.transform(sample)
         33 
         34         return sample
    
    ~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, img)
         59     def __call__(self, img):
         60         for t in self.transforms:
    ---> 61             img = t(img)
         62         return img
         63 
    
    ~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, tensor)
        210             Tensor: Normalized Tensor image.
        211         """
    --> 212         return F.normalize(tensor, self.mean, self.std, self.inplace)
        213 
        214     def __repr__(self):
    
    ~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/functional.py in normalize(tensor, mean, std, inplace)
        278     """
        279     if not torch.is_tensor(tensor):
    --> 280         raise TypeError('tensor should be a torch tensor. Got {}.'.format(type(tensor)))
        281 
        282     if tensor.ndimension() != 3:
    
    TypeError: tensor should be a torch tensor. Got <class 'dict'>.

So basically three questions:

  1. How do I find values analogous to the ImageNet mean and std for my own custom dataset?
  2. How and where do I pass these values? I assume I should do it in the transforms.Compose method, but I might be wrong.
  3. I assume I should apply normalization to my entire dataset, not just the training set, am I right?

Update:

The solution suggested here did not work for me. The error is:

<class 'dict'>

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-51-e8ba3c8718bb> in <module>
      5 for data in dataloader:
      6     print(type(data))
----> 7     batch_samples = data.size(0)
      8 
      9     data.shape(0)

AttributeError: 'dict' object has no attribute 'size'

This is the print(data) result:

{'image': tensor([[[[0.2961, 0.2941, 0.2941,  ..., 0.2460, 0.2456, 0.2431],
          [0.2953, 0.2977, 0.2980,  ..., 0.2442, 0.2431, 0.2431],
          [0.2941, 0.2941, 0.2980,  ..., 0.2471, 0.2471, 0.2448],
          ...,
          [0.3216, 0.3216, 0.3216,  ..., 0.2482, 0.2471, 0.2471],
          [0.3216, 0.3241, 0.3253,  ..., 0.2471, 0.2471, 0.2450],
          [0.3216, 0.3216, 0.3216,  ..., 0.2471, 0.2452, 0.2431]],

         [[0.2961, 0.2941, 0.2941,  ..., 0.2460, 0.2456, 0.2431],
          [0.2953, 0.2977, 0.2980,  ..., 0.2442, 0.2431, 0.2431],
          [0.2941, 0.2941, 0.2980,  ..., 0.2471, 0.2471, 0.2448],
          ...,
          [0.3216, 0.3216, 0.3216,  ..., 0.2482, 0.2471, 0.2471],
          [0.3216, 0.3241, 0.3253,  ..., 0.2471, 0.2471, 0.2450],
          [0.3216, 0.3216, 0.3216,  ..., 0.2471, 0.2452, 0.2431]],

         [[0.2961, 0.2941, 0.2941,  ..., 0.2460, 0.2456, 0.2431],
          [0.2953, 0.2977, 0.2980,  ..., 0.2442, 0.2431, 0.2431],
          [0.2941, 0.2941, 0.2980,  ..., 0.2471, 0.2471, 0.2448],
          ...,
          [0.3216, 0.3216, 0.3216,  ..., 0.2482, 0.2471, 0.2471],
          [0.3216, 0.3241, 0.3253,  ..., 0.2471, 0.2471, 0.2450],
          [0.3216, 0.3216, 0.3216,  ..., 0.2471, 0.2452, 0.2431]]],


        [[[0.3059, 0.3093, 0.3140,  ..., 0.3373, 0.3363, 0.3345],
          [0.3059, 0.3093, 0.3165,  ..., 0.3412, 0.3389, 0.3373],
          [0.3098, 0.3131, 0.3176,  ..., 0.3450, 0.3412, 0.3412],
          ...,
          [0.2931, 0.2966, 0.2931,  ..., 0.2549, 0.2539, 0.2510],
          [0.2902, 0.2902, 0.2902,  ..., 0.2510, 0.2510, 0.2502],
          [0.2864, 0.2900, 0.2863,  ..., 0.2510, 0.2510, 0.2510]],

         [[0.3059, 0.3093, 0.3140,  ..., 0.3373, 0.3363, 0.3345],
          [0.3059, 0.3093, 0.3165,  ..., 0.3412, 0.3389, 0.3373],
          [0.3098, 0.3131, 0.3176,  ..., 0.3450, 0.3412, 0.3412],
          ...,
          [0.2931, 0.2966, 0.2931,  ..., 0.2549, 0.2539, 0.2510],
          [0.2902, 0.2902, 0.2902,  ..., 0.2510, 0.2510, 0.2502],
          [0.2864, 0.2900, 0.2863,  ..., 0.2510, 0.2510, 0.2510]],

         [[0.3059, 0.3093, 0.3140,  ..., 0.3373, 0.3363, 0.3345],
          [0.3059, 0.3093, 0.3165,  ..., 0.3412, 0.3389, 0.3373],
          [0.3098, 0.3131, 0.3176,  ..., 0.3450, 0.3412, 0.3412],
          ...,
          [0.2931, 0.2966, 0.2931,  ..., 0.2549, 0.2539, 0.2510],
          [0.2902, 0.2902, 0.2902,  ..., 0.2510, 0.2510, 0.2502],
          [0.2864, 0.2900, 0.2863,  ..., 0.2510, 0.2510, 0.2510]]],


        [[[0.2979, 0.2980, 0.3015,  ..., 0.2825, 0.2784, 0.2784],
          [0.2980, 0.2980, 0.2980,  ..., 0.2830, 0.2764, 0.2795],
          [0.2980, 0.2980, 0.3012,  ..., 0.2827, 0.2814, 0.2797],
          ...,
          [0.3282, 0.3293, 0.3294,  ..., 0.2238, 0.2235, 0.2235],
          [0.3255, 0.3255, 0.3255,  ..., 0.2240, 0.2235, 0.2229],
          [0.3225, 0.3255, 0.3255,  ..., 0.2216, 0.2235, 0.2223]],

         [[0.2979, 0.2980, 0.3015,  ..., 0.2825, 0.2784, 0.2784],
          [0.2980, 0.2980, 0.2980,  ..., 0.2830, 0.2764, 0.2795],
          [0.2980, 0.2980, 0.3012,  ..., 0.2827, 0.2814, 0.2797],
          ...,
          [0.3282, 0.3293, 0.3294,  ..., 0.2238, 0.2235, 0.2235],
          [0.3255, 0.3255, 0.3255,  ..., 0.2240, 0.2235, 0.2229],
          [0.3225, 0.3255, 0.3255,  ..., 0.2216, 0.2235, 0.2223]],

         [[0.2979, 0.2980, 0.3015,  ..., 0.2825, 0.2784, 0.2784],
          [0.2980, 0.2980, 0.2980,  ..., 0.2830, 0.2764, 0.2795],
          [0.2980, 0.2980, 0.3012,  ..., 0.2827, 0.2814, 0.2797],
          ...,
          [0.3282, 0.3293, 0.3294,  ..., 0.2238, 0.2235, 0.2235],
          [0.3255, 0.3255, 0.3255,  ..., 0.2240, 0.2235, 0.2229],
          [0.3225, 0.3255, 0.3255,  ..., 0.2216, 0.2235, 0.2223]]]],
       dtype=torch.float64), 'landmarks': tensor([[[160.2964,  98.7339],
         [223.0788,  72.5067],
         [ 82.4163,  70.3733],
         [152.3213, 137.7867]],

        [[198.3194,  74.4341],
         [273.7188, 118.7733],
         [117.7113,  80.8000],
         [182.0750, 107.2533]],

        [[137.4789,  92.8523],
         [174.9463,  40.3467],
         [ 57.3013,  59.1200],
         [129.3375, 131.6533]]], dtype=torch.float64)}

dataloader = DataLoader(transformed_dataset, batch_size=3,
                        shuffle=True, num_workers=4)

transformed_dataset = MothLandmarksDataset(csv_file='moth_gt.csv',
                                           root_dir='.',
                                           transform=transforms.Compose([
                                               Rescale(256),
                                               RandomCrop(224),
                                               ToTensor()#,
                                               ##transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                               ##                     std=[0.229, 0.224, 0.225])
                                           ]))

import os

import numpy as np
import pandas as pd
import torch
from skimage import io
from torch.utils.data import Dataset


class MothLandmarksDataset(Dataset):
    """Moth landmarks dataset."""

    def __init__(self, csv_file, root_dir, transform=None):
        """
        Args:
            csv_file (string): Path to the csv file with annotations.
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.landmarks_frame = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.landmarks_frame)

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()

        img_name = os.path.join(self.root_dir, self.landmarks_frame.iloc[idx, 0])
        image = io.imread(img_name)
        landmarks = self.landmarks_frame.iloc[idx, 1:]
        landmarks = np.array([landmarks])
        landmarks = landmarks.astype('float').reshape(-1, 2)
        sample = {'image': image, 'landmarks': landmarks}

        if self.transform:
            sample = self.transform(sample)

        return sample

Recommended Answer

Error in the source code

How to pass these values and where? I assume I should do it in the transforms.Compose method, but I might be wrong.

In MothLandmarksDataset it is no wonder it is not working, as you are trying to pass a Dict (sample) to torchvision.transforms, which require either torch.Tensor or PIL.Image as input. Here, to be exact:

sample = {'image': image, 'landmarks': landmarks}

if self.transform:
    sample = self.transform(sample)

You could pass sample["image"] into it, although you shouldn't: applying this operation only to sample["image"] would break its relation to landmarks. What you should be after is something like the albumentations library (see here), which can transform image and landmarks in the same way to preserve their relation, as sketched below.
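
A rough sketch of that approach, assuming the landmarks are (x, y) pixel coordinates and the image is an HWC NumPy array (the dummy values below are illustrative, not taken from the question):

import numpy as np
import albumentations as A

# Sketch only: a joint image/keypoint pipeline with albumentations.
# format="xy" tells albumentations the keypoints are (x, y) pixel pairs;
# keypoints falling outside the crop are dropped by default.
transform = A.Compose(
    [
        A.Resize(256, 256),
        A.RandomCrop(224, 224),
    ],
    keypoint_params=A.KeypointParams(format="xy"),
)

image = np.zeros((300, 300, 3), dtype=np.uint8)  # dummy image
landmarks = [(160.3, 98.7), (223.1, 72.5)]       # dummy landmarks

out = transform(image=image, keypoints=landmarks)
image, landmarks = out["image"], out["keypoints"]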

Also, there is no Rescale transform in torchvision; maybe you meant Resize?

The provided code is fine, but you have to unpack your data into torch.Tensor like this:

mean = 0.0
std = 0.0
nb_samples = 0.0
for data in dataloader:
    # Unpack the dict returned by MothLandmarksDataset
    images, landmarks = data["image"], data["landmarks"]
    batch_samples = images.size(0)

    # Flatten each image to (C, H*W) so statistics are computed per channel
    images_data = images.view(batch_samples, images.size(1), -1)
    mean += images_data.mean(2).sum(0)
    std += images_data.std(2).sum(0)
    nb_samples += batch_samples

mean /= nb_samples
std /= nb_samples
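
With 3-channel images, mean and std come out of this loop as tensors of shape (3,). Note that averaging per-image standard deviations only approximates the true per-channel std, but it is generally close enough for normalization. As a sketch, the results can then be plugged straight into the transform:

# Sketch: build the Normalize transform from the statistics computed above
normalize = transforms.Normalize(mean=mean.tolist(), std=std.tolist())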

How to pass these values and where? I assume I should do it in the transforms.Compose method, but I might be wrong.

Those values should be passed to torchvision.transforms.Normalize and applied only to sample["image"], not to sample["landmarks"].
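
Since the custom ToTensor in this pipeline passes the whole dict along, Normalize cannot simply be appended to the Compose. One way to wire it in, as a sketch (NormalizeSample is a hypothetical helper, not part of torchvision; mean and std are the statistics from the loop above), is a small dict-aware wrapper that normalizes only the image and sits after ToTensor:

from torchvision import transforms

class NormalizeSample:
    """Hypothetical helper: normalize sample['image'] only,
    leaving sample['landmarks'] untouched."""

    def __init__(self, mean, std):
        self.normalize = transforms.Normalize(mean=mean, std=std)

    def __call__(self, sample):
        sample['image'] = self.normalize(sample['image'])
        return sample

# NormalizeSample must come after ToTensor so it receives a torch.Tensor
transform = transforms.Compose([
    Rescale(256),
    RandomCrop(224),
    ToTensor(),
    NormalizeSample(mean=mean.tolist(), std=std.tolist()),
])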

I assume I should apply Normalize to my entire dataset, not just the training set, am I right?

You should calculate the normalization values across the training dataset and apply those computed values to validation and test data as well.
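
Concretely, as a sketch building on the hypothetical NormalizeSample above: compute the statistics once on the training split, then hard-wire the same numbers into the validation/test pipelines instead of recomputing them there.

# Statistics computed on the training split only (placeholder values)
train_mean = [0.31, 0.31, 0.31]
train_std = [0.04, 0.04, 0.04]

# The identical numbers are reused for validation/test; RandomCrop is kept
# here to mirror the training pipeline, though a deterministic crop is more
# common for evaluation.
eval_transform = transforms.Compose([
    Rescale(256),
    RandomCrop(224),
    ToTensor(),
    NormalizeSample(mean=train_mean, std=train_std),
])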
