标准化传递给torch.transforms.Compose函数的图像 [英] Normalizing images passed to torch.transforms.Compose function
问题描述
如何在PyTorch中找到要传递给transforms.Normalize函数的值?另外,在我的代码中,我应该准确地进行转换.进行归一化吗?
How to find the values to pass to the transforms.Normalize function in PyTorch? Also, where in my code, should I exactly do the transforms.Normalize?
由于标准化数据集是一项众所周知的任务,我希望应该有某种脚本可以自动执行此操作.至少我在PyTorch论坛上找不到它.
Since normalizing the dataset is a pretty well-known task, I was hoping there should be some sort of script for doing that automatically. At least I couldn't find it in PyTorch forum.
transformed_dataset = MothLandmarksDataset(csv_file='moth_gt.csv',
root_dir='.',
transform=transforms.Compose([
Rescale(256),
RandomCrop(224),
transforms.Normalize(mean = [ 0.485, 0.456, 0.406 ],
std = [ 0.229, 0.224, 0.225 ]),
ToTensor()
]))
for i in range(len(transformed_dataset)):
sample = transformed_dataset[i]
print(i, sample['image'].size(), sample['landmarks'].size())
if i == 3:
break
我知道这些当前值与我的数据集无关,与ImageNet无关,但是使用它们实际上会出现错误:
I know these current values don't pertain to my dataset and pertain to ImageNet but using them I actually get an error:
TypeError Traceback (most recent call last)
<ipython-input-81-eb8dc46e0284> in <module>
10
11 for i in range(len(transformed_dataset)):
---> 12 sample = transformed_dataset[i]
13
14 print(i, sample['image'].size(), sample['landmarks'].size())
<ipython-input-48-9d04158922fb> in __getitem__(self, idx)
30
31 if self.transform:
---> 32 sample = self.transform(sample)
33
34 return sample
~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, img)
59 def __call__(self, img):
60 for t in self.transforms:
---> 61 img = t(img)
62 return img
63
~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, tensor)
210 Tensor: Normalized Tensor image.
211 """
--> 212 return F.normalize(tensor, self.mean, self.std, self.inplace)
213
214 def __repr__(self):
~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/functional.py in normalize(tensor, mean, std, inplace)
278 """
279 if not torch.is_tensor(tensor):
--> 280 raise TypeError('tensor should be a torch tensor. Got {}.'.format(type(tensor)))
281
282 if tensor.ndimension() != 3:
TypeError: tensor should be a torch tensor. Got <class 'dict'>.
所以基本上是三个问题:
So basically three questions:
- 如何为自己的自定义数据集找到与ImageNet mean和std中相似的值?
- 如何在这些地方传递这些值?我以为我应该在transforms.Compose方法中这样做,但是我可能错了.
- 我想我应该对整个数据集应用归一化,而不仅仅是训练集,对吗?
更新:
在这里尝试提供的解决方案对我不起作用:错误是:
<class 'dict'>
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-51-e8ba3c8718bb> in <module>
5 for data in dataloader:
6 print(type(data))
----> 7 batch_samples = data.size(0)
8
9 data.shape(0)
AttributeError: 'dict' object has no attribute 'size'
这是打印(数据)结果:
this is print(data) result:
{'image': tensor([[[[0.2961, 0.2941, 0.2941, ..., 0.2460, 0.2456, 0.2431],
[0.2953, 0.2977, 0.2980, ..., 0.2442, 0.2431, 0.2431],
[0.2941, 0.2941, 0.2980, ..., 0.2471, 0.2471, 0.2448],
...,
[0.3216, 0.3216, 0.3216, ..., 0.2482, 0.2471, 0.2471],
[0.3216, 0.3241, 0.3253, ..., 0.2471, 0.2471, 0.2450],
[0.3216, 0.3216, 0.3216, ..., 0.2471, 0.2452, 0.2431]],
[[0.2961, 0.2941, 0.2941, ..., 0.2460, 0.2456, 0.2431],
[0.2953, 0.2977, 0.2980, ..., 0.2442, 0.2431, 0.2431],
[0.2941, 0.2941, 0.2980, ..., 0.2471, 0.2471, 0.2448],
...,
[0.3216, 0.3216, 0.3216, ..., 0.2482, 0.2471, 0.2471],
[0.3216, 0.3241, 0.3253, ..., 0.2471, 0.2471, 0.2450],
[0.3216, 0.3216, 0.3216, ..., 0.2471, 0.2452, 0.2431]],
[[0.2961, 0.2941, 0.2941, ..., 0.2460, 0.2456, 0.2431],
[0.2953, 0.2977, 0.2980, ..., 0.2442, 0.2431, 0.2431],
[0.2941, 0.2941, 0.2980, ..., 0.2471, 0.2471, 0.2448],
...,
[0.3216, 0.3216, 0.3216, ..., 0.2482, 0.2471, 0.2471],
[0.3216, 0.3241, 0.3253, ..., 0.2471, 0.2471, 0.2450],
[0.3216, 0.3216, 0.3216, ..., 0.2471, 0.2452, 0.2431]]],
[[[0.3059, 0.3093, 0.3140, ..., 0.3373, 0.3363, 0.3345],
[0.3059, 0.3093, 0.3165, ..., 0.3412, 0.3389, 0.3373],
[0.3098, 0.3131, 0.3176, ..., 0.3450, 0.3412, 0.3412],
...,
[0.2931, 0.2966, 0.2931, ..., 0.2549, 0.2539, 0.2510],
[0.2902, 0.2902, 0.2902, ..., 0.2510, 0.2510, 0.2502],
[0.2864, 0.2900, 0.2863, ..., 0.2510, 0.2510, 0.2510]],
[[0.3059, 0.3093, 0.3140, ..., 0.3373, 0.3363, 0.3345],
[0.3059, 0.3093, 0.3165, ..., 0.3412, 0.3389, 0.3373],
[0.3098, 0.3131, 0.3176, ..., 0.3450, 0.3412, 0.3412],
...,
[0.2931, 0.2966, 0.2931, ..., 0.2549, 0.2539, 0.2510],
[0.2902, 0.2902, 0.2902, ..., 0.2510, 0.2510, 0.2502],
[0.2864, 0.2900, 0.2863, ..., 0.2510, 0.2510, 0.2510]],
[[0.3059, 0.3093, 0.3140, ..., 0.3373, 0.3363, 0.3345],
[0.3059, 0.3093, 0.3165, ..., 0.3412, 0.3389, 0.3373],
[0.3098, 0.3131, 0.3176, ..., 0.3450, 0.3412, 0.3412],
...,
[0.2931, 0.2966, 0.2931, ..., 0.2549, 0.2539, 0.2510],
[0.2902, 0.2902, 0.2902, ..., 0.2510, 0.2510, 0.2502],
[0.2864, 0.2900, 0.2863, ..., 0.2510, 0.2510, 0.2510]]],
[[[0.2979, 0.2980, 0.3015, ..., 0.2825, 0.2784, 0.2784],
[0.2980, 0.2980, 0.2980, ..., 0.2830, 0.2764, 0.2795],
[0.2980, 0.2980, 0.3012, ..., 0.2827, 0.2814, 0.2797],
...,
[0.3282, 0.3293, 0.3294, ..., 0.2238, 0.2235, 0.2235],
[0.3255, 0.3255, 0.3255, ..., 0.2240, 0.2235, 0.2229],
[0.3225, 0.3255, 0.3255, ..., 0.2216, 0.2235, 0.2223]],
[[0.2979, 0.2980, 0.3015, ..., 0.2825, 0.2784, 0.2784],
[0.2980, 0.2980, 0.2980, ..., 0.2830, 0.2764, 0.2795],
[0.2980, 0.2980, 0.3012, ..., 0.2827, 0.2814, 0.2797],
...,
[0.3282, 0.3293, 0.3294, ..., 0.2238, 0.2235, 0.2235],
[0.3255, 0.3255, 0.3255, ..., 0.2240, 0.2235, 0.2229],
[0.3225, 0.3255, 0.3255, ..., 0.2216, 0.2235, 0.2223]],
[[0.2979, 0.2980, 0.3015, ..., 0.2825, 0.2784, 0.2784],
[0.2980, 0.2980, 0.2980, ..., 0.2830, 0.2764, 0.2795],
[0.2980, 0.2980, 0.3012, ..., 0.2827, 0.2814, 0.2797],
...,
[0.3282, 0.3293, 0.3294, ..., 0.2238, 0.2235, 0.2235],
[0.3255, 0.3255, 0.3255, ..., 0.2240, 0.2235, 0.2229],
[0.3225, 0.3255, 0.3255, ..., 0.2216, 0.2235, 0.2223]]]],
dtype=torch.float64), 'landmarks': tensor([[[160.2964, 98.7339],
[223.0788, 72.5067],
[ 82.4163, 70.3733],
[152.3213, 137.7867]],
[[198.3194, 74.4341],
[273.7188, 118.7733],
[117.7113, 80.8000],
[182.0750, 107.2533]],
[[137.4789, 92.8523],
[174.9463, 40.3467],
[ 57.3013, 59.1200],
[129.3375, 131.6533]]], dtype=torch.float64)}
dataloader = DataLoader(transformed_dataset, batch_size=3,
shuffle=True, num_workers=4)
和
transformed_dataset = MothLandmarksDataset(csv_file='moth_gt.csv',
root_dir='.',
transform=transforms.Compose(
[
Rescale(256),
RandomCrop(224),
ToTensor()#,
##transforms.Normalize(mean = [ 0.485, 0.456, 0.406 ],
## std = [ 0.229, 0.224, 0.225 ])
]
)
)
和
class MothLandmarksDataset(Dataset):
"""Face Landmarks dataset."""
def __init__(self, csv_file, root_dir, transform=None):
"""
Args:
csv_file (string): Path to the csv file with annotations.
root_dir (string): Directory with all the images.
transform (callable, optional): Optional transform to be applied
on a sample.
"""
self.landmarks_frame = pd.read_csv(csv_file)
self.root_dir = root_dir
self.transform = transform
def __len__(self):
return len(self.landmarks_frame)
def __getitem__(self, idx):
if torch.is_tensor(idx):
idx = idx.tolist()
img_name = os.path.join(self.root_dir, self.landmarks_frame.iloc[idx, 0])
image = io.imread(img_name)
landmarks = self.landmarks_frame.iloc[idx, 1:]
landmarks = np.array([landmarks])
landmarks = landmarks.astype('float').reshape(-1, 2)
sample = {'image': image, 'landmarks': landmarks}
if self.transform:
sample = self.transform(sample)
return sample
推荐答案
源代码错误
如何传递这些值以及在哪里传递?我想我应该在transforms.Compose方法,但我可能是错的.
How to pass these values and where? I assume I should do it in transforms.Compose method but I might be wrong.
在 MothLandmarksDataset
中,当您尝试将 Dict
( sample
)传递到 torchvision时,这也就不足为奇了.转换
,需要 torch.Tensor
或 PIL.Image
作为输入.确切地说是这样:
In MothLandmarksDataset
it is no wonder it is not working as you are trying to pass Dict
(sample
) to torchvision.transforms
which require either torch.Tensor
or PIL.Image
as input. here to be exact:
sample = {'image': image, 'landmarks': landmarks}
if self.transform:
sample = self.transform(sample)
您可以将样本[图像"]
传递到其中,尽管不应该.仅将此操作应用于 sample ["image"]
会破坏其与 landmarks
的关系.您应该追求的是类似 albumentations
库(请参见此处),可以用相同的方式转换 image
和 landmarks
以保留它们之间的关系.
You could pass sample["image"]
into it although you shouldn't. Applying this operation only to sample["image"]
would break its relation to landmarks
. What you should be after is something like albumentations
library (see here) which can transform image
and landmarks
in the same way to preserve their relations.
在 torchvision
中也没有 Rescale
转换,也许您的意思是
Also there is no Rescale
transform in torchvision
, maybe you meant Resize?
提供的代码很好,,但是您必须像这样将数据解压缩到 torch.Tensor
中.
Provided code is fine, but you have to unpack your data into torch.Tensor
like this:
mean = 0.0
std = 0.0
nb_samples = 0.0
for data in dataloader:
images, landmarks = data["image"], data["landmarks"]
batch_samples = images.size(0)
images_data = images.view(batch_samples, images.size(1), -1)
mean += images_data.mean(2).sum(0)
std += images_data.std(2).sum(0)
nb_samples += batch_samples
mean /= nb_samples
std /= nb_samples
如何在哪里传递这些值?我想我应该在transforms.Compose方法,但我可能是错的.
How to pass these values and where? I assume I should do it in transforms.Compose method but I might be wrong.
这些值应传递给 torchvision.transforms.Normalize
仅应用于 sample ["images"]
,而不应用于 sample ["landmarks"]
.
Those values should be passed to torchvision.transforms.Normalize
applied only to sample["images"]
, not to sample["landmarks"]
.
我认为我应该将Normalize应用于我的整个数据集,而不仅仅是训练集,对吗?
I assume I should apply Normalize to my entire dataset not just the training set, am I right?
您应该在整个训练数据集中计算归一化值,并将这些计算出的值也应用于验证和测试.
You should calculate normalization values across training dataset and apply those calculated values to validation and test as well.
这篇关于标准化传递给torch.transforms.Compose函数的图像的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!