How to extract feature vector from single image in Pytorch?


Problem description


I am attempting to understand more about computer vision models, and I'm trying to do some exploring of how they work. In an attempt to understand how to interpret feature vectors more I'm trying to use Pytorch to extract a feature vector. Below is my code that I've pieced together from various places.

import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as transforms
from torch.autograd import Variable
from PIL import Image



img=Image.open("Documents/01235.png")

# Load the pretrained model
model = models.resnet18(pretrained=True)

# Use the model object to select the desired layer
layer = model._modules.get('avgpool')

# Set model to evaluation mode
model.eval()

transforms = torchvision.transforms.Compose([
        torchvision.transforms.Resize(256),
        torchvision.transforms.CenterCrop(224),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    
def get_vector(image_name):
    # Load the image with Pillow library
    img = Image.open("Documents/Documents/Driven Data Competitions/Hateful Memes Identification/data/01235.png")
    # Create a PyTorch Variable with the transformed image
    t_img = transforms(img)
    # Create a vector of zeros that will hold our feature vector
    # The 'avgpool' layer has an output size of 512
    my_embedding = torch.zeros(512)
    # Define a function that will copy the output of a layer
    def copy_data(m, i, o):
        my_embedding.copy_(o.data)
    # Attach that function to our selected layer
    h = layer.register_forward_hook(copy_data)
    # Run the model on our transformed image
    model(t_img)
    # Detach our copy function from the layer
    h.remove()
    # Return the feature vector
    return my_embedding

pic_vector = get_vector(img)


When I do this I get the following error:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 3-dimensional input of size [3, 224, 224] instead


I'm sure this is an elementary error, but I can't seem to figure out how to fix this. It was my impression that the "totensor" transformation would make my data 4-d, but it seems it's either not working correctly or I'm misunderstanding it. Appreciate any help or resources I can use to learn more about this!

Answer


All the default nn.Modules in PyTorch expect an additional batch dimension. If the input to a module is shape (B, ...) then the output will be (B, ...) as well (though the later dimensions may change depending on the layer). This behavior allows efficient inference on batches of B inputs simultaneously. To make your code conform you can just unsqueeze an additional unitary dimension onto the front of the t_img tensor before sending it into your model to make it a (1, ...) tensor. You will also need to flatten the output of layer before storing it if you want to copy it into your one-dimensional my_embedding tensor.
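As a quick illustration of the shapes involved (this uses dummy tensors and is not part of the original code, just a sketch of what unsqueeze and flatten do here):

import torch

t_img = torch.rand(3, 224, 224)            # transformed image, no batch dimension
print(t_img.unsqueeze(0).shape)            # torch.Size([1, 3, 224, 224]) -- what the model expects

hook_output = torch.rand(1, 512, 1, 1)     # shape the 'avgpool' hook receives for a single image
print(hook_output.flatten().shape)         # torch.Size([512]) -- matches my_embedding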

A couple of other things:


  • You should infer within a torch.no_grad() context to avoid computing gradients since you won't be needing them (note that model.eval() just changes the behavior of certain layers like dropout and batch normalization, it doesn't disable construction of the computation graph, but torch.no_grad() does); see the short check after this list.


  • I assume this is just a copy-paste issue, but transforms is the name of an imported module as well as a global variable.


  • o.data is just returning a copy of o. In the old Variable interface (circa PyTorch 0.3.1 and earlier) this used to be necessary, but the Variable interface was deprecated way back in PyTorch 0.4.0 and no longer does anything useful; now its use just creates confusion. Unfortunately, many tutorials are still being written using this old and unnecessary interface.
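A minimal check of the model.eval() vs torch.no_grad() point might look like the following (a sketch assuming a pretrained resnet18 with its default requires_grad settings):

import torch
import torchvision.models as models

model = models.resnet18(pretrained=True)
model.eval()

x = torch.rand(1, 3, 224, 224)

out = model(x)
print(out.requires_grad)        # True: eval() alone still builds a computation graph

with torch.no_grad():
    out = model(x)
print(out.requires_grad)        # False: no_grad() disables graph construction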

Here's the updated code:

import torch
import torchvision
import torchvision.models as models
from PIL import Image

img = Image.open("Documents/01235.png")

# Load the pretrained model
model = models.resnet18(pretrained=True)

# Use the model object to select the desired layer
layer = model._modules.get('avgpool')

# Set model to evaluation mode
model.eval()

transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])


def get_vector(image):
    # Create a PyTorch tensor with the transformed image
    t_img = transforms(image)
    # Create a vector of zeros that will hold our feature vector
    # The 'avgpool' layer has an output size of 512
    my_embedding = torch.zeros(512)

    # Define a function that will copy the output of a layer
    def copy_data(m, i, o):
        my_embedding.copy_(o.flatten())                 # <-- flatten

    # Attach that function to our selected layer
    h = layer.register_forward_hook(copy_data)
    # Run the model on our transformed image
    with torch.no_grad():                               # <-- no_grad context
        model(t_img.unsqueeze(0))                       # <-- unsqueeze
    # Detach our copy function from the layer
    h.remove()
    # Return the feature vector
    return my_embedding


pic_vector = get_vector(img)
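Since the goal is to get a feel for how to interpret feature vectors, one common follow-up is to compare the embeddings of two images, for example with cosine similarity. The sketch below reuses get_vector and Image from the code above; the second image path is hypothetical:

import torch.nn.functional as F

other_img = Image.open("Documents/another_image.png")   # hypothetical second image
other_vector = get_vector(other_img)

# Cosine similarity between the two 512-dimensional embeddings
similarity = F.cosine_similarity(pic_vector.unsqueeze(0), other_vector.unsqueeze(0))
print(similarity.item())   # close to 1.0 for very similar images, lower otherwise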
