Unexpected error when loading the model: problem in predictor - ModuleNotFoundError: No module named 'torchvision'


Problem description

I've been trying to deploy my model to AI Platform for prediction through the console on my VM instance, but I've gotten the error: "(gcloud.beta.ai-platform.versions.create) Create Version failed. Bad model detected with error: "Failed to load model: Unexpected error when loading the model: problem in predictor - ModuleNotFoundError: No module named 'torchvision' (Error code: 0)"

I need to include both torch and torchvision. I followed the steps in this question, Cannot deploy trained model to Google Cloud Ai-Platform with custom prediction routine: Model requires more memory than allowed, but I couldn't fetch the files pointed to by user gogasca. I tried downloading this .whl file from the PyTorch website and uploading it to my cloud storage, but got the same error that there is no module torchvision, even though that version is supposed to include both torch and torchvision. I also tried using the Cloud AI compatible packages here, but they don't include torchvision.

I tried pointing to two separate .whl files for torch and torchvision in the --package-uris argument, both pointing to files in my cloud storage, but then I got an error that the memory capacity was exceeded. This is strange, because collectively their size is around 130 MB. An example of a command that resulted in the absence of torchvision looked like this:

gcloud beta ai-platform versions create version_1 \
  --model online_pred_1 \
  --runtime-version 1.15 \
  --python-version 3.7 \
  --origin gs://BUCKET/model-dir \
  --package-uris gs://BUCKET/staging-dir/my_package-0.1.tar.gz,gs://BUCKET/torchvision-dir/torch-1.4.0+cpu-cp37-cp37m-linux_x86_64.whl \
  --prediction-class predictor.MyPredictor
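
The variant that pointed to two separate wheels and then hit the memory limit looked roughly like this (the torchvision wheel name and the bucket paths are placeholders reconstructed from memory, not copied from my history):

gcloud beta ai-platform versions create version_1 \
  --model online_pred_1 \
  --runtime-version 1.15 \
  --python-version 3.7 \
  --origin gs://BUCKET/model-dir \
  --package-uris gs://BUCKET/staging-dir/my_package-0.1.tar.gz,gs://BUCKET/torch-dir/torch-1.4.0+cpu-cp37-cp37m-linux_x86_64.whl,gs://BUCKET/torchvision-dir/torchvision-0.5.0+cpu-cp37-cp37m-linux_x86_64.whl \
  --prediction-class predictor.MyPredictor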

I've tried pointing to different combinations of .whl files obtained from different sources, but got either the no-module error or the out-of-memory one. I don't understand how the modules interact in this case, or why the deployment thinks there is no such module. How can I resolve this? Alternatively, how can I build a package myself that includes both torch and torchvision? Please give a detailed answer, as I'm not very familiar with package management and bash scripting.

Here is the code I'm using, torch_model.py:

from torch import nn


class EthnicityClassifier44(nn.Module):
    """Small CNN classifier: four conv + max-pool blocks followed by three fully connected layers."""

    def __init__(self, num_classes=2):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=7, stride=1, padding=3)
        self.maxpool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv22 = nn.Conv2d(32, 32, kernel_size=3, stride=1, padding=1)
        self.maxpool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.maxpool3 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv4 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.maxpool4 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.relu = nn.ReLU(inplace=False)
        self.fc1 = nn.Linear(8*8*128, 128)  # assumes 128x128 inputs: four stride-2 pools leave 8x8 feature maps
        self.fc2 = nn.Linear(128, 128)
        self.fc4 = nn.Linear(128, num_classes)


    def forward(self, x):
        x = self.relu(self.conv1(x))
        x = self.maxpool1(x)
        x = self.relu(self.conv22(x))
        x = self.maxpool2(x)
        x = self.maxpool3(self.relu(self.conv3(x)))
        x = self.maxpool4(self.relu(self.conv4(x)))
        x = self.relu(self.fc1(x.view(x.shape[0], -1)))
        x = self.relu(self.fc2(x))
        x = self.fc4(x)
        return x

And here is predictor.py:

from facenet_pytorch import MTCNN, InceptionResnetV1, extract_face
import torch
import torchvision
from torchvision import transforms
from torch.nn import functional as F
from PIL import Image
from sklearn.externals import joblib
import numpy as np
import os
import torch_model


class MyPredictor(object):

    import torch
    import torchvision

    def __init__(self, model, preprocessor, device):
        """Stores artifacts for prediction. Only initialized via `from_path`.
        """
        self._resnet = model
        self._mtcnn_mult = preprocessor
        self._device = device
        self.get_std_tensor = transforms.Compose([
            np.float32,
            np.uint8,
            transforms.ToTensor(),
        ])
        self.tensor2pil = transforms.ToPILImage(mode='RGB')
        self.trans_resnet = transforms.Compose([
            transforms.Resize((100, 100)),
            np.float32,
            transforms.ToTensor()
        ])

    def predict(self, instances, **kwargs):

        pil_transform = transforms.Resize((512, 512))

        imarr = np.asarray(instances)
        pil_im = Image.fromarray(imarr)
        image = pil_im.convert('RGB')
        pil_im_512 = pil_transform(image)

        # detect face bounding boxes and crop the first detected face
        boxes, _ = self._mtcnn_mult(pil_im_512)
        box = boxes[0]

        face_tensor = extract_face(pil_im_512, box, margin=40)
        std_tensor = self.get_std_tensor(face_tensor.permute(1, 2, 0))
        cropped_pil_im = self.tensor2pil(std_tensor)

        face_tensor = self.trans_resnet(cropped_pil_im)
        face_tensor4d = face_tensor.unsqueeze(0)
        face_tensor4d = face_tensor4d.to(self._device)

        prediction = self._resnet(face_tensor4d)
        preds = F.softmax(prediction, dim=1).detach().numpy().reshape(-1)
        print('probability of (class1, class2) = ({:.4f}, {:.4f})'.format(preds[0], preds[1]))

        return preds.tolist()

    @classmethod
    def from_path(cls, model_dir):
        import torch
        import torchvision
        import torch_model

        model_path = os.path.join(model_dir, 'class44_M40RefinedExtra_bin_no_norm_7860.joblib')
        classifier = joblib.load(model_path)

        mtcnn_path = os.path.join(model_dir, 'mtcnn_mult.joblib')
        mtcnn_mult = joblib.load(mtcnn_path)

        device_path = os.path.join(model_dir, 'device_cpu.joblib')
        device = joblib.load(device_path)

        return cls(classifier, mtcnn_mult, device)
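
For context, this is roughly how the predictor gets exercised. The sketch below uses a hypothetical local copy of the model directory and a hypothetical test image; on AI Platform the service itself calls from_path once at startup and predict per request:

import numpy as np
from PIL import Image
from predictor import MyPredictor

predictor = MyPredictor.from_path('/tmp/model-dir')  # hypothetical local copy of gs://BUCKET/model-dir
instance = np.asarray(Image.open('test_face.jpg'))   # hypothetical test image
print(predictor.predict(instance))                   # prints the returned [class1, class2] probabilities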

setup.py:

from setuptools import setup

REQUIRED_PACKAGES = ['opencv-python-headless', 'facenet-pytorch']

setup(
    name="my_package",
    version="0.1",
    include_package_data=True,
    scripts=["predictor.py", "torch_model.py"],
    install_requires=REQUIRED_PACKAGES
)

Answer

The solution was to place the following packages in the setup.py file for the custom prediction code:

REQUIRED_PACKAGES = ['torchvision==0.5.0', 'torch @ https://download.pytorch.org/whl/cpu/torch-1.4.0%2Bcpu-cp37-cp37m-linux_x86_64.whl', 'opencv-python', 'facenet-pytorch']
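
For completeness, a sketch of how that list slots into the setup.py shown in the question (everything except REQUIRED_PACKAGES is unchanged from the original):

from setuptools import setup

REQUIRED_PACKAGES = [
    'torchvision==0.5.0',
    'torch @ https://download.pytorch.org/whl/cpu/torch-1.4.0%2Bcpu-cp37-cp37m-linux_x86_64.whl',
    'opencv-python',
    'facenet-pytorch'
]

setup(
    name="my_package",
    version="0.1",
    include_package_data=True,
    scripts=["predictor.py", "torch_model.py"],
    install_requires=REQUIRED_PACKAGES
)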

I then had a different problem with custom class instantiation, but this article explains it well. So I was able to successfully deploy my model to the AI Platform for prediction.
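
With torch and torchvision pulled in through install_requires, only the source package itself needs to go in --package-uris. The deploy command then looks roughly like this (my reconstruction using the same placeholder paths as in the question, not a copy of the exact command):

gcloud beta ai-platform versions create version_1 \
  --model online_pred_1 \
  --runtime-version 1.15 \
  --python-version 3.7 \
  --origin gs://BUCKET/model-dir \
  --package-uris gs://BUCKET/staging-dir/my_package-0.1.tar.gz \
  --prediction-class predictor.MyPredictor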

