Unexpected error when loading the model: problem in predictor - ModuleNotFoundError: No module named 'torchvision'

Problem description

I've been trying to deploy my model to AI Platform Prediction through the console on my VM instance, but I keep getting the error: "(gcloud.beta.ai-platform.versions.create) Create Version failed. Bad model detected with error: "Failed to load model: Unexpected error when loading the model: problem in predictor - ModuleNotFoundError: No module named 'torchvision' (Error code: 0)""

I need to include both torch and torchvision. I followed the steps in this question: Cannot deploy trained model to Google Cloud AI Platform with custom prediction routine: Model requires more memory than allowed, but I couldn't fetch the files pointed to by user gogasca. I tried downloading this .whl file from the PyTorch website and uploading it to my Cloud Storage bucket, but I got the same error that there is no module torchvision, even though that wheel is supposed to include both torch and torchvision. I also tried using the Cloud AI-compatible packages listed here, but they don't include torchvision.

I tried pointing to two separate .whl files for torch and torchvision in the --package-uris argument, both referring to files in my Cloud Storage bucket, but then I got an error saying the memory capacity was exceeded. This is strange, because together they are only about 130 MB. An example of a command that resulted in the missing torchvision error looked like this:

gcloud beta ai-platform versions create version_1 \
  --model online_pred_1 \
  --runtime-version 1.15 \
  --python-version 3.7 \
  --origin gs://BUCKET/model-dir \
  --package-uris gs://BUCKET/staging-dir/my_package-0.1.tar.gz,gs://BUCKET/torchvision-dir/torch-1.4.0+cpu-cp37-cp37m-linux_x86_64.whl \
  --prediction-class predictor.MyPredictor

I've tried pointing to different combinations of .whl files obtained from different sources, but I get either the no-module error or the not-enough-memory error. I don't understand how the modules interact in this case and why the runtime thinks there is no such module. How can I resolve this? Alternatively, how can I build a package myself that includes both torch and torchvision? Please give a detailed answer, because I'm not very familiar with package management and bash scripting.

Here is the code I use, torch_model.py:

from torch import nn


class EthnicityClassifier44(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=7, stride=1, padding=3)
        self.maxpool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv22 = nn.Conv2d(32, 32, kernel_size=3, stride=1, padding=1)
        self.maxpool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.maxpool3 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv4 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.maxpool4 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.relu = nn.ReLU(inplace=False)
        self.fc1 = nn.Linear(8*8*128, 128)
        self.fc2 = nn.Linear(128, 128)
        self.fc4 = nn.Linear(128, num_classes)


    def forward(self, x):
        x = self.relu(self.conv1(x))
        x = self.maxpool1(x)
        x = self.relu(self.conv22(x))
        x = self.maxpool2(x)
        x = self.maxpool3(self.relu(self.conv3(x)))
        x = self.maxpool4(self.relu(self.conv4(x)))
        x = self.relu(self.fc1(x.view(x.shape[0], -1)))
        x = self.relu(self.fc2(x))
        x = self.fc4(x)
        return x
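
For reference, here is a minimal shape sanity check of this architecture. The 8*8*128 input size of fc1 implies 3×128×128 inputs (four stride-2 max-pools: 128 → 64 → 32 → 16 → 8), so the dummy tensor size below is an assumption derived from that rather than something stated in the question.

import torch
from torch_model import EthnicityClassifier44

# Hypothetical smoke test: instantiate the classifier and push one dummy
# batch through it to confirm the layer shapes line up.
model = EthnicityClassifier44(num_classes=2)
dummy = torch.randn(1, 3, 128, 128)  # assumed input size, see note above
out = model(dummy)
print(out.shape)  # expected: torch.Size([1, 2])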

And here is predictor.py:

from facenet_pytorch import MTCNN, InceptionResnetV1, extract_face
import torch
import torchvision
from torchvision import transforms
from torch.nn import functional as F
from PIL import Image
from sklearn.externals import joblib
import numpy as np
import os
import torch_model


class MyPredictor(object):

    import torch
    import torchvision

    def __init__(self, model, preprocessor, device):
        """Stores artifacts for prediction. Only initialized via `from_path`.
        """
        self._resnet = model
        self._mtcnn_mult = preprocessor
        self._device = device
        self.get_std_tensor = transforms.Compose([
            np.float32,
            np.uint8,
            transforms.ToTensor(),
        ])
        self.tensor2pil = transforms.ToPILImage(mode='RGB')
        self.trans_resnet = transforms.Compose([
            transforms.Resize((100, 100)),
            np.float32,
            transforms.ToTensor()
        ])

    def predict(self, instances, **kwargs):

        pil_transform = transforms.Resize((512, 512))

        imarr = np.asarray(instances)
        pil_im = Image.fromarray(imarr)
        image = pil_im.convert('RGB')
        pil_im_512 = pil_transform(image)

        boxes, _ = self._mtcnn_mult(pil_im_512)
        box = boxes[0]

        face_tensor = extract_face(pil_im_512, box, margin=40)
        std_tensor = self.get_std_tensor(face_tensor.permute(1, 2, 0))
        cropped_pil_im = self.tensor2pil(std_tensor)

        face_tensor = self.trans_resnet(cropped_pil_im)
        face_tensor4d = face_tensor.unsqueeze(0)
        face_tensor4d = face_tensor4d.to(self._device)

        prediction = self._resnet(face_tensor4d)
        preds = F.softmax(prediction, dim=1).detach().numpy().reshape(-1)
        print('probability of (class1, class2) = ({:.4f}, {:.4f})'.format(preds[0], preds[1]))

        return preds.tolist()

    @classmethod
    def from_path(cls, model_dir):
        import torch
        import torchvision
        import torch_model

        model_path = os.path.join(model_dir, 'class44_M40RefinedExtra_bin_no_norm_7860.joblib')
        classifier = joblib.load(model_path)

        mtcnn_path = os.path.join(model_dir, 'mtcnn_mult.joblib')
        mtcnn_mult = joblib.load(mtcnn_path)

        device_path = os.path.join(model_dir, 'device_cpu.joblib')
        device = joblib.load(device_path)

        return cls(classifier, mtcnn_mult, device)
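
As an aside, a local smoke test along the following lines can surface import problems such as the missing torchvision before deploying. The directory and image names are hypothetical placeholders, and it assumes the joblib artifacts from gs://BUCKET/model-dir have been copied to a local folder.

import numpy as np
from PIL import Image
from predictor import MyPredictor

# Hypothetical local check: load the artifacts and run one prediction in the
# same environment that the deployment package declares.
predictor = MyPredictor.from_path('local-model-dir')   # assumed local copy of the model dir
instance = np.asarray(Image.open('sample_face.jpg'))   # hypothetical test image
print(predictor.predict(instance))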

setup.py:

from setuptools import setup

REQUIRED_PACKAGES = ['opencv-python-headless', 'facenet-pytorch']

setup(
 name="my_package",
 version="0.1",
 include_package_data=True,
 scripts=["predictor.py", "torch_model.py"],
 install_requires=REQUIRED_PACKAGES
)

Recommended answer

The solution was to place the following packages in the setup.py file for the custom prediction code:

REQUIRED_PACKAGES = ['torchvision==0.5.0', 'torch @ https://download.pytorch.org/whl/cpu/torch-1.4.0%2Bcpu-cp37-cp37m-linux_x86_64.whl', 'opencv-python', 'facenet-pytorch']
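For completeness, a sketch of how the full setup.py looks with that change; it is simply the file from the question with REQUIRED_PACKAGES replaced, everything else unchanged:

from setuptools import setup

# torch is pinned to the CPU-only 1.4.0 wheel via a direct URL, and
# torchvision 0.5.0 is the matching release for torch 1.4.0.
REQUIRED_PACKAGES = [
    'torchvision==0.5.0',
    'torch @ https://download.pytorch.org/whl/cpu/torch-1.4.0%2Bcpu-cp37-cp37m-linux_x86_64.whl',
    'opencv-python',
    'facenet-pytorch'
]

setup(
    name="my_package",
    version="0.1",
    include_package_data=True,
    scripts=["predictor.py", "torch_model.py"],
    install_requires=REQUIRED_PACKAGES
)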

I then had a different problem with custom class instantiation, but this article explains it well. With that fixed, I was able to successfully deploy my model to AI Platform for prediction.
