Problem saving and restoring TensorFlow models in an Azure ML service VM


Problem description

I tried to save TensorFlow model files with the following script:

saver = tf.train.Saver()

saver.save(sess, model_file_name)

After running this script, it stores all the main files (checkpoint, model.data..., model.meta and model.index), along with some temp files for each of them.

But after a few seconds, Azure deletes my model.meta file, which I need in order to restore my graph. (The restoring script then cannot find the file.)
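
Note that the path passed to saver.save is a prefix, not a file name, so no file named exactly model_file_name is written. A minimal sketch, using only the standard library, for checking which files currently exist for a given prefix before attempting a restore (the helper name checkpoint_files is hypothetical and not part of TensorFlow or Azure ML):

```python
import glob
import os


def checkpoint_files(prefix):
    """Return the sorted names of the files a TF1 Saver wrote for `prefix`.

    Saver.save(sess, prefix) treats `prefix` as a path prefix: it writes
    prefix.meta, prefix.index and one or more prefix.data-* shards, plus a
    `checkpoint` bookkeeping file in the same directory.
    """
    directory = os.path.dirname(prefix) or "."
    # Everything that starts with "<prefix>." (this also shows leftover temp files).
    found = glob.glob(glob.escape(prefix) + ".*")
    bookkeeping = os.path.join(directory, "checkpoint")
    if os.path.isfile(bookkeeping):
        found.append(bookkeeping)
    return sorted(os.path.basename(path) for path in found)
```

Calling checkpoint_files(model_file_name) right after saving, and again a few seconds later, would show whether the .meta file really disappears or was written somewhere else.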

I even tried to upload and register the model in the Azure run immediately after saving the files, using this script:

run.upload_file(name=model_file_name, path_or_stream=model_file_name)

model = run.register_model(model_name='nn_model', model_path=model_file_name)

But when I try to access the model with the following script, it cannot find the file (and the file is gone from the VM's local storage):

from azureml.core.model import Model

model_path = Model.get_model_path(model_name='nn_model')

P.S.: I have used this method for saving and restoring TF models before, without Azure services, and it worked perfectly. But I'm not sure how to do the same in Azure, or what the problem is. Can anyone help me with this?

Solution

Hello Zahra,

It is interesting that the files are deleted after they are saved for your session. Are you saving the session details on the temp storage of the VM?

Here is a sample of getting the model from the registry and restoring it in a batch pipeline. Also, if the model is registered for the run, you should be able to see it in the portal under that run ID in the workspace experiment.
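
As an illustration of registering a checkpoint from a run, here is a minimal sketch rather than the linked sample. It assumes run is an azureml.core.Run (for example from Run.get_context()) and relies on its upload_folder and register_model methods; the function name upload_and_register and the artifact folder name "model_files" are hypothetical. Uploading the whole folder keeps the .meta, .index and .data files together, since the checkpoint prefix itself is not a file.

```python
import os


def upload_and_register(run, checkpoint_dir, model_name, artifact_dir="model_files"):
    """Upload every checkpoint file to the run, then register that folder.

    `run` is assumed to be an azureml.core.Run; `artifact_dir` is a
    hypothetical name for the folder inside the run's artifact store.
    """
    if not os.path.isdir(checkpoint_dir):
        raise FileNotFoundError(checkpoint_dir)
    # upload_folder copies the whole directory, so all checkpoint files
    # travel together instead of a single (non-existent) prefix file.
    run.upload_folder(name=artifact_dir, path=checkpoint_dir)
    # model_path refers to the location inside the run's artifacts,
    # not to a path on the local disk.
    return run.register_model(model_name=model_name, model_path=artifact_dir)
```

In the training script this might be called as upload_and_register(run, os.path.dirname(model_file_name), 'nn_model'), assuming the checkpoint was saved into its own directory.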

Also, get_model_path returns the path to the model based on the parameters that are passed to it. Here is a simple two-part notebook to train and deploy a model, which can help you save your model to the workspace and load it later for deployment. I hope this helps!
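
For the restore side, a minimal sketch of turning the path returned by get_model_path into a checkpoint prefix and loading it. It assumes TensorFlow 1.x (matching the tf.train.Saver call in the question) and that the registered model is a folder containing the checkpoint files; the function names find_checkpoint_prefix and restore_session are hypothetical.

```python
import glob
import os


def find_checkpoint_prefix(model_dir):
    """Return the checkpoint prefix (path without '.meta') found under model_dir."""
    pattern = os.path.join(glob.escape(model_dir), "**", "*.meta")
    metas = sorted(glob.glob(pattern, recursive=True))
    if not metas:
        raise FileNotFoundError("no .meta file under " + model_dir)
    return metas[0][: -len(".meta")]


def restore_session(model_dir):
    """Rebuild the graph from the .meta file and load the saved variables.

    Assumes TensorFlow 1.x is installed.
    """
    import tensorflow as tf  # imported lazily so the helper above has no TF dependency

    prefix = find_checkpoint_prefix(model_dir)
    sess = tf.Session()
    saver = tf.train.import_meta_graph(prefix + ".meta")
    saver.restore(sess, prefix)
    return sess
```

A call such as restore_session(Model.get_model_path(model_name='nn_model')) would then return a session with the restored variables, provided the registered model actually contains the .meta, .index and .data files.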

-Rohit

