TensorFlow: How and why to use SavedModel


Problem description


I have a few questions regarding the SavedModel API, whose documentation I find leaves a lot of details unexplained.

The first three questions are about what to pass to the arguments of the add_meta_graph_and_variables() method of tf.saved_model.builder.SavedModelBuilder, while the fourth question is about why to use the SavedModel API over tf.train.Saver.

  1. What is the format of the signature_def_map argument? Do I normally need to set this argument when saving a model?

  2. Similarly, what is the format of the assets_collection argument?

  3. Why do you save a list of tags with a metagraph as opposed to just giving it a name (i.e. attaching just one unique tag to it)? Why would I add multiple tags to a given metagraph? What if I try to load a metagraph from a pb by a certain tag, but multiple metagraphs in that pb match that tag?

  4. The documentation argues that it is recommended to use SavedModel to save entire models (as opposed to variables only) in self-contained files. But tf.train.Saver also saves the graph in addition to the variables in a .meta file. So what are the advantages of using SavedModel? The documentation says

When you want to save and load variables, the graph, and the graph's metadata--basically, when you want to save or restore your model--we recommend using SavedModel. SavedModel is a language-neutral, recoverable, hermetic serialization format. SavedModel enables higher-level systems and tools to produce, consume, and transform TensorFlow models.

but this explanation is quite abstract and doesn't really help me understand what the advantages of SavedModel are. What would be concrete examples where SavedModel (as opposed to tf.train.Saver) would be better to use?

Please note that my question is not a duplicate of this question. I'm not asking how to save a model, I am asking very specific questions about the properties of SavedModel, which is only one of multiple mechanisms TensorFlow provides to save and load models. None of the answers in the linked question touch on the SavedModel API (which, once again, is not the same as tf.train.Saver).

Solution

EDIT: I wrote this back at TensorFlow 1.4. As of today (TensorFlow 1.12 is stable, there's a 1.13rc and 2.0 is around the corner) the docs linked in the question are much improved.


I'm trying to use tf.saved_model and also found the Docs quite (too) abstract. Here's my stab at a full answer to your questions:

1. signature_def_map:

a. Format See Tom's answer to Tensorflow: how to save/restore a model. (Ctrl-F for "tf.saved_model" - currently, the only uses of the phrase on that question are in his answer).

b. Need It's my understanding that you do normally need it. If you intend to use the model, you need to know the inputs and outputs of the graph. I think it is akin to a C++ function signature: if you intend to define a function after it's called or in another C++ file, you need the signature in your main file (i.e. as a prototype or in a header file).

2. assets_collection:

format: Couldn't find clear documentation, so I went to the builder source code. It appears that the argument is an iterable of Tensors of dtype=tf.string, where each Tensor is a path for the asset directory. So, a TensorFlow Graph collection should work. I guess that is the parameter's namesake, but from the source code I would expect a Python list to work too.

(You didn't ask if you need to set it, but judging from Zoe's answer to What are assets in tensorflow? and iga's answer to the tangentially related Tensorflow serving: "No assets to save/writes" when exporting models, it doesn't usually need to be set.)

3. Tags:

a. Why list I don't know why you must pass a list, but you may pass a list with one element. For instance, in my current project I only use the [tf...tag_constants.SERVING] tag.

b. When to use multiple Say you're using explicit device placement for operations. Maybe you want to save a CPU version and a GPU version of your graph. Obviously you want to save a serving version of each, and say you want to save training checkpoints. You could use a CPU/GPU tag and a training/serving tag to manage all cases. The docs hint at it:

Each MetaGraphDef added to the SavedModel must be annotated with user-specified tags. The tags provide a means to identify the specific MetaGraphDef to load and restore, along with the shared set of variables and assets. These tags typically annotate a MetaGraphDef with its functionality (for example, serving or training), and optionally with hardware-specific aspects (for example, GPU).

c. Collision Too lazy to force a collision myself - I see two cases that would need to be addressed - so I went to the loader source code. Inside def load, you'll see:

saved_model = _parse_saved_model(export_dir)
found_match = False
for meta_graph_def in saved_model.meta_graphs:
  if set(meta_graph_def.meta_info_def.tags) == set(tags):
    meta_graph_def_to_load = meta_graph_def
    found_match = True
    break

if not found_match:
  raise RuntimeError(
      "MetaGraphDef associated with tags " + str(tags).strip("[]") +
      " could not be found in SavedModel. To inspect available tag-sets in"
      " the SavedModel, please use the SavedModel CLI: `saved_model_cli`"
  )

It appears to me that it's looking for an exact match. E.g. say you have a metagraph with tags "GPU" and "Serving" and a metagraph with tag "Serving". If you load "Serving", you'll get the latter metagraph. On the other hand, say you have a metagraph "GPU" and "Serving" and a metagraph "CPU" and "Serving". If you try to load "Serving", you'll get the error. If you try to save two metagraphs with the exact same tags in the same folder, I expect you'll overwrite the first one. It doesn't look like the build code handles such a collision in any special way.
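To make the exact-match rule concrete, here is a plain-Python mimic of that selection loop (the helper name is my own; no TensorFlow needed):

```python
def pick_meta_graph(saved_tag_sets, tags):
    """Mimic of loader.load's selection rule: the requested tags must equal a
    MetaGraphDef's tag-set exactly; a subset match is not enough."""
    for index, tag_set in enumerate(saved_tag_sets):
        if set(tag_set) == set(tags):
            return index
    raise RuntimeError(
        "MetaGraphDef associated with tags %r could not be found" % (tags,))

# The exact match wins even when another tag-set contains "Serving" as a subset:
print(pick_meta_graph([{"GPU", "Serving"}, {"Serving"}], ["Serving"]))  # → 1

# No exact match at all -> RuntimeError, just like loader.load:
try:
    pick_meta_graph([{"GPU", "Serving"}, {"CPU", "Serving"}], ["Serving"])
except RuntimeError as err:
    print("no match:", err)
```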

4. SavedModel or tf.train.Saver:

This confused me too. wicke's answer to Should TensorFlow users prefer SavedModel over Checkpoint or GraphDef? cleared it up for me. I'll throw in my two cents:

In the scope of local Python+TensorFlow, you can make tf.train.Saver do everything. But it will cost you. Let me outline the save-a-trained-model-and-deploy use case. You'll need your saver object. It's easiest to set it up to save the complete graph (every variable). You probably don't want to save the .meta all the time since you're working with a static graph. You'll need to specify that in your training hook. You can read about that on cv-tricks. When your training finishes, you'll need to convert your checkpoint file to a pb file. That usually means clearing the current graph, restoring the checkpoint, freezing your variables to constants with tf.python.framework.graph_util, and writing it with tf.gfile.GFile. You can read about that on medium. After that, you want to deploy it in Python. You'll need the input and output Tensor names - the string names in the graph def. You can read about that on metaflow (actually a very good blog post for the tf.train.Saver method). Some op nodes will let you feed data into them easily. Some not so much. I usually gave up on finding an appropriate node and added a tf.reshape that didn't actually reshape anything to the graph def. That was my ad-hoc input node. Same for the output. And then finally, you can deploy your model, at least locally in Python.

Or, you could use the answer I linked in point 1 to accomplish all this with the SavedModel API. Fewer headaches, thanks to Tom's answer. You'll get more support and features in the future if it ever gets documented appropriately. It looks like it's easier to use command line serving (the medium link covers doing that with Saver - looks tough, good luck!). It's practically baked into the new Estimators. And according to the Docs,

SavedModel is a language-neutral, recoverable, hermetic serialization format.

Emphasis mine: Looks like you can get your trained models into the growing C++ API much easier.

The way I see it, it's like the Datasets API. It's just easier than the old way!

As far as concrete examples of SavedModel over tf.train.Saver: if "basically, when you want to save or restore your model" isn't clear enough for you, the correct time to use it is any time it makes your life easier. To me, that looks like always. Especially if you're using Estimators, deploying in C++, or using command line serving.

So that's my research on your question. Or four enumerated questions. Err, eight question marks. Hope this helps.
