Tensorflow Object-Detection API - How does the Fine-Tuning of a model work?


Question


This is a more general question about the Tensorflow Object-Detection API.

I am using this API; more specifically, I fine-tune a model on my dataset. According to the description of the API, I use the model_main.py script to retrain a model from a given checkpoint/frozen graph.

However, it is not clear to me how fine-tuning works within the API. Does re-initialization of the last layer happen automatically, or do I have to implement something like that myself? In the README files I did not find any hints on this topic. Maybe somebody could help me.

Solution

Whether you train from scratch or from a checkpoint, model_main.py is the main program; besides this program, all you need is a correct pipeline config file.
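Assuming the usual layout of the TF1 Object Detection API repository, a typical launch looks something like this (the paths are placeholders you must fill in for your own setup):

```shell
# Sketch of a typical model_main.py invocation; adjust paths to your setup.
# --pipeline_config_path points at the pipeline config described below;
# --model_dir is where checkpoints and event files are written.
python object_detection/model_main.py \
    --pipeline_config_path=path/to/pipeline.config \
    --model_dir=path/to/train_dir \
    --alsologtostderr
```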

So fine-tuning can be separated into two steps: restoring weights and updating weights. Both steps can be configured individually according to the train proto file; this proto corresponds to train_config in the pipeline config file.

train_config: {
   batch_size: 24
   optimizer { }
   fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
   fine_tune_checkpoint_type:  "detection"
   # Note: The below line limits the training process to 200K steps, which we
   # empirically found to be sufficient enough to train the pets dataset. This
   # effectively bypasses the learning rate schedule (the learning rate will
   # never decay). Remove the below line to train indefinitely.
   num_steps: 200000
   data_augmentation_options {}
 }

Step 1, restoring weights.

In this step, you configure which variables to restore by setting fine_tune_checkpoint_type; the options are detection and classification. Setting it to detection restores almost all variables from the checkpoint, while setting it to classification restores only variables from the feature_extractor scope (all the layers of the backbone network, like VGG, Resnet, or MobileNet, are called feature extractors).
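As a rough illustration (plain Python, not the API's actual code), the two modes amount to filtering the graph's variables by scope name; the variable names below are illustrative, modeled on how SSD models prefix backbone variables with FeatureExtractor:

```python
# Sketch of how fine_tune_checkpoint_type selects variables to restore.
# Variable names are hypothetical, not taken from a real checkpoint.
def variables_to_restore(graph_vars, checkpoint_type):
    if checkpoint_type == "detection":
        # Restore (almost) everything, including the box predictor.
        return list(graph_vars)
    elif checkpoint_type == "classification":
        # Restore only the backbone (feature extractor) variables.
        return [v for v in graph_vars if "FeatureExtractor" in v]
    raise ValueError("unknown checkpoint type: %s" % checkpoint_type)

graph_vars = [
    "FeatureExtractor/MobilenetV1/Conv2d_0/weights",
    "FeatureExtractor/MobilenetV1/Conv2d_1/weights",
    "BoxPredictor_0/BoxEncodingPredictor/weights",
]
print(variables_to_restore(graph_vars, "classification"))
```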

Previously this was controlled by from_detection_checkpoint and load_all_detection_checkpoint_vars, but these two fields are deprecated.

Also note that after you configure fine_tune_checkpoint_type, the actual restore operation checks whether each variable in the graph exists in the checkpoint; if it does not, the variable is initialized with its routine initialization operation.
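This restore-with-fallback behavior can be sketched as follows (plain Python, not the API's actual code; the variable names and values are hypothetical):

```python
# Sketch: a variable found in the checkpoint is restored; a variable
# missing from it falls back to its routine initializer value.
def build_init_values(graph_vars, checkpoint):
    values = {}
    for name, default_init in graph_vars.items():
        if name in checkpoint:
            values[name] = checkpoint[name]   # restored from the checkpoint
        else:
            values[name] = default_init       # routine initialization
    return values

# Hypothetical example: the backbone exists in the checkpoint,
# a new box-predictor head for custom classes does not.
graph_vars = {"FeatureExtractor/conv0": 0.0, "BoxPredictor/new_head": 0.0}
checkpoint = {"FeatureExtractor/conv0": 1.23}
print(build_init_values(graph_vars, checkpoint))
```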

For example, suppose you want to fine-tune an ssd_mobilenet_v1_custom_data model and you downloaded the checkpoint ssd_mobilenet_v1_coco. When you set fine_tune_checkpoint_type: detection, all variables in the graph that are also available in the checkpoint file will be restored, including the box predictor (last layer) weights. If you set fine_tune_checkpoint_type: classification instead, only the weights of the MobileNet layers are restored. But if you use a checkpoint from a different model, say faster_rcnn_resnet_xxx, then because the variables in the graph are not available in that checkpoint, you will see Variable XXX is not available in checkpoint warnings in the output log, and those variables won't be restored.

Step 2, updating weights

Now you have all weights restored and want to keep training (fine-tuning) on your own dataset; normally this is enough.

But if you want to experiment and freeze some layers during training, you can customize the training by setting freeze_variables. Say you want to freeze all the weights of the MobileNet backbone and only update the weights of the box predictor; you can set freeze_variables: [feature_extractor] so that all variables that have feature_extractor in their names won't be updated. For detailed info, please see another answer that I wrote.
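The effect of freeze_variables can be sketched as pattern-matching against variable names (plain Python, not the API's actual code; the names are illustrative, and the pattern must match the names as they actually appear in your graph):

```python
import re

# Sketch of freeze_variables: variables whose names match any freeze
# pattern are excluded from the trainable set, so the optimizer never
# updates them.
def trainable_after_freeze(all_vars, freeze_patterns):
    return [v for v in all_vars
            if not any(re.search(p, v) for p in freeze_patterns)]

all_vars = [
    "FeatureExtractor/MobilenetV1/Conv2d_0/weights",
    "FeatureExtractor/MobilenetV1/Conv2d_1/weights",
    "BoxPredictor_0/BoxEncodingPredictor/weights",
]
# Freeze the backbone; only the box predictor remains trainable.
print(trainable_after_freeze(all_vars, ["FeatureExtractor"]))
```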

So to fine-tune a model on your custom dataset, you should prepare a custom config file. You can start with the sample config files and then modify some fields to suit your needs.
