Train Tensorflow Object Detection on own dataset


Problem Description

After spending a couple of days trying to achieve this task, I would like to share my experience of how I went about answering the question:

How do I use TF Object Detection to train using my own dataset?

Solution

This assumes the module is already installed. Please refer to their documentation if not.

Disclaimer

This answer is not meant to be the right or only way of training the object detection module. This is simply me sharing my experience and what has worked for me. I'm open to suggestions and to learning more about this, as I am still new to ML in general.

TL;DR

  1. Create your own PASCAL VOC format dataset
  2. Generate TFRecords from it
  3. Configure a pipeline
  4. Visualize

Each section of this answer consists of a corresponding Edit (see below). After reading each section, please read its Edit as well for clarifications. Corrections and tips were added for each section.

Tools used

LabelImg: A tool for creating PASCAL VOC format annotations.

1. Create your own PASCAL VOC dataset

PS: For simplicity, the folder naming convention of my answer follows that of Pascal VOC 2012

Peeking into the May 2012 dataset, you'll notice the folder has the following structure:

+VOCdevkit
    +VOC2012
        +Annotations
        +ImageSets
            +Action
            +Layout
            +Main
            +Segmentation
        +JPEGImages
        +SegmentationClass
        +SegmentationObject

For the time being, amendments were made to the following folders:

Annotations: This is where all the images' corresponding XML files will be placed. Use the suggested tool above to create the annotations. Do not worry about the <truncated> and <difficult> tags, as they will be ignored by the training and eval binaries.
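To make the annotation format concrete, here is a minimal sketch that builds a PASCAL VOC style annotation with Python's standard library. The file name, image size, and bounding-box values below are hypothetical, and only the fields the training binaries actually read are included:

```python
import xml.etree.ElementTree as ET

def make_voc_annotation(filename, width, height, objects):
    """Build a minimal PASCAL VOC annotation tree.

    `objects` is a list of (class_name, xmin, ymin, xmax, ymax) tuples.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    for name, xmin, ymin, xmax, ymax in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, val in zip(("xmin", "ymin", "xmax", "ymax"),
                            (xmin, ymin, xmax, ymax)):
            ET.SubElement(bndbox, tag).text = str(val)
    return ET.ElementTree(root)

# Hypothetical example: one aeroplane in a 500x375 image
tree = make_voc_annotation("2008_000033.jpg", 500, 375,
                           [("aeroplane", 53, 87, 471, 420)])
print(ET.tostring(tree.getroot(), encoding="unicode"))
```

In practice LabelImg writes these files for you; the sketch is just to show which elements matter.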

JPEGImages: Location of your actual images. Make sure they are of type JPEG because that's what is currently supported in order to create TFRecords using their provided script.

ImageSets->Main: This simply consists of text files. For each class, there exists a corresponding train.txt, trainval.txt and val.txt. Below is a sample of the contents of the aeroplane_train.txt in the VOC 2012 folder

2008_000008 -1
2008_000015 -1
2008_000019 -1
2008_000023 -1
2008_000028 -1
2008_000033  1

The structure is basically the image name followed by a boolean saying whether the corresponding object exists in that image or not. For example, image 2008_000008 does not contain an aeroplane and is hence marked with a -1, but image 2008_000033 does.

I wrote a small Python script to generate these text files. Simply iterate through the image names and assign a 1 or -1 next to them for object existence. I added some randomness among my text files by shuffling the image names.
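The original script is not included in the answer, but the idea can be sketched roughly like this (the image names, class, and output path below are placeholders):

```python
import random

def write_image_set(image_names, positives, out_path, seed=0):
    """Write a {classname}_train.txt style file: each line is an image
    name followed by 1 (object present) or -1 (object absent)."""
    names = list(image_names)
    random.Random(seed).shuffle(names)  # add some randomness to the ordering
    with open(out_path, "w") as f:
        for name in names:
            flag = 1 if name in positives else -1
            f.write("%s %2d\n" % (name, flag))

# Hypothetical usage: only 2008_000033 contains an aeroplane
images = ["2008_000008", "2008_000015", "2008_000033"]
write_image_set(images, positives={"2008_000033"},
                out_path="aeroplane_train.txt")
```

The same function can be reused with a different split of image names to produce the {classname}_val.txt files.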

The {classname}_val.txt files make up the validation dataset. Think of this as the test data used during training. You want to divide your dataset into training and validation. More info can be found here. The format of these files is similar to that of training.

At this point, your folder structure should be

+VOCdevkit
    +VOC2012
        +Annotations
            --(for each image, generated annotation)
        +ImageSets
            +Main
                --(for each class, generated *classname*_train.txt and *classname*_val.txt)
        +JPEGImages
            --(a bunch of JPEG images)


1.1 Generating label map

With the dataset prepared, we need to create the corresponding label maps. Navigate to models/object_detection/data and open pascal_label_map.pbtxt.

This file consists of a list of items (in protobuf text format, not JSON) that assigns an ID and name to each class. Make amendments to this file to reflect your desired objects.
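For example, an amended label map for two classes might look like the following (the class names here are placeholders):

```
item {
  id: 1
  name: 'cat'
}
item {
  id: 2
  name: 'dog'
}
```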


2. Generate TFRecords

If you look into their code, especially this line, you'll see they explicitly grab aeroplane_train.txt only. For curious minds, here's why. Change this file name to that of any of your class train text files.

Make sure VOCdevkit is inside models/object_detection then you can go ahead and generate the TFRecords.

Please go through their code first should you run into any problems. It is self-explanatory and well documented.


3. Pipeline Configuration

The instructions should be self-explanatory enough to cover this segment. Sample configs can be found in object_detection/samples/configs.

For those looking to train from scratch as I did, just make sure to remove the fine_tune_checkpoint and from_detection_checkpoint nodes. Here's what my config file looked like for reference.
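The author's config is not reproduced here, but as an illustration, a from-scratch train_config might reduce to something like the fragment below (the numeric values are placeholders; the point is simply that no fine_tune_checkpoint or from_detection_checkpoint node appears):

```
train_config: {
  batch_size: 1
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
    }
  }
  num_steps: 200000
}
```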

From here on you can continue with the tutorial and run the training process.


4. Visualize

Be sure to run the eval in parallel to the training in order to be able to visualize the learning process. To quote Jonathan Huang

the best way is to just run the eval.py binary. We typically run this binary in parallel to training, pointing it at the directory holding the checkpoint that is being trained. The eval.py binary will write logs to an eval_dir that you specify which you can then point to with Tensorboard.

You want to see that the mAP has "lifted off" in the first few hours, and then you want to see when it converges. It's hard to tell without looking at these plots how many steps you need.


EDIT I (28 July '17):

I never expected my response to get this much attention so I decided to come back and review it.

Tools

For my fellow Apple users, you could actually use RectLabel for annotations.

Pascal VOC

After digging around, I finally realized that trainval.txt is actually the union of training and validation datasets.

Please look at their official development kit to understand the format even better.

Label Map Generation

At the time of my writing, ID 0 represents none_of_the_above. It is recommended that your IDs start from 1.

Visualize

After running your evaluation and directing TensorBoard to your eval directory, it will show you the mAP of each category along with each category's performance. This is good, but I also like seeing my training data in parallel with eval.

To do this, run tensorboard on a different port and point it to your train directory

tensorboard --logdir=${PATH_TO_TRAIN} --port=${DESIRED_NUMBER}
