Mask R-CNN for object detection and segmentation [Train for a custom dataset]

Question

I'm doing research on "Mask R-CNN for Object Detection and Segmentation". I have read the original research paper that presents Mask R-CNN for object detection, and I also found a few implementations of Mask R-CNN, here and here (by the Facebook AI research team, called Detectron). But they all used the COCO dataset for testing.

But I'm quite confused about how to train the above implementations with a custom dataset that has a large set of images, where each image has a subset of mask images marking the objects in that image.

So I'd be pleased if anyone can post useful resources or code samples for this task.

Note: My dataset has the following structure:

It consists of a large number of images, and for each image there are separate image files highlighting the objects as white patches on a black image.

Here is an example image and its masks:

Image:

Masks:

Answer

I have trained the instance segmentation model from https://github.com/matterport/Mask_RCNN to run on my dataset.

My assumption is that you have all the basic setup done, the model is already running with the default dataset (provided in the repo), and now you want it to run on a custom dataset.

Here are the steps:

  1. You need to have all your annotations.
  2. All of those need to be converted to the VGG Polygon schema (yes, I mean polygons, even if you only need bounding boxes). I have added a sample VGG Polygon format at the end of this answer, followed by a sketch for converting binary masks into it.
  3. You need to divide your custom dataset into train, test and val.
  4. By default, the annotations are looked up under the filename via_region_data.json inside each individual dataset folder. For example, for training images it would look at train\via_region_data.json. You can change this if you want.
  5. Inside the samples folder you can find folders like balloon, nucleus, shapes etc. Copy one of them, preferably balloon. We will now modify this new folder for our custom dataset.
  6. Inside the copied folder you will have a .py file (for balloon it will be balloon.py); change the following variables and functions:
    • ROOT_DIR : the absolute path where you have cloned the project
    • DEFAULT_LOGS_DIR : this folder will grow in size, so change this path accordingly (if you are running your code in a VM with low disk storage). It will also store the .h5 files, in timestamped subfolders created inside the log folder.
    • .h5 files are roughly 200-300 MB per epoch. But guess what, this log directory is Tensorboard compatible: you can pass a timestamped subfolder as the --logdir argument when running tensorboard.
    • NAME : a name for your project
    • NUM_CLASSES : it should be one more than the number of your label classes, because the background is also counted as a label
    • DETECTION_MIN_CONFIDENCE : 0.9 by default (decrease it if your training images are not of very high quality or you don't have much training data)
    • STEPS_PER_EPOCH etc. (a sketch of these config changes follows this list)
    • load_(name of your sample), e.g. load_balloon (a sketch follows the load_mask example at the end of this answer)
    • load_mask (see the example at the end of this answer)
    • image_reference
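
For concreteness, here is a minimal sketch of what those config changes might look like, assuming a copied balloon.py and a hypothetical one-class project called "cats" (the class name and the exact values are illustrative assumptions, not part of the repo):

from mrcnn.config import Config

class CatsConfig(Config):
    # All names and values below are placeholders for a hypothetical dataset.
    NAME = "cats"                    # also used as the "source" string later
    NUM_CLASSES = 1 + 1              # background + 1 label class
    DETECTION_MIN_CONFIDENCE = 0.7   # lowered from the 0.9 default
    STEPS_PER_EPOCH = 100            # training steps per epoch
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1               # reduce if you run out of GPU memory

Once training starts, you can point Tensorboard at one of the timestamped run folders, e.g. tensorboard --logdir logs\cats20200101T0000 (the folder name here is made up).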

You can now run it directly from the terminal:

python samples\your_folder_name\your_python_file_name.py train --dataset="location_of_custom_dataset" --weights=coco

For complete information about the command line arguments for the above line, see the comment at the top of that .py file.

These are the things I could recall; I will add more steps as I remember them. Let me know if you are stuck at any particular step and I will elaborate on it.

VGG Polygon schema

Width and height are optional.

[{
    "filename": "000dfce9-f14c-4a25-89b6-226316f557f3.jpeg",
    "regions": {
        "0": {
            "region_attributes": {
                "object_name": "Cat"
            },
            "shape_attributes": {
                "all_points_x": [75.30864197530865, 80.0925925925926, 80.0925925925926, 75.30864197530865],
                "all_points_y": [11.672189112257607, 11.672189112257607, 17.72093488703078, 17.72093488703078],
                "name": "polygon"
            }
        },
        "1": {
            "region_attributes": {
                "object_name": "Cat"
            },
            "shape_attributes": {
                "all_points_x": [80.40123456790124, 84.64506172839506, 84.64506172839506, 80.40123456790124],
                "all_points_y": [8.114103362391036, 8.114103362391036, 12.205901974737595, 12.205901974737595],
                "name": "polygon"
            }
        }
    },
    "width": 504,
    "height": 495
}]
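
Since the question's dataset already has binary mask images (a white patch on a black image) rather than polygon annotations, here is a minimal sketch of converting such masks into the schema above. The file names, the single "Cat" label and the helper name masks_to_vgg are all hypothetical:

import json
import skimage.io
import skimage.measure

def masks_to_vgg(image_filename, mask_paths, label="Cat", width=None, height=None):
    # Build one VGG-polygon record from binary mask files, one object per file.
    regions = {}
    for i, mask_path in enumerate(mask_paths):
        mask = skimage.io.imread(mask_path, as_gray=True)  # grayscale floats in [0, 1]
        # Trace the black/white boundary at the 0.5 iso-level and keep the
        # longest contour in case of stray noise pixels.
        contour = max(skimage.measure.find_contours(mask, 0.5), key=len)
        regions[str(i)] = {
            "region_attributes": {"object_name": label},
            "shape_attributes": {
                "all_points_x": [float(col) for row, col in contour],
                "all_points_y": [float(row) for row, col in contour],
                "name": "polygon",
            },
        }
    record = {"filename": image_filename, "regions": regions}
    if width and height:  # width and height are optional, as noted above
        record["width"], record["height"] = width, height
    return record

# Example usage with made-up file names: one image, two mask files.
records = [masks_to_vgg("cat001.jpeg", ["cat001_mask0.png", "cat001_mask1.png"])]
with open("train/via_region_data.json", "w") as f:
    json.dump(records, f)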

Sample load_mask function

# Note: this method belongs inside your utils.Dataset subclass; it assumes
# numpy (as np) and skimage.draw are imported at module level.
def load_mask(self, image_id):
    """Generate instance masks for an image.
    Returns:
    masks: A bool array of shape [height, width, instance count] with
        one mask per instance.
    class_ids: a 1D array of class IDs of the instance masks.
    """
    # If not your dataset image, delegate to parent class.
    image_info = self.image_info[image_id]
    if image_info["source"] != "name_of_your_project":   # change to your project name
        return super(self.__class__, self).load_mask(image_id)

    # Convert polygons to a bitmap mask of shape
    # [height, width, instance_count]
    info = self.image_info[image_id]
    mask = np.zeros([info["height"], info["width"], len(info["polygons"])], dtype=np.uint8)
    class_id = np.zeros([mask.shape[-1]], dtype=np.int32)

    for i, p in enumerate(info["polygons"]):
        # Get indexes of pixels inside the polygon and set them to 1
        rr, cc = skimage.draw.polygon(p['all_points_y'], p['all_points_x'])
        # class_dict maps class names (e.g. "Cat") to integer class IDs
        class_id[i] = self.class_dict[info['classes'][i]]
        mask[rr, cc, i] = 1

    # Return the mask and the array of class IDs of each instance.
    # (np.bool was removed in recent NumPy; plain bool works the same here.)
    return mask.astype(bool), class_id
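
Here is a matching minimal sketch of the load_(name of your sample) function mentioned above, assuming the same hypothetical "cats" project and the VGG polygon JSON shown earlier; note that it defines the class_dict used by load_mask:

import json
import os

def load_cats(self, dataset_dir, subset):
    """Load a subset ("train" or "val") described by via_region_data.json."""
    # Hypothetical project/class names; register the class and the
    # name-to-id mapping consumed by load_mask above.
    self.add_class("cats", 1, "Cat")
    self.class_dict = {"Cat": 1}

    assert subset in ["train", "val"]
    dataset_dir = os.path.join(dataset_dir, subset)
    with open(os.path.join(dataset_dir, "via_region_data.json")) as f:
        annotations = json.load(f)

    for a in annotations:
        polygons = [r["shape_attributes"] for r in a["regions"].values()]
        classes = [r["region_attributes"]["object_name"] for r in a["regions"].values()]
        self.add_image(
            "cats",                       # must match the "source" checked in load_mask
            image_id=a["filename"],
            path=os.path.join(dataset_dir, a["filename"]),
            width=a["width"],             # optional in the schema; if absent, read
            height=a["height"],           # the image to get its shape instead
            polygons=polygons,
            classes=classes,              # becomes info["classes"] in load_mask
        )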
