Detectron2 - Extract region features at a threshold for object detection

Question

I am trying to use the detectron2 framework to extract region features for detections whose class score is above some threshold. I will use these features later in my pipeline (similar to ViLBERT, section 3.1 "Training ViLBERT"). So far I have trained a Mask R-CNN with this config and fine-tuned it on some custom data. It performs well. What I would like to do is extract the features from my trained model for the predicted bounding boxes.

EDIT: I looked at what the users who closed my post wrote and tried to refine it, although the reader still needs some context for what I am doing. If you have any idea how I can improve the question, or some insight into how to do what I am trying to do, your feedback is welcome!

I have one question:

  1. Why am I getting only one prediction instance when more than one of the prediction CLS scores passes the threshold?

I believe this is the correct way of producing the ROI features:

import os

import torch
from detectron2 import model_zoo
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import get_cfg
from detectron2.modeling import build_model
from detectron2.structures import ImageList

# lst is the list of preprocessed input tensors, prepared beforehand
images = ImageList.from_tensors(lst[:1], size_divisibility=32).to("cuda")
# set up config
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.SOLVER.IMS_PER_BATCH = 1
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # only has one class (pneumonia)
# just run these lines if you have the trained model in memory
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # set the testing threshold for this model
# build model
model = build_model(cfg)
DetectionCheckpointer(model).load("output/model_final.pth")
model.eval()  # make sure it is in eval mode

# run model
with torch.no_grad():
    features = model.backbone(images.tensor.float())
    proposals, _ = model.proposal_generator(images, features)
    instances = model.roi_heads._forward_box(features, proposals)

Then:

pred_boxes = [x.pred_boxes for x in instances]
rois = model.roi_heads.box_pooler([features[f] for f in model.roi_heads.in_features], pred_boxes)

These should be my ROI features.

What confuses me is that, instead of using the bounding boxes produced at inference, I could use the proposals and their proposal_boxes together with the class scores to get the top n features for this image. So I have tried the following:

proposal_boxes = [x.proposal_boxes for x in proposals]
proposal_rois = model.roi_heads.box_pooler([features[f] for f in model.roi_heads.in_features], proposal_boxes)
#found here: https://detectron2.readthedocs.io/_modules/detectron2/modeling/roi_heads/roi_heads.html
box_features = model.roi_heads.box_head(proposal_rois)
predictions = model.roi_heads.box_predictor(box_features)
pred_instances, losses = model.roi_heads.box_predictor.inference(predictions, proposals)

This should give me the proposal box features and their class scores in my predictions object. Inspecting this predictions object, I see the scores for each box:

CLS scores in the predictions object

(tensor([[ 0.6308, -0.4926],
         [-1.6662,  1.5430],
         [-0.2080,  0.4856],
         ...,
         [-6.9698,  6.6695],
         [-5.6361,  5.4046],
         [-4.4918,  4.3899]], device='cuda:0', grad_fn=<AddmmBackward>),

After applying softmax to these cls scores, placing them in a dataframe, and setting a threshold of 0.6, I get:

pred_df = pd.DataFrame(predictions[0].softmax(-1).tolist())
pred_df[pred_df[0] > 0.6]
    0           1
0   0.754618    0.245382
6   0.686816    0.313184
38  0.722627    0.277373

In my prediction instances I get the same top score, but only 1 instance rather than 2 (I set cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7):

Prediction instances:

[Instances(num_instances=1, image_height=800, image_width=800, fields=[pred_boxes: Boxes(tensor([[548.5992, 341.7193, 756.9728, 438.0507]], device='cuda:0',
        grad_fn=<IndexBackward>)), scores: tensor([0.7546], device='cuda:0', grad_fn=<IndexBackward>), pred_classes: tensor([0], device='cuda:0')])]

The predictions also contain a tensor of Nx4 or Nx(Kx4) bounding box regression deltas. I don't know exactly what they do or what they look like:

Bounding box regression deltas in the predictions object

tensor([[ 0.2502,  0.2461, -0.4559, -0.3304],
        [-0.1359, -0.1563, -0.2821,  0.0557],
        [ 0.7802,  0.5719, -1.0790, -1.3001],
        ...,
        [-0.8594,  0.0632,  0.2024, -0.6000],
        [-0.2020, -3.3195,  0.6745,  0.5456],
        [-0.5542,  1.1727,  1.9679, -2.3912]], device='cuda:0',
       grad_fn=<AddmmBackward>)

Something else strange is that my proposal boxes and my prediction boxes are different but similar:

Proposal bounding boxes

[Boxes(tensor([[532.9427, 335.8969, 761.2068, 438.8086],#this box vs the instance box
         [102.7041, 352.5067, 329.4510, 440.7240],
         [499.2719, 317.9529, 764.1958, 448.1386],
         ...,
         [ 25.2890, 379.3329,  28.6030, 429.9694],
         [127.1215, 392.6055, 328.6081, 489.0793],
         [164.5633, 275.6021, 295.0134, 462.7395]], device='cuda:0'))]

Recommended answer

You are almost there. Looking at roi_heads.box_predictor.inference(), you will see that it doesn't simply sort the scores of the box candidates. First, it applies the box deltas to readjust the proposal boxes. Then it runs Non-Maximum Suppression to remove overlapping boxes (while also applying other settings such as the score threshold). Finally, it keeps the top-k boxes ranked by their scores. That probably explains why your method produces the same box scores but a different number of output boxes and different coordinates.
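As a rough illustration of the first step, the sketch below decodes one box from its regression deltas in plain Python. It uses the first proposal box and the first delta row printed in the question. The helper name `apply_deltas` is made up for this example, and the weights (10, 10, 5, 5) are an assumption based on detectron2's default `MODEL.ROI_BOX_HEAD.BBOX_REG_WEIGHTS`. The real implementation also clamps the size deltas and clips boxes to the image, which is omitted here.

```python
import math


def apply_deltas(box, deltas, weights=(10.0, 10.0, 5.0, 5.0)):
    """Decode (dx, dy, dw, dh) deltas relative to an (x1, y1, x2, y2) box.

    The weights are assumed to match detectron2's default box regression
    weights; adjust them if your config overrides that setting.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h

    # Undo the scaling that was applied to the regression targets.
    dx, dy, dw, dh = (d / wt for d, wt in zip(deltas, weights))

    # Shift the center by a fraction of the box size, rescale width and height.
    new_cx, new_cy = cx + dx * w, cy + dy * h
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)

    return (
        new_cx - 0.5 * new_w,
        new_cy - 0.5 * new_h,
        new_cx + 0.5 * new_w,
        new_cy + 0.5 * new_h,
    )


# First proposal box and first delta row, copied from the question.
proposal = (532.9427, 335.8969, 761.2068, 438.8086)
deltas = (0.2502, 0.2461, -0.4559, -0.3304)

refined = apply_deltas(proposal, deltas)
print([round(v, 2) for v in refined])
```

Under these assumptions the decoded box lands within a fraction of a pixel of the single predicted box in the question, (548.5992, 341.7193, 756.9728, 438.0507). That would explain why the proposal box and the prediction box are similar but not identical.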

Back to your original question, here is the way to extract the features of the proposed boxes in one inference pass:

import cv2
import torch

# model is the trained detectron2 model, already built and set to eval mode
image = cv2.imread('my_image.jpg')
height, width = image.shape[:2]
image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
inputs = [{"image": image, "height": height, "width": width}]
with torch.no_grad():
    images = model.preprocess_image(inputs)  # don't forget to preprocess
    features = model.backbone(images.tensor)  # set of cnn features
    proposals, _ = model.proposal_generator(images, features, None)  # RPN

    features_ = [features[f] for f in model.roi_heads.box_in_features]
    box_features = model.roi_heads.box_pooler(features_, [x.proposal_boxes for x in proposals])
    box_features = model.roi_heads.box_head(box_features)  # features of all 1k candidates
    predictions = model.roi_heads.box_predictor(box_features)
    pred_instances, pred_inds = model.roi_heads.box_predictor.inference(predictions, proposals)
    pred_instances = model.roi_heads.forward_with_given_boxes(features, pred_instances)

    # output boxes, masks, scores, etc
    pred_instances = model._postprocess(pred_instances, inputs, images.image_sizes)  # scale box to orig size
    # features of the proposed boxes; pred_inds holds one index tensor per image
    feats = box_features[pred_inds[0]]
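To see why several candidates above a score threshold can collapse into a single instance, here is a minimal plain-Python sketch of the filtering described above: score threshold, then greedy NMS, then top-k. The function names `iou` and `filter_detections` and the toy boxes and scores are made up for illustration; they are not the question's data and not detectron2's code. The defaults of 0.5 for the NMS IoU threshold and 100 detections per image are assumptions based on detectron2's default config.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, score_thresh=0.7, nms_thresh=0.5, topk=100):
    """Return indices of boxes kept after thresholding, greedy NMS and top-k."""
    # 1. Drop candidates whose score does not exceed the threshold.
    candidates = [i for i, s in enumerate(scores) if s > score_thresh]
    # 2. Visit the remaining candidates from highest to lowest score.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    for i in candidates:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    # 3. Cap the number of detections.
    return keep[:topk]


# Hypothetical candidates: three overlapping boxes and one low-scoring box.
boxes = [
    (100, 100, 300, 200),
    (110, 105, 310, 210),
    (120, 90, 290, 195),
    (400, 400, 500, 500),
]
scores = [0.75, 0.72, 0.69, 0.30]

kept = filter_detections(boxes, scores)
print(kept)
```

In this toy case, three scores exceed 0.6 and two exceed 0.7, yet only one index survives because the second box overlaps the first too heavily. The returned indices play the same role as pred_inds in the answer: they select the rows of box_features that belong to the surviving boxes.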
