TensorFlow Object Detection API Weird Behavior

Problem description

I was playing with TensorFlow's brand new Object Detection API and decided to train it on some other publicly available datasets.

I happened to stumble upon this grocery dataset which consists of images of various brands of cigarette boxes on the supermarket shelf along with a text file which lists out the bounding boxes of each cigarette box in each image. 10 major brands have been labeled in the dataset and all other brands fall into the 11th "miscellaneous" category.

I followed their tutorial and managed to train the model on this dataset. Due to limitations on processing power, I used only a third of the dataset and performed a 70:30 split for training and testing data. I used the faster_rcnn_resnet101 model. All parameters in my config file are the same as the default parameters provided by TF.
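For readers who want to reproduce a comparable setup, here is a minimal sketch of keeping a third of the images and splitting them 70:30. The file names, the seed, and the helper `subsample_and_split` are hypothetical; the post does not say how the subset was actually chosen.

```python
import random


def subsample_and_split(items, keep_one_in=3, train_percent=70, seed=0):
    """Shuffle, keep 1/keep_one_in of the items, then split into train/test."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)
    kept = shuffled[: len(shuffled) // keep_one_in]
    n_train = len(kept) * train_percent // 100
    return kept[:n_train], kept[n_train:]


# Hypothetical image names standing in for the real dataset files.
image_names = [f"image_{i:04d}.jpg" for i in range(300)]
train, test = subsample_and_split(image_names)
print(len(train), len(test))
```

Splitting by image rather than by bounding box keeps all boxes of one shelf photo on the same side of the split.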

After 16491 global steps, I tested the model on some images but I am not too happy with the results -

It fails to detect the Camels on the top shelf, whereas it detects the same product in other images

Why does it fail to detect the Marlboros in the top row?

Another issue I had is that the model never detected any other label except for label 1

It doesn't detect a cropped instance of the product taken from the training data

It detects cigarette boxes with 99% confidence even in negative images!

Can somebody help me with what is going wrong? What can I do to improve the accuracy? And why does it detect all products to belong in category 1 even though I have mentioned that there are 11 classes in total?

Edit: Added my label map:

item {
  id: 1
  name: '1'
}

item {
  id: 2
  name: '2'
}

item {
  id: 3
  name: '3'
}

item {
  id: 4
  name: '4'
}

item {
  id: 5
  name: '5'
}

item {
  id: 6
  name: '6'
}

item {
  id: 7
  name: '7'
}

item {
  id: 8
  name: '8'
}

item {
  id: 9
  name: '9'
}

item {
  id: 10
  name: '10'
}

item {
  id: 11
  name: '11'
}

Solution

So I think I figured out what is going on. I did some analysis on the dataset and found out that it is skewed towards objects of category 1.

This is the frequency distribution of each category from 1 to 11 (shown with 0-based indexing, so index 0 is category 1):

0 10440
1 304
2 998
3 67
4 412
5 114
6 190
7 311
8 195
9 78
10 75

I guess the model is hitting a local minimum where just labelling everything as category 1 is good enough.
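To put a number on the skew, the counts above can be turned into per-class shares. The sketch below uses only the figures from the table. The inverse-frequency weights are purely illustrative of how one might rebalance the classes; they were not part of the training described here.

```python
# Per-class box counts copied from the table above (0-based class index).
counts = [10440, 304, 998, 67, 412, 114, 190, 311, 195, 78, 75]

total = sum(counts)
shares = [c / total for c in counts]

# Share of boxes that "always predict index 0" would label correctly.
majority_share = max(shares)

# Illustrative inverse-frequency weights (an assumption, not the original
# training setup), normalised so that the mean weight is 1.
raw = [total / c for c in counts]
mean_raw = sum(raw) / len(raw)
weights = [r / mean_raw for r in raw]

for idx, (c, s, w) in enumerate(zip(counts, shares, weights)):
    print(f"class {idx:2d}: {c:6d} boxes, {s:6.1%} of total, weight {w:.2f}")
print(f"majority-class share: {majority_share:.1%}")
```

With roughly 79% of all boxes in a single class, a classifier that always answers category 1 is already right most of the time, which fits the behaviour seen in the question.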

About the problem of not detecting some boxes : I tried training again, but this time I didn't differentiate between brands. Instead, I tried to teach the model what a cigarette box is. It still wasn't detecting all the boxes.

Then I decided to crop the input image and feed the crop to the model, just to see whether the results would improve, and they did!

It turns out that the dimensions of the input image were much larger than the 600 x 1024 that is accepted by the model. So, it was scaling down these images to 600 x 1024 which meant that the cigarette boxes were losing their details :)
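A rough sketch of the effect, assuming the model's resizer keeps the aspect ratio, brings the shorter side to 600 pixels, and caps the longer side at 1024 pixels (one reading of "600 x 1024"). The image and box sizes are hypothetical, since the post does not state them, and `keep_aspect_ratio_scale` is an illustrative helper, not an API function.

```python
def keep_aspect_ratio_scale(height, width, min_dim=600, max_dim=1024):
    """Scale factor that brings the shorter side to min_dim, unless that
    would push the longer side past max_dim."""
    short_side, long_side = min(height, width), max(height, width)
    scale = min_dim / short_side
    if long_side * scale > max_dim:
        scale = max_dim / long_side
    return scale


# Hypothetical sizes: a large shelf photo and one cigarette box in it.
image_h, image_w = 3000, 4000
box_h, box_w = 200, 120

scale = keep_aspect_ratio_scale(image_h, image_w)
print(f"full image scale: {scale:.3f}")
print(f"box after resize: {box_h * scale:.0f} x {box_w * scale:.0f} px")

# Cropping to one quarter halves both sides, so the box keeps more pixels.
crop_scale = keep_aspect_ratio_scale(image_h // 2, image_w // 2)
print(f"quarter crop scale: {crop_scale:.3f}")
print(f"box after resize: {box_h * crop_scale:.0f} x {box_w * crop_scale:.0f} px")
```

Under these assumed sizes the box is represented at twice the linear resolution after cropping, which is consistent with the improvement described.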

So I went back to the original model, the one trained on all classes, and tested it on cropped images. It works like a charm :)

This was the output of the model on the original image

This is the output of the model when I crop out the top left quarter and provide it as input.
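The quarter-crop idea can be extended to the whole image by running the detector on each quarter and shifting the resulting boxes back into full-image coordinates. This is only a sketch of the bookkeeping: the helper names, the image size, and the example detection are hypothetical, and the detector call itself is left out.

```python
def quarter_tiles(height, width):
    """Return (top, left, bottom, right) pixel windows for the four quarters."""
    mid_y, mid_x = height // 2, width // 2
    return [
        (0, 0, mid_y, mid_x),           # top left
        (0, mid_x, mid_y, width),       # top right
        (mid_y, 0, height, mid_x),      # bottom left
        (mid_y, mid_x, height, width),  # bottom right
    ]


def to_full_image(box, tile):
    """Shift a (ymin, xmin, ymax, xmax) box from tile pixels to image pixels."""
    top, left, _, _ = tile
    ymin, xmin, ymax, xmax = box
    return (ymin + top, xmin + left, ymax + top, xmax + left)


tiles = quarter_tiles(3000, 4000)

# Hypothetical detection, in pixels, found inside the bottom-right tile.
box_in_tile = (100, 50, 300, 170)
full_box = to_full_image(box_in_tile, tiles[3])
print(full_box)
```

Boxes that straddle a tile border get cut in half with this simple scheme, so overlapping tiles would likely be needed in practice.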

Thanks to everyone who helped! And congrats to the TensorFlow team for an amazing job on the API :) Now everybody can train object detection models!
