Can Caffe classify pixels of an image directly?

Question

I would like to classify each pixel of an image as "is street" or "is not street". I have some training data from the KITTI dataset, and I have seen that Caffe has an IMAGE_DATA layer type. The labels come as images of the same size as the input image.

Besides Caffe, my first idea for solving this problem was to feed in an image patch around each pixel that should get classified (e.g. 20 pixels to the top / left / right / bottom, resulting in 41×41=1681 features per pixel I want to classify).
However, if I could tell Caffe how to use the labels without having to create those image patches manually (and the layer type IMAGE_DATA seems to suggest that this is possible), I would prefer that.
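
To make the manual patch approach concrete, here is a minimal sketch in plain Python. It is not Caffe code: the helper names `extract_patch` and `patch_features`, the synthetic image, and the choice to fill out-of-image pixels by repeating the nearest edge pixel are all assumptions made for illustration.

```python
def extract_patch(image, row, col, half=20):
    """Return the (2*half+1) x (2*half+1) patch centred on (row, col).

    `image` is a list of rows of grayscale values. Pixels outside the image
    are filled by repeating the nearest edge pixel (an assumption; zero
    padding or mirroring would work too).
    """
    height, width = len(image), len(image[0])
    patch = []
    for r in range(row - half, row + half + 1):
        rr = min(max(r, 0), height - 1)  # clamp row index to the image
        patch_row = []
        for c in range(col - half, col + half + 1):
            cc = min(max(c, 0), width - 1)  # clamp column index to the image
            patch_row.append(image[rr][cc])
        patch.append(patch_row)
    return patch


def patch_features(image, row, col, half=20):
    """Flatten the patch into one feature vector for the pixel (row, col)."""
    return [v for patch_row in extract_patch(image, row, col, half) for v in patch_row]


# Tiny synthetic 50x60 image whose pixel value encodes its position.
image = [[r * 60 + c for c in range(60)] for r in range(50)]
features = patch_features(image, 0, 0)  # a corner pixel
print(len(features))  # 41 * 41 = 1681
```

With `half=20` every pixel, including the corners, gets a 1681-element feature vector; the matching training label would be the value of the label image at the same (row, col).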

Can Caffe classify pixels of an image directly? What would such a prototxt network definition look like? How do I give Caffe the information about the labels?

I guess the input layer would be something like:

layers {
  name: "data"
  type: IMAGE_DATA
  top: "data"
  top: "label"
  image_data_param {
    source: "path/to/file_list.txt"
    mean_file: "path/to/imagenet_mean.binaryproto"
    batch_size: 4
    crop_size: 41
    mirror: false
    new_height: 256
    new_width: 256
  }
}

However, I am not sure what crop_size means exactly. Is the crop really centered? How does Caffe deal with the corner pixels? What are new_height and new_width good for?

Recommended answer

It seems you can try Fully Convolutional Networks for Semantic Segmentation.

Caffe was cited in this paper: https://github.com/BVLC/caffe/wiki/Publications

There is also this model: https://github.com/BVLC/caffe/wiki/Model-Zoo#fully-convolutional-semantic-segmentation-models-fcn-xs

This presentation may also be helpful: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-pixels.pdf
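
To illustrate the dense-prediction idea behind fully convolutional networks, here is a toy sketch in plain Python. It is not the FCN architecture and not Caffe's API: the function names, the single "brightness" feature channel, and the hand-picked weights are all hypothetical. It only shows that a 1×1 convolution yields one score per class at every pixel, so a label image of the same size can be used directly in a per-pixel softmax loss, with no manual patch extraction.

```python
import math


def pixelwise_scores(features, weights, biases):
    """Apply a 1x1 convolution: features[k][y][x] -> scores[c][y][x]."""
    channels, height, width = len(features), len(features[0]), len(features[0][0])
    return [
        [
            [
                biases[c] + sum(weights[c][k] * features[k][y][x] for k in range(channels))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for c in range(len(weights))
    ]


def predict_labels(scores):
    """Pick the highest-scoring class independently at every pixel."""
    num_classes, height, width = len(scores), len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


def pixelwise_softmax_loss(scores, labels):
    """Mean cross-entropy over all pixels; `labels` is a same-sized label image."""
    num_classes, height, width = len(scores), len(scores[0]), len(scores[0][0])
    total = 0.0
    for y in range(height):
        for x in range(width):
            logits = [scores[c][y][x] for c in range(num_classes)]
            peak = max(logits)  # subtract the max for numerical stability
            log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
            total += log_norm - logits[labels[y][x]]
    return total / (height * width)


# Hypothetical 2x3 image with one feature channel.
features = [[[0.9, 0.8, 0.1], [0.2, 0.7, 0.3]]]
weights = [[-1.0], [1.0]]  # class 0 = "is not street", class 1 = "is street"
biases = [0.5, -0.5]
label_image = [[1, 1, 0], [0, 1, 0]]  # same height and width as the input

scores = pixelwise_scores(features, weights, biases)
print(predict_labels(scores))  # [[1, 1, 0], [0, 1, 0]]
loss = pixelwise_softmax_loss(scores, label_image)
```

In a real network the features would come from a stack of learned convolutional layers and the loss would be minimised by backpropagation; the sketch only shows how per-pixel labels enter the computation.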
