Is it possible to use arbitrary image sizes in caffe?


Problem description

I know that caffe has the so-called spatial pyramid pooling layer, which enables networks to use arbitrary image sizes. The problem I have is that the network seems to refuse to use arbitrary image sizes within a single batch. Am I missing something, or is this a real limitation?

My train_val.prototxt:

name: "digits"
layer {
  name: "input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "/Users/rvaldez/Documents/Datasets/Digits/SeperatedProviderV3_1020_batchnormalizedV2AndSPP/1/caffe/train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "/Users/rvaldez/Documents/Datasets/Digits/SeperatedProviderV3_1020_batchnormalizedV2AndSPP/1/caffe/test_lmdb"
    batch_size: 10
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: false
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  include {
    phase: TRAIN
  }
}
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: true
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  include {
    phase: TEST
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "bn1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "spatial_pyramid_pooling"
  type: "SPP"
  bottom: "conv2"
  top: "pool2"
  spp_param {
    # a 2-level pyramid pools conv2 into a fixed number of bins,
    # so pool2 has the same length regardless of input image size
    pyramid_height: 2
  }
}
layer {
  name: "bn2"
  type: "BatchNorm"
  bottom: "pool2"
  top: "bn2"
  batch_norm_param {
    use_global_stats: false
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  include {
    phase: TRAIN
  }
}
layer {
  name: "bn2"
  type: "BatchNorm"
  bottom: "pool2"
  top: "bn2"
  batch_norm_param {
    use_global_stats: true
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  include {
    phase: TEST
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "bn2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}



Link to another question regarding a subsequent problem.

Answer

You are mixing several concepts here.

Can a net accept arbitrary input shapes?
Well, not all nets can work with any input shape. In many cases a net is restricted to the input shape for which it was trained.
In most cases, when fully-connected layers ("InnerProduct") are used, these layers expect an exact input dimension; changing the input shape therefore "breaks" these layers and restricts the net to a specific, pre-defined input shape.
On the other hand, "fully convolutional nets" are more flexible with regard to input shape and can usually process any input shape.
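
One way to see this last point (a sketch only, not part of the question's net; the names "fc_as_conv" and "features" are made up): a fully-connected layer can be re-expressed as a convolution, so that no input dimension is hard-coded. A 1x1 convolution, for instance, acts like a per-location fully-connected layer and accepts any spatial size:

layer {
  name: "fc_as_conv"          # hypothetical layer, for illustration only
  type: "Convolution"
  bottom: "features"          # hypothetical bottom blob of any H x W
  top: "fc_as_conv"
  convolution_param {
    num_output: 500           # same number of outputs an InnerProduct would have
    kernel_size: 1            # 1x1 kernel: no dependence on the input's H x W
    stride: 1
    weight_filler {
      type: "xavier"
    }
  }
}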

Can one change the input shape during batch training?
Even if your net architecture allows arbitrary input shapes, you cannot use whatever shape you want during batch training, because the input shapes of all samples in a single batch must be the same: how would you concatenate a 27x27 image with another of shape 17x17?

It seems that the error you are getting comes from the "Data" layer, which is struggling to concatenate samples of different shapes into a single batch.

You can resolve this issue by setting batch_size: 1 to process one sample at a time, and setting iter_size: 32 in your solver.prototxt to average the gradients over 32 samples, giving the same SGD effect as batch_size: 32.
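
For example (a minimal sketch; everything other than batch_size and iter_size is a placeholder, not from the original answer): change batch_size: 64 to batch_size: 1 in the "Data" layers above, and add iter_size to the solver:

# solver.prototxt (sketch)
net: "train_val.prototxt"   # placeholder path to the net definition
iter_size: 32               # accumulate and average gradients over 32
                            # single-sample passes before each update
base_lr: 0.01               # remaining settings are placeholders
momentum: 0.9
max_iter: 10000
solver_mode: GPU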
