"BatchNorm"应如何设置?层可用于咖啡中吗? [英] How should "BatchNorm" layer be used in caffe?


Problem description


I am a little confused about how should I use/insert "BatchNorm" layer in my models.
I see several different approaches, for instance:

"BatchNorm"层后紧跟"Scale"层:

layer {
    bottom: "res2a_branch1"
    top: "res2a_branch1"
    name: "bn2a_branch1"
    type: "BatchNorm"
    batch_norm_param {
        use_global_stats: true
    }
}

layer {
    bottom: "res2a_branch1"
    top: "res2a_branch1"
    name: "scale2a_branch1"
    type: "Scale"
    scale_param {
        bias_term: true
    }
}


cifar10 example: only "BatchNorm"

In the cifar10 example provided with caffe, "BatchNorm" is used without any "Scale" following it:

layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
}


cifar10 Different batch_norm_param for TRAIN and TEST

batch_norm_param: use_global_stats is changed between the TRAIN and TEST phases:

layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: false
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  include {
    phase: TRAIN
  }
}
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: true
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  include {
    phase: TEST
  }
}

So which one should it be?

How should the "BatchNorm" layer be used in caffe?

Accepted answer


If you follow the original paper, batch normalization should be followed by Scale and Bias layers (the bias can be included via the Scale layer, although this makes the Bias parameters inaccessible). use_global_stats should also be changed from training (false) to testing/deployment (true), which is the default behavior. Note that the first example you give is a prototxt for deployment, so it is correct for it to be set to true.
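Putting this together, a minimal sketch of a phase-agnostic prototxt for the recommended pattern (layer and blob names such as "conv1" are placeholders, not taken from the question):

layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  # use_global_stats is left unset: Caffe then defaults it to false in
  # the TRAIN phase and true in the TEST phase, so a single definition
  # serves both training and deployment.
}
layer {
  name: "scale1"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  scale_param {
    bias_term: true  # learn both the scale (gamma) and the bias (beta)
  }
}

With bias_term: true, the single Scale layer learns both gamma and beta from the paper, at the cost of the bias not being exposed as a separate "Bias" layer.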

I am not sure about the shared parameters.


I made a pull request to improve the documentation on batch normalization, but then closed it because I wanted to modify it, and I never got back to it.


Note that I think lr_mult: 0 for "BatchNorm" is no longer required (perhaps not allowed?), although I'm not finding the corresponding PR now.
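If lr_mult: 0 is indeed no longer required (an assumption based on the note above, not verified against a specific PR), the cifar10 layer could shrink to just:

layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  # No param { lr_mult: 0 } blocks: the mean, variance, and bias-correction
  # blobs hold running statistics rather than learned weights, so the layer
  # is expected to freeze them itself.
}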
