Caffe | solver.prototxt values setting strategy

Question

In Caffe, I am trying to implement a fully convolutional network for semantic segmentation. Is there a specific strategy for setting the following hyper-parameters in 'solver.prototxt'?

  • test_iter
  • test_interval
  • iter_size
  • max_iter

Do they depend on the number of images in your training set? If so, how?

Recommended answer

In order to set these values in a meaningful manner, you need to have a few more bits of information regarding your data:

1. Training set size: the total number of training examples you have. Let's call this quantity T.
2. Training batch size: the number of training examples processed together in a single batch. This is usually set by the input data layer in 'train_val.prototxt'. For example, in this file the train batch size is set to 256. Let's denote this quantity by tb.
3. Validation set size: the total number of examples you set aside for validating your model. Let's denote this by V.
4. Validation batch size: the value set in batch_size for the TEST phase. In this example it is set to 50. Let's call this vb.

Now, during training, you would like to get an unbiased estimate of your net's performance every once in a while. To do so, you run your net on the validation set for test_iter iterations. To cover the entire validation set you need test_iter = V/vb.
How often would you like to get this estimate? It's really up to you. If you have a very large validation set and a slow net, validating too often will make the training process too long. On the other hand, not validating often enough may prevent you from noticing if and when your training process fails to converge. test_interval determines how often you validate: for large nets you usually set test_interval on the order of 5K iterations; for smaller and faster nets you may choose lower values. Again, it's all up to you.
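As a rough illustration of test_iter = V/vb, here is a minimal Python sketch. The function name and the example sizes are hypothetical, and rounding up when V is not a multiple of vb is an assumption on my part so that no validation example is skipped (the last batch may then revisit a few examples).

```python
import math


def compute_test_iter(val_size: int, val_batch: int) -> int:
    """Number of TEST-phase batches needed to see every validation example.

    val_size  -- V, the total number of validation examples
    val_batch -- vb, the batch_size of the TEST-phase data layer
    """
    if val_size <= 0 or val_batch <= 0:
        raise ValueError("val_size and val_batch must be positive")
    # Round up so a partial final batch is still counted.
    return math.ceil(val_size / val_batch)


if __name__ == "__main__":
    # Hypothetical numbers: 5000 validation images, vb = 50 as in the answer.
    print("test_iter:", compute_test_iter(5000, 50))
```

The resulting number is what you would write as test_iter in 'solver.prototxt'; test_interval remains a judgment call, as described above.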

In order to cover the entire training set (completing one "epoch") you need to run T/tb iterations. Usually one trains for several epochs, thus max_iter = #epochs * T/tb.
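The max_iter arithmetic can be sketched in a few lines of Python. The function name and the sample numbers below are made up for illustration, and rounding the iterations per epoch up to a whole number is my assumption, not something the formula above prescribes.

```python
import math


def compute_max_iter(train_size: int, train_batch: int, epochs: int) -> int:
    """max_iter = #epochs * T / tb, with T/tb rounded up to whole iterations.

    train_size  -- T, the total number of training examples
    train_batch -- tb, the batch_size of the TRAIN-phase data layer
    epochs      -- how many passes over the training set you want
    """
    if train_size <= 0 or train_batch <= 0 or epochs <= 0:
        raise ValueError("all arguments must be positive")
    iters_per_epoch = math.ceil(train_size / train_batch)
    return epochs * iters_per_epoch


if __name__ == "__main__":
    # Hypothetical numbers: 10000 training images, tb = 256, 30 epochs.
    print("max_iter:", compute_max_iter(10000, 256, 30))
```

With these example values one epoch takes 40 iterations, so 30 epochs give max_iter = 1200.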

Regarding iter_size: this allows you to average gradients over several training mini-batches; see this thread for more information.
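To make the effect of iter_size concrete, the sketch below assumes that each solver iteration processes iter_size mini-batches of tb examples before a single weight update, so the effective batch size is tb * iter_size. Under that assumption, the number of iterations per epoch shrinks accordingly. The function names and numbers are hypothetical.

```python
import math


def effective_batch_size(train_batch: int, iter_size: int) -> int:
    """Examples contributing to one weight update (assumed tb * iter_size)."""
    return train_batch * iter_size


def iters_per_epoch(train_size: int, train_batch: int, iter_size: int = 1) -> int:
    """Solver iterations needed to see T examples once, rounded up."""
    return math.ceil(train_size / effective_batch_size(train_batch, iter_size))


if __name__ == "__main__":
    # Hypothetical case: memory only allows tb = 4, but you want updates
    # that behave like a batch of 32, so iter_size = 8.
    print("effective batch:", effective_batch_size(4, 8))
    print("iterations per epoch:", iters_per_epoch(10000, 4, 8))
```

This is mainly useful for memory-hungry nets such as fully convolutional ones, where only a small tb fits on the GPU.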
