Caffe | solver.prototxt values setting strategy

Question

In Caffe, I am trying to implement a Fully Convolutional Network for semantic segmentation. I was wondering whether there is a specific strategy for setting the 'solver.prototxt' values of the following hyper-parameters:

  • test_iter
  • test_interval
  • iter_size
  • max_iter

Does it depend on the number of images you have for your training set? If so, how?

Recommended answer

In order to set these values in a meaningful manner, you need to have a few more bits of information regarding your data:

1. Training set size: the total number of training examples you have. Let's call this quantity T.
2. Training batch size: the number of training examples processed together in a single batch. This is usually set by the input data layer in the 'train_val.prototxt'. For example, in this file the train batch size is set to 256. Let's denote this quantity by tb.
3. Validation set size: the total number of examples you set aside for validating your model. Let's denote this by V.
4. Validation batch size: the value set in batch_size for the TEST phase. In this example it is set to 50. Let's call this vb.

Now, during training, you would like to get an unbiased estimate of the performance of your net every once in a while. To do so, you run your net on the validation set for test_iter iterations. To cover the entire validation set you need test_iter = V/vb.
How often would you like to get this estimate? It's really up to you. If you have a very large validation set and a slow net, validating too often will make the training process too long. On the other hand, not validating often enough may prevent you from noticing if and when your training process fails to converge. test_interval determines how often you validate: for large nets you usually set test_interval on the order of 5K iterations; for smaller and faster nets you may choose lower values. Again, it is all up to you.
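The test_iter rule above can be sketched in a few lines of Python. This is only an illustration: the function name `compute_test_iter` and the sample sizes are hypothetical, and rounding up when V is not a multiple of vb is an assumption added here (the answer itself only states test_iter = V/vb).

```python
def compute_test_iter(num_val_examples: int, val_batch_size: int) -> int:
    """Return the number of validation batches needed to see every example once.

    num_val_examples is V and val_batch_size is vb in the answer's notation.
    Rounding up is an assumption of this sketch, not part of the original answer.
    """
    if num_val_examples <= 0 or val_batch_size <= 0:
        raise ValueError("V and vb must both be positive")
    # Integer ceiling division: ceil(V / vb) without floating point.
    return -(-num_val_examples // val_batch_size)


# Hypothetical numbers: 50,000 validation images with a TEST batch_size of 50.
test_iter = compute_test_iter(50000, 50)
print(f"test_iter: {test_iter}")
```

When V is not divisible by vb, rounding up means the last validation batch runs past the end of the set, so a few examples may be counted twice; rounding down would skip a few instead.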

In order to cover the entire training set (completing an "epoch") you need to run T/tb iterations. Usually one trains for several epochs, thus max_iter = #epochs * T/tb.
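As a rough sketch of the max_iter formula, the helper below computes the number of solver iterations for a given number of epochs. The name `compute_max_iter` and the sample numbers are hypothetical. The optional `iter_size` argument reflects the assumption that each solver iteration processes tb * iter_size examples; with the default of 1 it reduces to the formula in the answer, rounded up per epoch.

```python
def compute_max_iter(num_train_examples: int,
                     train_batch_size: int,
                     num_epochs: int,
                     iter_size: int = 1) -> int:
    """Return max_iter = #epochs * ceil(T / (tb * iter_size)).

    T is num_train_examples and tb is train_batch_size in the answer's notation.
    Treating one iteration as tb * iter_size examples is an assumption of this sketch.
    """
    if min(num_train_examples, train_batch_size, num_epochs, iter_size) <= 0:
        raise ValueError("all arguments must be positive")
    examples_per_iteration = train_batch_size * iter_size
    # Ceiling division so the final partial batch of each epoch is still counted.
    iterations_per_epoch = -(-num_train_examples // examples_per_iteration)
    return num_epochs * iterations_per_epoch


# Hypothetical numbers: 10,000 training images, batch size 256, 30 epochs.
max_iter = compute_max_iter(10000, 256, 30)
print(f"max_iter: {max_iter}")
```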

Regarding iter_size: this allows averaging gradients over several training mini-batches; see this thread for more information.
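To illustrate why averaging gradients over several mini-batches is useful, the toy example below (plain Python, not Caffe code) shows that averaging the gradients of iter_size equally sized mini-batches gives the same result as one gradient over a batch of size tb * iter_size. The one-parameter least-squares model, the data, and all function names are hypothetical, and the equivalence assumes a loss that is averaged over the batch.

```python
def mse_gradient(weight, batch):
    """Gradient of mean((weight * x - y)^2) with respect to weight for one batch."""
    return sum(2.0 * (weight * x - y) * x for x, y in batch) / len(batch)


def accumulated_gradient(weight, data, batch_size, iter_size):
    """Average the gradients of iter_size consecutive mini-batches of batch_size."""
    total = 0.0
    for i in range(iter_size):
        batch = data[i * batch_size:(i + 1) * batch_size]
        total += mse_gradient(weight, batch)
    return total / iter_size


# Toy data: 8 points on the line y = 3x + 1.
data = [(float(x), 3.0 * x + 1.0) for x in range(8)]
weight = 0.5

# Four mini-batches of 2 examples, averaged, versus one batch of 8 examples.
small_batches = accumulated_gradient(weight, data, batch_size=2, iter_size=4)
one_large_batch = mse_gradient(weight, data)
print(abs(small_batches - one_large_batch) < 1e-9)
```

In practice this is handy when a batch of the desired size does not fit in GPU memory: a smaller tb combined with a larger iter_size gives the same effective batch size.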
