High training error at the beginning of training a convolutional neural network


Problem description

I'm training a CNN, and during the training process, especially at the beginning, I get an extremely high training error. After that, the error starts to go down slowly. After approximately 500 epochs the training error comes close to zero (e.g. 0.006604). I then took the final model and measured its accuracy on the test data, and got about 89.50%. Does that seem normal, i.e. getting a high training error rate at the very beginning of the training process? Another thing I'd like to mention: I've noticed that every time I decrease the number of hidden nodes, the results at the end of training get better.

My CNN structure is:

 config.forward_pass_scheme = {'conv_v', 'pool', 'conv_v', 'pool', 'conv_v', 'pool', 'conv_v','full', 'full', 'full', 'out'};

Here are some of my hyperparameters:

  config.learning_rate = 0.01;
  config.weight_range = 2;
  config.decay = 0.0005;
  config.normalize_init_weights = 1;
  config.dropout_full_layer = 1;
  config.optimization = 'adagrad';

Your help and suggestions in this regard are highly appreciated; thank you in advance.

Answer

If you have a large number of hidden units in the fully connected (fc) layers and do not have sufficient training data, the network will overfit the training set. Convolutional layers are less prone to overfitting because they have fewer parameters. Reducing the number of hidden units in the fc layers can therefore reduce overfitting. To tune such hyperparameters (like the number of hidden nodes in an fc layer), use a validation set, so that the model generalizes well to the test set. Although dropout helps reduce overfitting in fc layers, it may not be sufficient if you add too many hidden units.
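The validation-set tuning described above can be sketched with a deliberately simple stand-in problem. This is not the asker's framework: polynomial degree plays the role of "number of hidden units" (more capacity fits the training set better but can overfit), and the synthetic `sin` data is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task standing in for a real dataset: y = sin(x) + noise.
x = rng.uniform(-3, 3, 60)
y = np.sin(x) + rng.normal(0, 0.1, 60)

# Hold out a validation split; the model never trains on it.
x_train, y_train = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]

def mse(coeffs, xs, ys):
    """Mean squared error of a fitted polynomial on (xs, ys)."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))

# Polynomial degree is the capacity knob here, analogous to the
# number of hidden units in an fc layer.
results = {}
for degree in [1, 3, 5, 9, 15]:
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train),
                       mse(coeffs, x_val, y_val))

# Pick the capacity with the lowest *validation* error, not training error:
# training error alone always favors the largest model.
best_degree = min(results, key=lambda d: results[d][1])
```

The key point is the last line: training error decreases monotonically with capacity, so the selection must be made on held-out data.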

Yes, it is expected that the training error is high at the beginning. CNNs are trained with stochastic optimization starting from random initial weights, so it takes some time for the parameters to converge.
