How important is the loss difference between training and validation data at the beginning when training a neural network?


Problem description

Short question: Is the difference between validation and training loss at the beginning of training (the first epochs) a good indicator of how much data should be used? For example, would it be a good method to increase the amount of data until the difference at the beginning is as small as possible? It would save me time and computation.
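As a hedged illustration of the method being asked about (this is not from the original post): the sketch below trains a fresh model on increasing fractions of the data for a few epochs and records the early gap between validation and training loss. The names build_model, x_train, y_train, x_val and y_val are assumed placeholders.

```python
# Minimal sketch (assumed setup, not the poster's code): measure the early
# train/val loss gap for increasing amounts of training data.
# `build_model`, `x_train`, `y_train`, `x_val`, `y_val` are assumed to be
# defined elsewhere.

def early_loss_gap(build_model, x_train, y_train, x_val, y_val,
                   fractions=(0.25, 0.5, 0.75, 1.0), epochs=3):
    gaps = {}
    for frac in fractions:
        n = int(len(x_train) * frac)
        model = build_model()                     # fresh, untrained model
        history = model.fit(x_train[:n], y_train[:n],
                            validation_data=(x_val, y_val),
                            epochs=epochs, verbose=0)
        # gap after the first few epochs
        gaps[frac] = (history.history['val_loss'][-1]
                      - history.history['loss'][-1])
    return gaps
```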

Background: I am working on a neural network that overfits very fast. The best result after applying many different techniques (dropout, batch normalization, reducing the learning rate, reducing the batch size, increasing the variety of the data, reducing the number of layers, increasing the filter sizes, ...) is still very bad. While the training loss decreases nicely, the validation loss starts overfitting too early (by "too early" I mean the desired loss is not reached; it should be many times lower). Since training on my dataset of ~200 samples takes 24 hours for 50 epochs, I was hoping to fight the overfitting with the methods described above before increasing the amount of data. Because nothing has helped, I am now at the point of increasing the amount of data, and I am wondering how much data would be enough for my network to stop overfitting. I know this is not easy to answer because it depends on the complexity of the data and the task I am trying to solve, so I try to generalize my question to the short question above.
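For context, the combination of techniques listed above would typically look something like the following Keras sketch. This is illustrative only; the layer sizes, input shape and optimizer settings are made-up placeholders, not the poster's actual architecture.

```python
# Illustrative only: a small convolutional model combining several of the
# techniques mentioned in the question (dropout, batch normalization,
# reduced learning rate, larger filter size). All sizes are placeholders.
from tensorflow import keras
from tensorflow.keras import layers

def build_model(input_shape=(128, 128, 3), num_classes=2):
    model = keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, 5, padding='same', activation='relu'),  # larger filters
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Dropout(0.3),
        layers.Conv2D(32, 5, padding='same', activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Dropout(0.3),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation='softmax'),
    ])
    # reduced learning rate, as mentioned in the question
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```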

Recommended answer

Short answer to the short question: no.

Explanation: There is a correlation between (train_loss - val_loss) and the amount of data you need to train your model, but there are plenty of other factors that could be the source of a big (train_loss - val_loss). For example, your network architecture may be too large for the amount of data, so your model quickly overfits. Or your validation set doesn't reflect the training data. Or your learning rate is too big. Or...
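One simple way to start telling these factors apart is to look at the full loss curves rather than a single gap value. A minimal sketch, assuming history is the object returned by a Keras model.fit(...) call:

```python
# Sketch: visualize where training and validation loss diverge.
# `history` is assumed to come from model.fit(..., validation_data=...).
import matplotlib.pyplot as plt

def plot_loss_curves(history):
    plt.plot(history.history['loss'], label='training loss')
    plt.plot(history.history['val_loss'], label='validation loss')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.legend()
    plt.show()
```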

So my recommendation: formulate your problem in another SO question and ask, "What might I be doing wrong?"
