How to avoid overfitting (Encog3 C#)?


Question

I am new to neural networks and I'm working with Encog3. I have created a feedforward neural network that can be trained and tested. The problem is that I'm not sure how to prevent overfitting. I know I have to split the data into training, testing and evaluation sets, but I'm not sure where and when to use the evaluation set. Currently, I split all the data into a training set and a testing set (50%, 50%), train the network on one part and test it on the other. Accuracy is 85%. I also tried CrossValidationKFold, but in that case accuracy is only 12% and I don't understand why.

My question is: how can I use the evaluation set to avoid overfitting? I am confused about the evaluation set, and any help would be appreciated.

Recommended answer

It is general practice to split the data 60/20/20 (another common split is 80/10/10). 60 percent is used for training, 20 percent for validation, and the remaining 20 percent for a final check of the model produced with the first two. Why three parts? Because it gives you a better picture of how the model behaves on data it has never seen before: the validation set is consulted while you train and tune, so only the last part is a truly unseen test.

Another part of the analysis is making sure the learning set is representative. If your training set contains values that have no representation in the validation set, you will most probably get mistakes from your model. It is the same way your brain works: if you learn some rules and are then suddenly given a task that is an exception to the rules you know, you will most probably give the wrong answer.

If you have problems with learning, you can do the following: increase the size of the dataset, or increase the number of inputs (via some nonlinear transformations of your existing inputs). You may also need to apply an anomaly detection algorithm, and you can consider trying different normalization techniques.
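As an illustration of the split described above, here is a minimal, library-independent sketch in Python. It is not Encog's API: the function names `split_dataset` and `train_with_early_stopping`, the `patience` parameter, and the two callbacks are assumptions made for this example. It shows one common way to use the middle 20%: after every training epoch, measure the error on the validation set and stop once it has not improved for a while. The final 20% is left untouched until the very end.

```python
import random


def split_dataset(rows, train=0.6, validation=0.2, seed=0):
    """Shuffle rows and cut them into train / validation / test parts.

    The test part receives whatever remains after the first two cuts.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed: reproducible split
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def train_with_early_stopping(train_step, validation_error,
                              max_epochs=1000, patience=20):
    """Run training epochs until the validation error stops improving.

    train_step:       hypothetical callback that runs one epoch on the
                      training set.
    validation_error: hypothetical callback that returns the current
                      error on the validation set.
    Returns (best_epoch, best_error).
    """
    best_error = float("inf")
    best_epoch = -1
    for epoch in range(max_epochs):
        train_step()
        error = validation_error()
        if error < best_error:
            # Validation error improved: remember this epoch.
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: likely overfitting.
            break
    return best_epoch, best_error


if __name__ == "__main__":
    train_set, validation_set, test_set = split_dataset(range(100))
    print(len(train_set), len(validation_set), len(test_set))
```

In a real run you would also save the network weights whenever the validation error improves, restore them after the loop ends, and only then measure accuracy once on the test part. That last number is the one to report, because neither training nor the stopping decision has seen that data.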
