Why do I get good accuracy with the Iris dataset with a single hidden node?

Question

I have a minimal example of a neural network with a back-propagation trainer, which I am testing on the Iris data set. I started off with 7 hidden nodes and it worked well.

I lowered the number of nodes in the hidden layer to 1 (expecting it to fail), but was surprised to see that the accuracy went up.

I set up the same experiment in Azure ML, just to verify that it wasn't my code. Same thing there: 98.3333% accuracy with a single hidden node.

Can anyone explain to me what is happening here?

Recommended answer

First, it has been well established that a variety of classification models yield incredibly good results on Iris (Iris is very predictable); see here, for example.

Secondly, we can observe that there are relatively few features in the Iris dataset. Moreover, if you look at the dataset description you can see that two of the features are very highly correlated with the class outcomes.
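To check the correlation claim yourself, here is a minimal sketch. It assumes scikit-learn and NumPy are installed, and it encodes the class as the integers 0, 1, 2 (the order used by `load_iris`), which is a simplification made only so that a Pearson correlation can be computed. The helper name `class_correlations` is mine, not something from the question.

```python
import numpy as np
from sklearn.datasets import load_iris


def class_correlations():
    """Pearson correlation of each Iris feature with the integer class label."""
    iris = load_iris()
    X, y = iris.data, iris.target  # y is 0, 1 or 2 for the three species
    return {
        name: float(np.corrcoef(X[:, j], y)[0, 1])
        for j, name in enumerate(iris.feature_names)
    }


correlations = class_correlations()
for name, r in correlations.items():
    print(f"{name:20s} {r:+.4f}")
```

The two petal measurements should come out well above the two sepal measurements in absolute correlation with the class.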

These correlation values are linear, single-feature correlations, which suggests that a linear model will most likely give good results. Neural nets are highly nonlinear; as the number of hidden nodes and hidden layers increases, they become more complex and capture increasingly nonlinear combinations of features.

Taken together, these facts, that (a) there are few features to begin with and (b) those features have high linear correlations with class, point to a less complex, linear function as the appropriate predictive model. By using a single hidden node, you are very nearly using a linear model.
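To make the "very nearly linear" point concrete, here is a small pure-Python sketch of a 4-input, 1-hidden-node, 3-output network. The weights are hypothetical and hand-picked for illustration (they are not trained, and they are not the asker's network); the three labelled samples are the first row of each species in the Iris data. Whatever the weights are, the output depends on the input only through the single projection `z`, so the class boundaries are parallel hyperplanes, i.e. thresholds on one linear combination of the features.

```python
import math


def forward(x, w, b, v, c):
    """4 inputs -> 1 logistic hidden node -> 3 softmax outputs."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b    # single linear projection of the inputs
    h = 1.0 / (1.0 + math.exp(-z))                  # the one hidden activation
    scores = [vk * h + ck for vk, ck in zip(v, c)]  # each class score is a straight line in h
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]      # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, w, b, v, c):
    """Index of the most probable class."""
    probs = forward(x, w, b, v, c)
    return probs.index(max(probs))


# Hypothetical, hand-picked parameters (NOT trained): project onto
# petal length + petal width, then split the hidden activation into three bands.
W = [0.0, 0.0, 1.0, 1.0]
B = -6.5
V = [-20.0, 0.0, 20.0]
C = [4.0, 0.0, -13.0]
PARAMS = (W, B, V, C)

SPECIES = ["setosa", "versicolor", "virginica"]

# First sample of each species in the Iris data:
# (sepal length, sepal width, petal length, petal width) in cm.
samples = [
    [5.1, 3.5, 1.4, 0.2],  # setosa
    [7.0, 3.2, 4.7, 1.4],  # versicolor
    [6.3, 3.3, 6.0, 2.5],  # virginica
]
predictions = [predict(x, *PARAMS) for x in samples]
for x, k in zip(samples, predictions):
    print(x, "->", SPECIES[k])

# Two made-up inputs that differ in every feature but share the same projection
# (petal length + petal width = 5.0) get exactly the same output.
x_a = [5.0, 3.0, 4.0, 1.0]
x_b = [6.5, 2.5, 3.5, 1.5]
same_output = forward(x_a, *PARAMS) == forward(x_b, *PARAMS)
print("same output for equal projections:", same_output)
```

A trained single-node network would find its own projection, but it is subject to the same restriction: one linear combination of the features, followed by thresholds.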

Note also that a network with no hidden layer (i.e., just input and output nodes) and a logistic transfer function is equivalent to logistic regression.
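As a rough illustration of that equivalence, the sketch below writes both models out for a single output node. The function names and the numbers are made up for the example. The point is only that the two forward passes are the same formula; the fitted parameters will also coincide only if the network is trained with the same loss that logistic regression uses (log loss). With three classes, as in Iris, the analogous no-hidden-layer network with softmax outputs corresponds to multinomial logistic regression.

```python
import math


def no_hidden_layer_net(x, weights, bias):
    """One output node fed directly by the inputs, with a logistic transfer function."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-activation))


def logistic_regression(x, coef, intercept):
    """Textbook logistic regression: P(y=1 | x) = 1 / (1 + exp(-(intercept + coef . x)))."""
    log_odds = intercept + sum(c * xi for c, xi in zip(coef, x))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Made-up parameters and input, used for both models.
params = [0.4, -1.2, 2.0, 0.7]
offset = -3.0
x = [5.9, 3.0, 4.2, 1.5]

p_net = no_hidden_layer_net(x, params, offset)
p_logreg = logistic_regression(x, params, offset)
print(p_net, p_logreg)
```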
