Parameter Tuning for Perceptron Learning Algorithm

Problem Description

I'm having some trouble figuring out how to tune the parameters of my perceptron algorithm so that it performs reasonably well on unseen data.

I've implemented a verified, working perceptron algorithm, and I'd like to figure out a method for tuning the number of iterations and the learning rate of the perceptron. These are the two parameters I'm interested in.

I know that the learning rate of the perceptron doesn't affect whether or not the algorithm converges and completes. I'm trying to grasp how to change n. Too high and it'll swing around a lot; too low and it'll take longer to converge.

As for the number of iterations, I'm not entirely sure how to determine an ideal number.

In any case, any help would be appreciated. Thanks.

Answer

Start with a small number of iterations (it's actually more conventional to count 'epochs' rather than iterations; an 'epoch' is one pass through the entire data set used to train the network). By 'small', let's say something like 50 epochs. The reason for this is that you want to see how the total error changes with each additional training cycle (epoch); hopefully it's going down (more on 'total error' below).
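
For concreteness, here is a minimal sketch of a perceptron trained for a fixed number of epochs. The names (X, y, train_perceptron) and the NumPy setup are illustrative assumptions, not part of the original question:

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=50):
    """Minimal perceptron sketch. X is an (n_samples, n_features)
    array, y holds labels in {-1, +1}. One epoch is one full pass
    over the training set."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            prediction = 1 if np.dot(w, xi) + b >= 0 else -1
            delta = target - prediction   # 0 if correct, +/-2 if misclassified
            w += lr * delta * xi          # standard perceptron update
            b += lr * delta
    return w, b
```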

Obviously you are interested in the point (the number of epochs) where the next additional epoch does not cause a further decrease in total error. So begin with a small number of epochs, so that you can approach that point gradually by increasing the epoch count.

The learning rate you begin with should not be too fine or too coarse (this is obviously subjective, but hopefully you have a rough sense of what counts as a large versus a small learning rate).

Next, insert a few lines of testing code into your perceptron; really just a few well-placed 'print' statements. For each iteration, calculate and show the delta (the actual value for each data point in the training data minus the predicted value), then sum the individual delta values over all points (data rows) in the training data. (I usually take the absolute value of each delta, or you can take the square root of the sum of squared differences; it doesn't matter too much.) Call that summed value the "total error". Just to be clear, this is the total error (the sum of the error across all data points) per epoch.
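
As a sketch, that instrumentation might look like the following: a variant of the loop above that accumulates the sum of absolute deltas over each epoch and prints it (errors_per_epoch is an assumed name for the collected trace):

```python
def train_with_error_trace(X, y, lr=0.1, epochs=50):
    """Same loop as the sketch above, but records the 'total error'
    (sum of absolute deltas over all data rows) for every epoch."""
    w = np.zeros(X.shape[1])
    b = 0.0
    errors_per_epoch = []
    for epoch in range(epochs):
        total_error = 0.0
        for xi, target in zip(X, y):
            prediction = 1 if np.dot(w, xi) + b >= 0 else -1
            delta = target - prediction
            total_error += abs(delta)   # or accumulate delta**2 and take sqrt
            w += lr * delta * xi
            b += lr * delta
        errors_per_epoch.append(total_error)
        print(f"epoch {epoch + 1}: total error = {total_error}")
    return w, b, errors_per_epoch
```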

Then, plot the total error as a function of epoch number (i.e., epoch number on the x axis, total error on the y axis). Initially, of course, you'll see the data points in the upper left-hand corner trending down and to the right, with a decreasing slope.
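
A quick way to produce that plot with matplotlib, assuming the X, y data and the train_with_error_trace sketch from above:

```python
import matplotlib.pyplot as plt

_, _, errors_per_epoch = train_with_error_trace(X, y, lr=0.1, epochs=50)

plt.plot(range(1, len(errors_per_epoch) + 1), errors_per_epoch)
plt.xlabel("epoch number")
plt.ylabel("total error")
plt.show()
```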

Let the algorithm train the network against the training data. Increase the epochs (by, e.g., 10 per run) until you see the curve (total error versus epoch number) flatten, i.e., until additional iterations no longer cause a decrease in total error.
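
That stopping rule could be automated roughly as follows; each run retrains from the same zero-initialized starting point, and the exact flattening test is an arbitrary assumption:

```python
# Retrain with 10 more epochs per run until the final-epoch total
# error stops decreasing (the curve has flattened).
epochs = 50
prev_error = float("inf")
while True:
    _, _, trace = train_with_error_trace(X, y, lr=0.1, epochs=epochs)
    if prev_error - trace[-1] <= 0:   # no further decrease in total error
        break
    prev_error = trace[-1]
    epochs += 10
print(f"total error stopped improving around {epochs} epochs")
```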

So the slope of that curve is important, and so is its vertical position; i.e., how much total error you have and whether it continues to trend downward with more training cycles (epochs). If, after increasing epochs, you eventually notice an increase in error, start again with a lower learning rate.

The learning rate (usually a fraction between about 0.01 and 0.2) will certainly affect how quickly the network is trained; i.e., it can move you to the local minimum more quickly. It can also cause you to jump over it. So code a loop that trains a network, say four separate times, using a fixed number of epochs (and the same starting point) each time, but varying the learning rate from, e.g., 0.05 to 0.2, increasing it by 0.05 each time.
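
A sketch of that sweep, holding the epoch budget fixed (zero-initialized weights give the same starting point on every run):

```python
# Learning rate swept from 0.05 to 0.2 in steps of 0.05, with a
# fixed number of epochs per run.
for lr in (0.05, 0.10, 0.15, 0.20):
    _, _, trace = train_with_error_trace(X, y, lr=lr, epochs=50)
    print(f"lr={lr:.2f}: final total error = {trace[-1]}")
```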

One more parameter is important here (though not strictly necessary): 'momentum'. As the name suggests, using a momentum term will help you get an adequately trained network more quickly. In essence, momentum is a multiplier on the learning rate: as long as the error rate is decreasing, the momentum term accelerates the progress. The intuition behind the momentum term is 'as long as you are traveling toward the destination, increase your velocity'. Typical values for the momentum term are 0.1 or 0.2. In the training scheme above, you should probably hold momentum constant while varying the learning rate.
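
A sketch of how a momentum term could be bolted onto the update rule above; this is one common formulation, and the variable names are assumptions:

```python
def train_with_momentum(X, y, lr=0.1, momentum=0.2, epochs=50):
    """Perceptron update with a momentum term: each step reuses a
    fraction of the previous step, accelerating progress while the
    error keeps moving in the same direction."""
    w = np.zeros(X.shape[1])
    b = 0.0
    vw = np.zeros_like(w)   # velocity for the weights
    vb = 0.0                # velocity for the bias
    for _ in range(epochs):
        for xi, target in zip(X, y):
            prediction = 1 if np.dot(w, xi) + b >= 0 else -1
            delta = target - prediction
            vw = momentum * vw + lr * delta * xi
            vb = momentum * vb + lr * delta
            w += vw
            b += vb
    return w, b
```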
