Artificial Neural Network ReLU Activation Function and Gradients

Problem Description

I have a question. I watched a really detailed tutorial on implementing an artificial neural network in C++, and now I have more than a basic understanding of how a neural network works and how to actually program and train one.

In the tutorial, a hyperbolic tangent was used for calculating outputs, and its derivative for calculating gradients. However, I wanted to move on to a different function, specifically Leaky ReLU (to avoid dying neurons).
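For reference, here is a minimal C++ sketch of what a Leaky ReLU and its derivative might look like (the 0.01 slope for negative inputs is a common convention, not something taken from the tutorial):

```cpp
#include <iostream>

// Leaky ReLU: identity for positive inputs, small slope for negative inputs.
// The 0.01 slope is a conventional default; it is what keeps neurons from "dying".
double leakyRelu(double x) {
    return x > 0.0 ? x : 0.01 * x;
}

// Derivative used during backpropagation: 1 for positive inputs, 0.01 otherwise.
double leakyReluDerivative(double x) {
    return x > 0.0 ? 1.0 : 0.01;
}

int main() {
    std::cout << leakyRelu(2.0) << " " << leakyRelu(-2.0) << "\n";                     // 2 -0.02
    std::cout << leakyReluDerivative(2.0) << " " << leakyReluDerivative(-2.0) << "\n"; // 1 0.01
}
```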

My question is: it specifies that this activation function should be used for the hidden layers only, and that a different function should be used for the output layer (either a softmax or a linear function for regression). In the tutorial, the guy trained the neural network to be an XOR processor. So is this a classification problem or a regression problem?

I tried to Google the difference between the two, but I can't quite grasp the category for the XOR processor. Is it a classification or a regression problem? So I implemented the Leaky ReLU function and its derivative, but I don't know whether I should use a softmax or a regression function for the output layer.

Also, for recalculating the output gradients I use the Leaky ReLU's derivative (for now), but in this case should I use the softmax/regression derivative as well?

Thanks.

Answer

I tried to Google the difference between the two, but I can't quite grasp the category for the XOR processor. Is it a classification or a regression problem?

In short, classification is for discrete targets, regression is for continuous targets. If it were a floating-point operation, you would have a regression problem. But here the result of XOR is 0 or 1, so it's a binary classification problem (as Sid already suggested). You should use a softmax layer (or a sigmoid function, which works particularly well for 2 classes). Note that the output will be a vector of probabilities, i.e. real-valued, which is used to choose the discrete target class.
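As an illustration (this code is not from the original answer), a sigmoid output unit for the binary case might look like this in C++:

```cpp
#include <cmath>
#include <iostream>

// Sigmoid squashes a pre-activation into (0, 1); for 2 classes a single
// sigmoid output is equivalent to a 2-way softmax.
double sigmoid(double z) {
    return 1.0 / (1.0 + std::exp(-z));
}

int main() {
    double z = 0.7;                   // hypothetical pre-activation of the output neuron
    double p = sigmoid(z);            // interpreted as P(class = 1)
    int predicted = p > 0.5 ? 1 : 0;  // threshold at 0.5 to pick the discrete class
    std::cout << "P(1) = " << p << ", predicted class = " << predicted << "\n";
}
```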

Also, for recalculating the output gradients I use the Leaky ReLU's derivative (for now), but in this case should I use the softmax/regression derivative as well?

Correct. For the output layer you'll need a cross-entropy loss function, which corresponds to the softmax layer, and its derivative for the backward pass. If there are hidden layers that still use Leaky ReLU, you'll also need Leaky ReLU's derivative for those particular layers.
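A standard result worth knowing here (not spelled out in the answer above): when softmax is combined with a cross-entropy loss, the gradient of the loss with respect to the output pre-activations simplifies to p - y, where p is the softmax output and y is the one-hot target. A minimal sketch, assuming a one-hot target vector:

```cpp
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

// Numerically stable softmax: subtract the max before exponentiating.
std::vector<double> softmax(const std::vector<double>& z) {
    double maxZ = *std::max_element(z.begin(), z.end());
    std::vector<double> p(z.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < z.size(); ++i) { p[i] = std::exp(z[i] - maxZ); sum += p[i]; }
    for (double& v : p) v /= sum;
    return p;
}

int main() {
    std::vector<double> z = {1.2, -0.4};  // hypothetical output-layer pre-activations
    std::vector<double> y = {1.0, 0.0};   // one-hot target, e.g. XOR output 0 -> class 0

    std::vector<double> p = softmax(z);

    // Cross-entropy loss: -sum_i y_i * log(p_i)
    double loss = 0.0;
    for (std::size_t i = 0; i < y.size(); ++i) loss -= y[i] * std::log(p[i]);

    // Gradient w.r.t. the pre-activations simplifies to p - y.
    std::cout << "loss = " << loss << ", gradient:";
    for (std::size_t i = 0; i < p.size(); ++i) std::cout << " " << (p[i] - y[i]);
    std::cout << "\n";
}
```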

A post covering the details of backpropagation is highly recommended.
