Layers and Neurons of a Neural Network

Problem Description

I would like to know a bit more about neural networks. I'm developing a C++ program to build a NN, but I'm stuck on the backpropagation algorithm; sorry for not offering any working code.
I know there are many libraries for creating a NN in many languages, but I prefer to write one myself. The point is that I don't know how many layers and how many neurons are necessary to achieve a particular goal, such as pattern recognition, function approximation, or whatever.
My questions are: if I'd like to recognize some particular patterns, as in image detection, how many layers and how many neurons per layer would be necessary? Let's say my images are all 8x8 pixels; I would naturally start with an input layer of 64 neurons, but I have no idea how many neurons to put in the hidden layers, or in the output layer. Let's say I have to distinguish between cats and dogs, or whatever you like; what should the output layer look like? I can imagine an output layer with a single neuron outputting a value between 0 and 1 via the classic logistic function 1/(1+exp(-x)): when it is near 0 the input was a cat, and when it approaches 1 it was a dog. But... is that correct? What if I add a new pattern such as a fish? And what if the input contains both a dog and a cat (...and a fish)? This makes me think that the logistic function in the output layer is not very suitable for pattern recognition like this, simply because the range of 1/(1+exp(-x)) is (0,1). Do I have to change the activation function, or maybe add more neurons to the output layer? Are there other activation functions better suited to this? Does every neuron in every layer use the same activation function, or does it differ from layer to layer?
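
For concreteness, the logistic activation I'm referring to is just this (a minimal C++ sketch I wrote for illustration, not taken from any library):

```cpp
#include <cmath>
#include <cstdio>

// Logistic (sigmoid) activation: maps any real input into (0, 1).
double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

int main() {
    // Large negative inputs approach 0, large positive inputs approach 1.
    std::printf("sigmoid(-5) = %f\n", sigmoid(-5.0));  // ~0.0067
    std::printf("sigmoid( 0) = %f\n", sigmoid(0.0));   // 0.5
    std::printf("sigmoid( 5) = %f\n", sigmoid(5.0));   // ~0.9933
    return 0;
}
```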

Sorry for all of these questions, but this topic is not very clear to me. I've read a lot around the internet, and I found libraries that are already implemented and hard to read from, and many explanations of what a NN can do, but not how it does it.
I read a lot from https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/ and http://neuralnetworksanddeeplearning.com/chap1.html, and there I understood how to approximate a function (since every neuron in a layer can be thought of as a step function with a particular step determined by its weights and bias) and how the backpropagation algorithm works, but other tutorials and similar resources were more focused on pre-existing libraries. I also read the question Determining the proper amount of Neurons for a Neural Network, but I would also like to cover the activation functions of a NN: which one is best, and for what.

Thanks in advance for your answers!

Recommended Answer

@Frank Puffer has already provided some nice information, but let me add my two cents. First off, much of what you're asking is in the area of hyperparameter optimization. Although there are various "rules of thumb", the reality is that determining the optimal architecture (number/size of layers, connectivity structure, etc.) and other parameters like the learning rate typically requires extensive experimentation. The good news is that the parameterization of these hyperparameters is among the simplest aspects of the implementation of a neural network. So I would recommend focusing on building your software such that the number of layers, size of layers, learning rate, etc., are all easily configurable.
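
As a rough illustration of what I mean by making these things configurable, something like the following could work; the struct and field names are purely illustrative assumptions, not from any existing library:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical configuration struct: everything you will want to experiment
// with (architecture and training hyperparameters) lives in one place.
struct NetworkConfig {
    std::vector<std::size_t> layer_sizes; // e.g. {64, 32, 16, 3}: input, hidden..., output
    double learning_rate   = 0.01;        // SGD step size
    std::size_t batch_size = 32;          // examples per gradient update
    std::size_t epochs     = 100;         // passes over the training set
};

// The network is then built from the config, so trying a different
// architecture is a one-line change at the call site, for example:
// NeuralNetwork net(NetworkConfig{{64, 32, 3}, 0.05, 16, 200});
```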

Now you specifically asked about detecting patterns in an image. It's worth mentioning that using standard multi-layer perceptrons (MLPs) to perform classification on raw image data can be computationally expensive, especially for larger images. It's common to use architectures that are designed to extract useful, spatially local features (i.e., Convolutional Neural Networks, or CNNs).

You could still use standard MLPs for this, but the computational complexity can make it an untenable solution. The sparse connectivity of CNNs, for example, dramatically reduces the number of parameters requiring optimization while simultaneously building a conceptual hierarchy of representations better suited to image classification.
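
To put some back-of-the-envelope numbers on that for your 8x8 (64-pixel) images, compare a fully connected hidden layer against a small convolutional layer; the layer sizes below are illustrative assumptions, not recommendations:

```cpp
#include <cstdio>

int main() {
    // Fully connected: every one of 64 hidden neurons sees all 64 pixels.
    const int inputs = 64, hidden = 64;
    const int fc_params = inputs * hidden + hidden;          // weights + biases

    // Convolutional: each of 8 filters sees only a 3x3 patch, and its weights
    // are shared across the whole image.
    const int filters = 8, kernel = 3;
    const int conv_params = filters * (kernel * kernel + 1);  // weights + bias per filter

    std::printf("fully connected layer: %d parameters\n", fc_params);   // 4160
    std::printf("convolutional layer:   %d parameters\n", conv_params); // 80
    return 0;
}
```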

Regardless, I would recommend implementing backpropagation using stochastic gradient descent for optimization. This is still the approach typically used for training neural nets, CNNs, RNNs, etc.
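
As a minimal sketch of what one stochastic-gradient-descent update looks like for a single sigmoid output neuron with squared-error loss (my own illustrative code; hidden layers apply the same chain-rule idea to propagate the error further back):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One SGD update for a sigmoid output neuron with loss E = 0.5 * (y - t)^2.
struct Neuron {
    std::vector<double> w; // one weight per input
    double b = 0.0;        // bias
};

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

void sgd_step(Neuron& n, const std::vector<double>& x, double target, double lr) {
    // Forward pass.
    double z = n.b;
    for (std::size_t i = 0; i < x.size(); ++i) z += n.w[i] * x[i];
    double y = sigmoid(z);

    // Backward pass: dE/dz = (y - t) * sigma'(z), with sigma'(z) = y * (1 - y).
    double delta = (y - target) * y * (1.0 - y);

    // Gradient descent: move each parameter against its gradient.
    for (std::size_t i = 0; i < x.size(); ++i) n.w[i] -= lr * delta * x[i];
    n.b -= lr * delta;
}
```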

Regarding the number of output neurons, this is one question that does have a simple answer: use "one-hot" encoding. For each class you want to recognize, you have an output neuron. In your example of the dog, cat, and fish classes, you have three neurons. For an input image representing a dog, you would expect a value of 1 for the "dog" neuron, and 0 for all the others. Then, during inference, you can interpret the output as a probability distribution reflecting the confidence of the NN. For example, if you get output dog:0.70, cat:0.25, fish:0.05, then you have a 70% confidence that the image is a dog, and so on.
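
The example numbers above sum to 1; one common way to get that behavior is to apply a softmax to the raw output scores. Here is a sketch of that option (not something your network is required to use):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Softmax: turns arbitrary real-valued scores into a probability distribution
// (non-negative, sums to 1). Subtracting the max score is a standard trick for
// numerical stability and does not change the result.
std::vector<double> softmax(const std::vector<double>& scores) {
    double max_score = *std::max_element(scores.begin(), scores.end());
    std::vector<double> probs(scores.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < scores.size(); ++i) {
        probs[i] = std::exp(scores[i] - max_score);
        sum += probs[i];
    }
    for (double& p : probs) p /= sum;
    return probs;
}

int main() {
    // Raw scores for {dog, cat, fish}; the one-hot training target for "dog" is {1, 0, 0}.
    std::vector<double> probs = softmax({2.0, 1.0, -1.0});
    const char* labels[] = {"dog", "cat", "fish"};
    for (std::size_t i = 0; i < probs.size(); ++i)
        std::printf("%s: %.2f\n", labels[i], probs[i]);  // roughly dog:0.70, cat:0.26, fish:0.04
    return 0;
}
```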

For activation functions, the most recent research I've seen seems to indicate that Rectified Linear Units are generally a good choice since they're easy to differentiate and compute, and they avoid a problem that plagues deeper networks called the "vanishing gradient problem".
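
For illustration, here are ReLU and its derivative next to the sigmoid's; the small magnitude of the sigmoid derivative is what drives the vanishing gradient problem when it is multiplied layer after layer:

```cpp
#include <algorithm>
#include <cmath>

// Rectified Linear Unit and its derivative -- both trivial to compute.
double relu(double x)       { return std::max(0.0, x); }
double relu_deriv(double x) { return x > 0.0 ? 1.0 : 0.0; }

// For comparison, the sigmoid's derivative sigma'(x) = sigma(x) * (1 - sigma(x))
// is at most 0.25 and shrinks toward 0 for large |x|, so gradients shrink as
// they are multiplied through many layers.
double sigmoid(double x)       { return 1.0 / (1.0 + std::exp(-x)); }
double sigmoid_deriv(double x) { double s = sigmoid(x); return s * (1.0 - s); }
```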

Good luck!
