h2o deep learning weights and normalization
Question
I'm exploring h2o via the R interface and I'm getting a weird weight matrix. My task is as simple as it gets: given x and y, compute x + y.
I have 214 rows with 3 columns. The first column (x) was drawn uniformly from (-1000, 1000) and the second one (y) from (-100, 100). I just want to combine them, so I use a single hidden layer with a single neuron.
This is my code:
library(h2o)
localH2O = h2o.init(ip = "localhost", port = 54321, startH2O = TRUE)
train <- h2o.importFile(path = "/home/martin/projects/R NN Addition/addition.csv")
model <- h2o.deeplearning(1:2,3,train, hidden = c(1), epochs=200, export_weights_and_biases=T, nfolds=5)
print(h2o.weights(model,1))
print(h2o.weights(model,2))
The result is:
> print(h2o.weights(model,1))
x y
1 0.5586579 0.05518193
[1 row x 2 columns]
> print(h2o.weights(model,2))
C1
1 1.802469
For some reason the weight value for y is 0.055, roughly 10 times lower than the weight for x. So, on the face of it, the neural net would compute x + y/10. However, h2o.predict actually returns the correct values (even on a test set).
I'm guessing there's a preprocessing step that's somehow scaling my data. Is there any way I can reproduce the actual weights produced by the model? I would like to be able to visualize some pretty simple neural networks.
Answer
Neural networks perform best if all the input features have mean 0 and standard deviation 1. If the features have very different standard deviations, neural networks perform very poorly. Because of that, h2o does this normalization for you. In other words, before even training your net it computes the mean and standard deviation of every feature you have, and replaces the original values with (x - mean) / stddev. In your case the stddev of the second feature is 10x smaller than that of the first, so after normalization its values end up being 10x more important in terms of how much they contribute to the sum, and the weight heading into the hidden neuron needs to cancel that out. That's why the weight for the second feature is 10x smaller.
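You can check this arithmetic yourself. The sketch below (plain Python, using made-up numbers and the theoretical stddev of a uniform distribution rather than h2o's internal estimates) shows that dividing each learned weight by its feature's stddev recovers weights that treat x and y almost equally on the raw scale, which is exactly what x + y requires:

```python
# Toy data mimicking the question: x ~ U(-1000, 1000), y ~ U(-100, 100).
# The stddev of U(-a, a) is a / sqrt(3), so y's stddev is 10x smaller than x's.
x_sd = 1000 / 3 ** 0.5
y_sd = 100 / 3 ** 0.5

# Weights the net learned on the *standardized* inputs
# (taken from the weight matrix printed in the question):
w_x_std = 0.5586579
w_y_std = 0.05518193

# Undo the standardization: the effective weight on a raw input
# is the standardized weight divided by that feature's stddev.
w_x_raw = w_x_std / x_sd
w_y_raw = w_y_std / y_sd

# On the raw scale the two features contribute almost equally.
print(w_x_raw / w_y_raw)  # close to 1
```

This is also why the 10x gap in the printed weights exactly mirrors the 10x gap in the two features' standard deviations.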