h2o deep learning weights and normalization


Problem description


I'm exploring h2o via the R interface and I'm getting a weird weight matrix. My task is as simple as they get: given x and y, compute x + y.
I have 214 rows with 3 columns. The first column (x) was drawn uniformly from (-1000, 1000) and the second one (y) from (-100, 100). I just want to combine them, so I have a single hidden layer with a single neuron. This is my code:

library(h2o)
localH2O <- h2o.init(ip = "localhost", port = 54321, startH2O = TRUE)
train <- h2o.importFile(path = "/home/martin/projects/R NN Addition/addition.csv")
# predictors: columns 1:2 (x, y); target: column 3 (x + y)
model <- h2o.deeplearning(1:2, 3, train, hidden = c(1), epochs = 200,
                          export_weights_and_biases = TRUE, nfolds = 5)
print(h2o.weights(model, 1))  # input -> hidden layer weights
print(h2o.weights(model, 2))  # hidden -> output layer weights

The result is:

> print(h2o.weights(model,1))
          x          y
1 0.5586579 0.05518193

[1 row x 2 columns] 
> print(h2o.weights(model,2))
        C1
1 1.802469


For some reason the weight value for y is 0.055, ten times lower than the weight for x. So, in the end, the neural net would compute x + y/10. However, h2o.predict actually returns the correct values (even on a test set).
I'm guessing there's a preprocessing step that somehow scales my data. Is there any way I can reproduce the actual weights produced by the model? I would like to be able to visualize some pretty simple neural networks.

Recommended answer


Neural networks perform best when all the input features have mean 0 and standard deviation 1; if the features have very different standard deviations, the network performs very poorly. Because of that, h2o does this normalization for you. In other words, before even training your net it computes the mean and standard deviation of every feature and replaces the original values with (x - mean) / stddev. In your case the stddev of the second feature is 10x smaller than that of the first: a uniform distribution on (-a, a) has standard deviation a/√3, so roughly 577 for x versus 57.7 for y. After normalization, the values of y are therefore inflated 10x relative to how much they actually contribute to the sum, and the weight heading into the hidden neuron has to cancel that out. That's why the weight for the second feature is 10x smaller.
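
To address the follow-up question about reproducing the actual weights: since the standardization is just (x - mean) / stddev per feature, dividing each learned input weight by the standard deviation of its feature maps it back to the original input scale. Here is a minimal sketch in R; the seed and the synthetic data construction are hypothetical stand-ins for the question's CSV, not from the original post:

library(h2o)
h2o.init()

# Hypothetical reconstruction of the question's data set (214 rows).
set.seed(1)
x <- runif(214, -1000, 1000)
y <- runif(214, -100, 100)
train <- as.h2o(data.frame(x = x, y = y, z = x + y))

model <- h2o.deeplearning(1:2, 3, train, hidden = c(1), epochs = 200,
                          export_weights_and_biases = TRUE)

# h2o trained on standardized inputs (x - mean) / stddev, so dividing
# each learned weight by its feature's stddev undoes the scaling.
w <- as.matrix(h2o.weights(model, 1))
w_rescaled <- w / c(sd(x), sd(y))
print(w_rescaled)  # the two rescaled weights should now be roughly equal

You can sanity-check this against the weights printed in the question: 0.5586579 / 577.35 ≈ 0.00097 and 0.05518193 / 57.735 ≈ 0.00096, i.e. on the original scale x and y are weighted almost identically, as expected for computing x + y. (The rescaled weights are not exactly the coefficients of x + y, because the output layer and the standardized response apply their own scaling.)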

