NaN loss when training regression network

Problem description

I have a data matrix in "one-hot encoding" (all ones and zeros) with 260,000 rows and 35 columns. I am using Keras to train a simple neural network to predict a continuous variable. The code to make the network is the following:

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.optimizers import SGD, RMSprop
from keras.callbacks import EarlyStopping

model = Sequential()
model.add(Dense(1024, input_shape=(n_train,)))
model.add(Activation('relu'))
model.add(Dropout(0.1))

model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.1))

model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.1))
model.add(Dense(1))

sgd = SGD(lr=0.01, nesterov=True)
#rms = RMSprop()
#model.compile(loss='categorical_crossentropy', optimizer=rms, metrics=['accuracy'])
model.compile(loss='mean_absolute_error', optimizer=sgd)
model.fit(X_train, Y_train, batch_size=32, nb_epoch=3, verbose=1,
          validation_data=(X_test, Y_test),
          callbacks=[EarlyStopping(monitor='val_loss', patience=4)])

However, during the training process, I see the loss decrease nicely, but during the middle of the second epoch, it goes to nan:

Train on 260000 samples, validate on 64905 samples
Epoch 1/3
260000/260000 [==============================] - 254s - loss: 16.2775 - val_loss: 13.4925
Epoch 2/3
 88448/260000 [=========>....................] - ETA: 161s - loss: nan

I tried using RMSProp instead of SGD, I tried tanh instead of relu, I tried with and without dropout, all to no avail. I tried a smaller model, i.e. with only one hidden layer, and had the same issue (it becomes nan at a different point). However, it does work with fewer features, i.e. if there are only 5 columns, and gives quite good predictions. There seems to be some kind of overflow, but I can't imagine why--the loss is not unreasonably large at all.

Python version 2.7.11, running on a Linux machine, CPU only. I tested it with the latest version of Theano and I also get NaNs, so I tried going back to Theano 0.8.2 and have the same problem. The latest version of Keras has the same problem, and so does the 0.3.2 version.

Recommended answer

Regression with neural networks is hard to get working because the output is unbounded, so you are especially prone to the exploding gradients problem (the likely cause of the nans).

Historically, one key solution to exploding gradients was to reduce the learning rate, but with the advent of per-parameter adaptive learning rate algorithms like Adam, you no longer need to set a learning rate to get good performance. There is very little reason to use SGD with momentum anymore unless you're a neural network fiend and know how to tune the learning schedule.
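
As a minimal sketch, swapping in Adam might look like the following (assuming a Keras version that provides keras.optimizers.Adam; the default hyperparameters are usually a reasonable starting point):

from keras.optimizers import Adam

# Adam adapts the learning rate per parameter, which is typically far more
# forgiving than a hand-tuned SGD schedule for an unbounded regression target.
model.compile(loss='mean_absolute_error', optimizer=Adam())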

Here are some things you could potentially try:

  1. Normalize your outputs by quantile normalizing or z scoring. To be rigorous, compute this transformation on the training data, not on the entire dataset. For example, with quantile normalization, if an example is in the 60th percentile of the training set, it gets a value of 0.6. (You can also shift the quantile normalized values down by 0.5 so that the 0th percentile is -0.5 and the 100th percentile is +0.5.) See the z-scoring sketch after this list.

  2. Add regularization, either by increasing the dropout rate or adding L1 and L2 penalties to the weights. L1 regularization is analogous to feature selection, and since you said that reducing the number of features to 5 gives good performance, L1 may also (a revised-model sketch follows this list).

  3. If these still don't help, reduce the size of your network. This is not always the best idea since it can harm performance, but in your case you have a large number of first-layer neurons (1024) relative to input features (35), so it may help.

  4. Increase the batch size from 32 to 128. 128 is fairly standard and could potentially increase the stability of the optimization.
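
Regarding point 1, here is a minimal z-scoring sketch that computes the scaling statistics on the training targets only (Y_train and Y_test are the arrays from the question; everything else is illustrative):

import numpy as np

# Fit the scaling on the training targets only, then apply it to both splits.
y_mean = np.mean(Y_train)
y_std = np.std(Y_train)

Y_train_scaled = (Y_train - y_mean) / y_std
Y_test_scaled = (Y_test - y_mean) / y_std

# Train against the scaled targets, then undo the scaling on predictions, e.g.
# predictions = model.predict(X_test).ravel() * y_std + y_mean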
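
Points 2, 3 and 4 can be folded into a revised version of the model from the question. This is only a sketch under assumed settings: the penalty strengths, layer sizes and dropout rate are placeholders to illustrate the idea, it reuses the scaled targets from the sketch above, and it is written against the Keras 2 API (older Keras versions spell some of these arguments differently, e.g. nb_epoch instead of epochs):

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.optimizers import Adam
from keras import regularizers

model = Sequential()
# Smaller first layer (point 3) with an L1/L2 weight penalty (point 2).
model.add(Dense(256, input_shape=(n_train,),
                kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)))
model.add(Activation('relu'))
model.add(Dropout(0.3))  # heavier dropout is another form of regularization

model.add(Dense(128, kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)))
model.add(Activation('relu'))
model.add(Dropout(0.3))

model.add(Dense(1))

model.compile(loss='mean_absolute_error', optimizer=Adam())
# Larger batches (point 4) give a steadier gradient estimate.
model.fit(X_train, Y_train_scaled, batch_size=128, epochs=3, verbose=1,
          validation_data=(X_test, Y_test_scaled))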
