NaN loss when training regression network

Problem description

I have a data matrix in "one-hot encoding" (all ones and zeros) with 260,000 rows and 35 columns. I am using Keras to train a simple neural network to predict a continuous variable. The code to make the network is the following:

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.optimizers import SGD
from keras.callbacks import EarlyStopping

model = Sequential()
model.add(Dense(1024, input_shape=(n_train,)))
model.add(Activation('relu'))
model.add(Dropout(0.1))

model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.1))

model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.1))

model.add(Dense(1))

sgd = SGD(lr=0.01, nesterov=True)
#rms = RMSprop()
#model.compile(loss='categorical_crossentropy', optimizer=rms, metrics=['accuracy'])
model.compile(loss='mean_absolute_error', optimizer=sgd)
model.fit(X_train, Y_train, batch_size=32, nb_epoch=3, verbose=1,
          validation_data=(X_test, Y_test),
          callbacks=[EarlyStopping(monitor='val_loss', patience=4)])

However, during the training process, I see the loss decrease nicely, but during the middle of the second epoch, it goes to nan:

Train on 260000 samples, validate on 64905 samples
Epoch 1/3
260000/260000 [==============================] - 254s - loss: 16.2775 - val_loss: 13.4925
Epoch 2/3
 88448/260000 [=========>....................] - ETA: 161s - loss: nan

I tried using RMSProp instead of SGD, I tried tanh instead of relu, I tried with and without dropout, all to no avail. I tried a smaller model, i.e. with only one hidden layer, and hit the same issue (it becomes nan at a different point). However, it does work with fewer features, i.e. if there are only 5 columns, and gives quite good predictions. There seems to be some kind of overflow, but I can't imagine why: the loss is not unreasonably large at all.

Python version 2.7.11, running on a Linux machine, CPU only. I tested with the latest version of Theano and also get NaNs, so I tried going back to Theano 0.8.2 and have the same problem. The latest version of Keras has the same problem, as does version 0.3.2.

Answer

Regression with neural networks is hard to get working because the output is unbounded, so you are especially prone to the exploding gradients problem (the likely cause of the nans).
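This failure mode is easy to reproduce outside Keras. The toy loop below is a hypothetical 1-D least-squares problem (not the asker's model): the learning rate is too large for the curvature, so each SGD step overshoots by a growing amount until float32 overflows to inf and the next update produces nan, just like the training log above.

```python
import numpy as np

# Minimize (w*x - y)^2 by SGD with a learning rate that is too large:
# the update factor (1 - 2*lr*x^2) has magnitude > 1, so |w| grows
# every step until float32 overflows.
x = np.float32(10.0)
y = np.float32(5.0)
w = np.float32(0.0)
lr = np.float32(0.1)

with np.errstate(over='ignore', invalid='ignore'):
    for _ in range(200):
        grad = np.float32(2.0) * (w * x - y) * x  # d/dw of the squared error
        w = w - lr * grad

# w diverged to inf, then inf - inf in the update made it NaN
print(w)
```

The loss never has to be "unreasonably large" in the printed log: one overshooting step is enough to start the doubling, and the nan appears only a few dozen updates later.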

Historically, one key solution to exploding gradients was to reduce the learning rate, but with the advent of per-parameter adaptive learning rate algorithms like Adam, you no longer need to set a learning rate to get good performance. There is very little reason to use SGD with momentum anymore unless you're a neural network fiend and know how to tune the learning schedule.
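The per-parameter scaling is the point of that recommendation. A minimal numpy sketch of a single Adam update (the standard update equations, with illustrative values) shows that parameters with wildly different gradient magnitudes still take steps of roughly the same size, bounded by the learning rate:

```python
import numpy as np

# Standard Adam hyperparameters
lr, b1, b2, eps = 0.001, 0.9, 0.999, 1e-8

w = np.array([1.0, -1.0])
m = np.zeros_like(w)   # first-moment (mean) estimate
v = np.zeros_like(w)   # second-moment (uncentered variance) estimate
t = 1

grad = np.array([100.0, 0.01])  # gradient scales differing by 10^4

m = b1 * m + (1 - b1) * grad
v = b2 * v + (1 - b2) * grad ** 2
m_hat = m / (1 - b1 ** t)       # bias correction
v_hat = v / (1 - b2 ** t)
w = w - lr * m_hat / (np.sqrt(v_hat) + eps)

# Both parameters moved by ~lr = 0.001 despite the 10^4 gradient ratio
print(w)
```

Because each step is effectively `lr * sign(gradient)` at the extremes, a single huge gradient cannot blow the weights up the way it can with plain SGD.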

Here are some things you could potentially try:

  1. Normalize your outputs by quantile normalization or z-scoring. To be rigorous, compute this transformation on the training data, not on the entire dataset. For example, with quantile normalization, if an example is in the 60th percentile of the training set, it gets a value of 0.6. (You can also shift the quantile-normalized values down by 0.5 so that the 0th percentile is -0.5 and the 100th percentile is +0.5.)
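A minimal numpy sketch of the z-scoring variant (the target values are hypothetical; the key point is that the mean and standard deviation come from the training split only, and are then reused on the test split):

```python
import numpy as np

# Hypothetical regression targets
y_train = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
y_test = np.array([9.0, 2.0, 6.0])

# Fit the normalization on the training split only
mu, sigma = y_train.mean(), y_train.std()

y_train_z = (y_train - mu) / sigma
y_test_z = (y_test - mu) / sigma  # reuse training stats; never refit on test

print(y_train_z.mean(), y_train_z.std())  # 0 mean, unit std on train
```

Predictions from the network are then mapped back to the original scale with `y = z * sigma + mu`.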

  2. Add regularization, either by increasing the dropout rate or adding L1 and L2 penalties to the weights. L1 regularization is analogous to feature selection, and since you said that reducing the number of features to 5 gives good performance, L1 may help too.
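As a sketch of what the weight penalties do to the objective (the weights and penalty strengths below are hypothetical; in Keras this is what the layer's weight-regularizer argument adds to the loss for you):

```python
import numpy as np

# Hypothetical layer weights and penalty strengths
w = np.array([0.5, -2.0, 0.0, 3.0])
data_loss = 1.25          # stand-in for the mean absolute error term
l1, l2 = 0.01, 0.001

# L1 penalizes |w| (pushes weights to exactly zero, feature-selection-like);
# L2 penalizes w^2 (shrinks large weights, which also tames large activations)
penalty = l1 * np.abs(w).sum() + l2 * np.square(w).sum()
total_loss = data_loss + penalty  # what the optimizer actually minimizes

print(total_loss)
```

Note that the zero weight contributes nothing to the L1 term, which is why L1 behaves like a soft feature selector.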

  3. If these still don't help, reduce the size of your network. This is not always the best idea since it can harm performance, but in your case you have a large number of first-layer neurons (1024) relative to input features (35), so it may help.

  4. Increase the batch size from 32 to 128. 128 is fairly standard and could increase the stability of the optimization.
