Strange loss curve while training LSTM with Keras


Problem description

I'm trying to train an LSTM for a binary classification problem. When I plot the loss curve after training, there are strange spikes in it. Here are some examples:

Here is the basic code:

import matplotlib.pyplot as plt
from numpy import newaxis
from keras.models import Sequential
from keras.layers import recurrent, Dropout, Dense, Activation
from keras.callbacks import EarlyStopping, ModelCheckpoint

# Two stacked LSTM layers with dropout, followed by a sigmoid output
# for binary classification
model = Sequential()
model.add(recurrent.LSTM(128, input_shape=(columnCount, 1), return_sequences=True))
model.add(Dropout(0.5))
model.add(recurrent.LSTM(128, return_sequences=False))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Add a trailing feature axis so each timestep carries a single feature
new_train = X_train[..., newaxis]

history = model.fit(new_train, y_train, nb_epoch=500, batch_size=100,
                    callbacks=[EarlyStopping(monitor='val_loss', min_delta=0.0001, patience=2, verbose=0, mode='auto'),
                               ModelCheckpoint(filepath="model.h5", verbose=0, save_best_only=True)],
                    validation_split=0.1)

# list all data in history
print(history.history.keys())
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

I don't understand why these spikes occur. Any ideas?

Answer

There are many possible reasons why something like this occurs:

  1. Your parameter trajectory changed its basin of attraction - this means that your system left a stable trajectory and switched to another one. This was probably due to randomization such as batch sampling or dropout (a quick experiment to test this is sketched after this list).

  2. LSTM instability - LSTMs are believed to be extremely unstable in terms of training. It has also been reported that they very often take a long time to stabilize.
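
If point 1 is the culprit, the spikes should shrink or disappear once the stochastic parts of training are pinned down. Below is a minimal sketch (not part of the original answer) that reuses columnCount, new_train and y_train from the question, fixes the random seeds, and removes the Dropout layers so batch sampling remains the only source of randomness; the seed value 42 is arbitrary.

import random
import numpy as np

# Fix the seeds before building the model; 42 is an arbitrary choice.
np.random.seed(42)
random.seed(42)

from keras.models import Sequential
from keras.layers import recurrent, Dense, Activation

# Same architecture as the question, minus the Dropout layers
deterministic_model = Sequential()
deterministic_model.add(recurrent.LSTM(128, input_shape=(columnCount, 1),
                                       return_sequences=True))
deterministic_model.add(recurrent.LSTM(128, return_sequences=False))
deterministic_model.add(Dense(1))
deterministic_model.add(Activation('sigmoid'))
deterministic_model.compile(optimizer='adam',
                            loss='binary_crossentropy',
                            metrics=['accuracy'])

# If the loss curve is smooth now, the spikes likely came from batch
# sampling and/or dropout rather than from the optimization itself.
history = deterministic_model.fit(new_train, y_train, nb_epoch=500,
                                  batch_size=100, validation_split=0.1)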

Based on the latest research (e.g. from here), I would recommend decreasing the batch size and training for more epochs. I would also check whether the topology of the network is too complex (or too plain) for the number of patterns it needs to learn, and I would try switching to either GRU or SimpleRNN, as shown in the sketch below.
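
As a concrete starting point, here is one way those recommendations could look in code. This is a sketch rather than part of the original answer: the batch size of 32 and the epoch count of 1000 are illustrative guesses, and it again reuses columnCount, new_train and y_train from the question.

from keras.models import Sequential
from keras.layers import recurrent, Dropout, Dense, Activation

# Same structure as the question's model, with GRU layers in place of LSTMs
gru_model = Sequential()
gru_model.add(recurrent.GRU(128, input_shape=(columnCount, 1),
                            return_sequences=True))
gru_model.add(Dropout(0.5))
gru_model.add(recurrent.GRU(128, return_sequences=False))
gru_model.add(Dropout(0.5))
gru_model.add(Dense(1))
gru_model.add(Activation('sigmoid'))
gru_model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

# Smaller batches and a larger epoch budget, per the advice above;
# both numbers are illustrative, not tuned.
history = gru_model.fit(new_train, y_train, nb_epoch=1000, batch_size=32,
                        validation_split=0.1)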

