Keras Masking for RNN with Varying Time Steps

Problem Description
I'm trying to fit an RNN in Keras using sequences that have varying time lengths. My data is in a Numpy array with format (sample, time, feature) = (20631, max_time, 24), where max_time is determined at run-time as the number of time steps available for the sample with the most time stamps. I've padded the beginning of each time series with 0, except for the longest one, obviously.
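The pre-padding described above can be sketched in plain numpy (a minimal illustration with made-up shapes; the question's actual data pipeline isn't shown, so the sequence lengths here are hypothetical):

```python
import numpy as np

# Hypothetical ragged data: three samples with 2, 4, and 3 time steps, 24 features each
max_time, n_features = 4, 24
sequences = [np.ones((2, n_features)), np.ones((4, n_features)), np.ones((3, n_features))]

# Pre-pad with zeros so every sample spans max_time steps (zeros first, data last)
padded = np.zeros((len(sequences), max_time, n_features))
for i, seq in enumerate(sequences):
    padded[i, max_time - len(seq):, :] = seq

print(padded.shape)  # (3, 4, 24)
```

Because the padding value is 0 and it sits at the start of each series, the `Masking(mask_value=0.)` layer below is expected to skip those leading time steps.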
I've initially defined my model like so...
from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense, Activation
from keras.optimizers import RMSprop

model = Sequential()
model.add(Masking(mask_value=0., input_shape=(max_time, 24)))
model.add(LSTM(100, input_dim=24))
model.add(Dense(2))
model.add(Activation(activate))
model.compile(loss=weibull_loglik_discrete, optimizer=RMSprop(lr=.01))
model.fit(train_x, train_y, nb_epoch=100, batch_size=1000, verbose=2, validation_data=(test_x, test_y))
For completeness, here's the code for the loss function:
from keras import backend as k

def weibull_loglik_discrete(y_true, ab_pred, name=None):
    y_ = y_true[:, 0]
    u_ = y_true[:, 1]
    a_ = ab_pred[:, 0]
    b_ = ab_pred[:, 1]
    hazard0 = k.pow((y_ + 1e-35) / a_, b_)
    hazard1 = k.pow((y_ + 1) / a_, b_)
    return -1 * k.mean(u_ * k.log(k.exp(hazard1 - hazard0) - 1.0) - hazard1)
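To sanity-check the arithmetic, the same discrete Weibull log-likelihood can be transcribed to numpy and evaluated on simple inputs (a sketch for verification only; the function name and test values are my own, not from the question):

```python
import numpy as np

def weibull_loglik_discrete_np(y, u, a, b):
    # Same arithmetic as the Keras loss, transcribed to numpy
    hazard0 = ((y + 1e-35) / a) ** b
    hazard1 = ((y + 1.0) / a) ** b
    return -np.mean(u * np.log(np.exp(hazard1 - hazard0) - 1.0) - hazard1)

# With y=0, u=1, a=1, b=1: hazard0 ~ 0 and hazard1 = 1,
# so the loss reduces to 1 - log(e - 1) ≈ 0.4587
loss = weibull_loglik_discrete_np(np.array([0.0]), np.array([1.0]),
                                  np.array([1.0]), np.array([1.0]))
```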
And here's the code for the custom activation function:
def activate(ab):
    a = k.exp(ab[:, 0])
    b = k.softplus(ab[:, 1])
    a = k.reshape(a, (k.shape(a)[0], 1))
    b = k.reshape(b, (k.shape(b)[0], 1))
    return k.concatenate((a, b), axis=1)
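Numerically, this activation maps the two raw network outputs to strictly positive Weibull parameters: exp for the scale and softplus for the shape. A numpy transcription of the same transform (a sketch; the function name is my own):

```python
import numpy as np

def activate_np(ab):
    # ab: (batch, 2) raw network outputs
    a = np.exp(ab[:, 0])            # exp -> strictly positive scale parameter
    b = np.log1p(np.exp(ab[:, 1]))  # softplus -> strictly positive shape parameter
    return np.stack([a, b], axis=1)

out = activate_np(np.array([[0.0, 0.0]]))
# exp(0) = 1 and softplus(0) = ln 2 ≈ 0.693
```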
When I fit the model and make some test predictions, every sample in the test set gets exactly the same prediction, which seems fishy.
Things get better if I remove the masking layer, which makes me think there's something wrong with the masking layer, but as far as I can tell, I've followed the documentation exactly.
Is there something mis-specified with the masking layer? Am I missing something else?
Answer
I could not validate this without the actual data, but I had a similar experience with an RNN. In my case, normalization solved the issue: add a normalization step to your model.
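One way to apply the suggestion is feature-wise standardization of the input tensor before training (a minimal sketch with made-up shapes; the variable names are assumptions). Note one caveat: if you standardize after zero-padding, the padded steps no longer equal 0, so either normalize before padding or re-zero the padded positions so that mask_value=0 stays meaningful for the Masking layer.

```python
import numpy as np

# Hypothetical padded tensor shaped (samples, time, features)
rng = np.random.default_rng(0)
train_x = rng.random((100, 10, 24))

# Feature-wise standardization over samples and time steps
mean = train_x.mean(axis=(0, 1), keepdims=True)
std = train_x.std(axis=(0, 1), keepdims=True) + 1e-8  # guard against zero variance

train_norm = (train_x - mean) / std
```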