How to get only the last output of a sequence model in Keras?


Question

I trained a many-to-many sequence model in Keras with return_sequences=True and a TimeDistributed wrapper on the last Dense layer:

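from keras.models import Sequential
from keras.layers import Embedding, LSTM, TimeDistributed, Dense

model = Sequential()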
model.add(Embedding(input_dim=vocab_size, output_dim=50))
model.add(LSTM(100, return_sequences=True))
model.add(TimeDistributed(Dense(vocab_size, activation='softmax')))
# train...
model.save_weights("weights.h5")

During training, the loss is therefore calculated over the outputs at every timestep. For inference, however, I only need the output at the last timestep. So I load the weights into a many-to-one sequence model without the TimeDistributed wrapper, and I set return_sequences=False to get only the last output of the LSTM layer:

inference_model = Sequential()
inference_model.add(Embedding(input_dim=vocab_size, output_dim=50))
inference_model.add(LSTM(100, return_sequences=False))
inference_model.add(Dense(vocab_size, activation='softmax'))

inference_model.load_weights("weights.h5")

When I test the inference model on a sequence of length 20, I expect a prediction of shape (vocab_size), but inference_model.predict(...) still returns a prediction for every timestep: a tensor of shape (20, vocab_size).

Recommended answer

If, for whatever reason, you need only the last timestep during inference, you can build a new model that applies the trained model to the input and uses a Lambda layer to return the last timestep as its output:

from keras.models import Model
from keras.layers import Input, Lambda

inp = Input(shape=put_the_input_shape_here)  # placeholder: fill in the trained model's input shape
x = model(inp)  # apply the trained model to the input
out = Lambda(lambda x: x[:, -1])(x)  # keep only the last timestep

inference_model = Model(inp, out)
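
To make the effect of the Lambda layer's slice concrete, here is a minimal NumPy sketch of the same indexing. The sizes are hypothetical values chosen only for illustration, and a random array stands in for the trained model's per-timestep output; it is not the Keras model itself.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch_size, timesteps, vocab_size = 2, 20, 7

# Stand-in for the trained model's output: one vector per timestep.
rng = np.random.default_rng(0)
all_steps = rng.random((batch_size, timesteps, vocab_size))

# Same indexing as in the Lambda layer: keep every sample in the batch,
# take only the last position along the time axis.
last_step = all_steps[:, -1]

print(all_steps.shape)  # (2, 20, 7)
print(last_step.shape)  # (2, 7)
```

The time axis disappears, so each sample ends up with a single vector of length vocab_size.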

Side note: as already stated in another answer, TimeDistributed(Dense(...)) and Dense(...) are equivalent, since a Dense layer is applied on the last dimension of its input tensor. That is why you get the same output shape.
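
The equivalence in the side note can be checked numerically. The following is a rough NumPy sketch, not Keras code: it assumes a Dense layer computes `x @ kernel + bias` over the last axis (the softmax is omitted for brevity), and all sizes and the names `kernel` and `bias` are made up for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch_size, timesteps, features, units = 2, 20, 100, 7

rng = np.random.default_rng(0)
x = rng.random((batch_size, timesteps, features))  # stand-in for the LSTM output
kernel = rng.random((features, units))             # stand-in for the Dense weights
bias = rng.random(units)

# Dense applied directly to the 3D tensor: acts on the last axis only.
dense_out = x @ kernel + bias

# TimeDistributed-style: apply the same weights to each timestep separately,
# then stack the results back along the time axis.
per_step = np.stack(
    [x[:, t] @ kernel + bias for t in range(timesteps)], axis=1
)

print(dense_out.shape)                   # (2, 20, 7)
print(np.allclose(dense_out, per_step))  # True
```

Both paths share the same weights and produce the same values, which is consistent with the two layers having the same output shape.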
