Implementing Attention in Keras


Problem Description

I am trying to implement attention in Keras over a simple LSTM:

from keras.layers import Input, Dense, LSTM, Activation, dot, concatenate
from keras.models import Model
from keras.optimizers import Adam

model_2_input = Input(shape=(500,))
#model_2 = Conv1D(100, 10, activation='relu')(model_2_input)
model_2 = Dense(64, activation='sigmoid')(model_2_input)
model_2 = Dense(64, activation='sigmoid')(model_2)

model_1_input = Input(shape=(None, 2048))
model_1 = LSTM(64, dropout_U=0.2, dropout_W=0.2, return_sequences=True)(model_1_input)
model_1, state_h, state_c = LSTM(16, dropout_U=0.2, dropout_W=0.2, return_sequences=True, return_state=True)(model_1)


#print(state_c.shape)
match = dot([model_1, state_h], axes=(0, 0))
match = Activation('softmax')(match)
match = dot([match, state_h], axes=(0, 0))
print(match.shape)

merged = concatenate([model_2, match], axis=1)
print(merged.shape)
merged = Dense(4, activation='softmax')(merged)
print(merged.shape)
model = Model(inputs=[model_2_input, model_1_input], outputs=merged)
adam = Adam()
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])

I am getting the following error:

merged = concatenate([model_2, match], axis=1)

'Got inputs shapes: %s' % (input_shape)) ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 64), (16, 1)]

The implementation is very simple: just take the dot product of the LSTM output with the hidden state and use it as a weighting function to compute the hidden state itself.

How can I resolve the error? And in particular, how do I get the attention concept working?

Recommended Answer

You can add a Reshape layer before concatenating to ensure compatibility (see the Keras documentation). It is probably best to reshape the model_2 output, which is (None, 64).

Essentially, you need to add a Reshape layer with the target shape before concatenating:

model_2 = Reshape(new_shape)(model_2)

This will return (batch_size, (new_shape)). You can of course reshape either branch of your network; model_2's output is just used here because it is the simpler example.
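
For instance, here is a minimal, self-contained sketch of how Reshape handles the batch dimension (the tensor x and the (8, 8) shape are purely illustrative, not taken from the question):

from keras.layers import Input, Reshape

x = Input(shape=(8, 8))   # illustrative tensor of shape (None, 8, 8)
x = Reshape((64,))(x)     # only the per-sample target shape is given; batch dim is kept -> (None, 64)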

Having said that, it may be worth rethinking your network structure. In particular, the problem stems from the second dot layer (which gives you only 16 scalars), so it is hard to reshape it in a way that makes the two branches match.

Without knowing what the model is trying to predict or what the training data looks like, it is hard to comment on whether the two dot products are necessary, but restructuring will potentially solve this issue.
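
As one possible restructuring, here is a minimal sketch rather than a drop-in fix: the shape mismatch comes from dotting over the batch axis (axes=(0, 0)), so the sketch below dots the LSTM sequence output with its last hidden state over the feature axis instead, which yields per-timestep weights and a context vector that keeps its batch dimension. Layer sizes mirror the question; the Keras 2 argument names dropout/recurrent_dropout stand in for the legacy dropout_U/dropout_W, and the overall architecture is an assumption about what was intended.

from keras.layers import Input, Dense, LSTM, Activation, dot, concatenate
from keras.models import Model
from keras.optimizers import Adam

model_2_input = Input(shape=(500,))
model_2 = Dense(64, activation='sigmoid')(model_2_input)
model_2 = Dense(64, activation='sigmoid')(model_2)              # (None, 64)

model_1_input = Input(shape=(None, 2048))
model_1 = LSTM(64, dropout=0.2, recurrent_dropout=0.2,
               return_sequences=True)(model_1_input)
model_1, state_h, state_c = LSTM(16, dropout=0.2, recurrent_dropout=0.2,
                                 return_sequences=True,
                                 return_state=True)(model_1)    # (None, T, 16), (None, 16)

# Attention scores: dot the sequence output with the last hidden state
# over the feature axis -> one score per timestep, shape (None, T).
scores = dot([model_1, state_h], axes=(2, 1))
weights = Activation('softmax')(scores)

# Context vector: weighted sum of the sequence output over time -> (None, 16).
context = dot([weights, model_1], axes=(1, 1))

# Both branches now carry the batch dimension, so concatenation works.
merged = concatenate([model_2, context], axis=1)                # (None, 80)
output = Dense(4, activation='softmax')(merged)

model = Model(inputs=[model_2_input, model_1_input], outputs=output)
model.compile(loss='categorical_crossentropy', optimizer=Adam(),
              metrics=['accuracy'])

With this layout the second dot product produces a (None, 16) context vector rather than 16 scalars, so no Reshape is needed before the concatenate.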
