In Keras, the ELMo embedding layer has 0 parameters. Is this normal?
Question
I was using GloVe with my model and it worked, but I have now switched to ELMo (based on the Keras code available on GitHub: Elmo Keras Github, utils.py).
However, when I print model.summary() I get 0 parameters in the ELMo embedding layer, unlike with GloVe, where I got over 20 million parameters. Is that normal? If not, can you please tell me what I am doing wrong?
##--------> When I was using Glove Embedding Layer
word_embedding_layer = emb.get_keras_embedding(# dropout=emb_dropout,
                                               trainable=True,
                                               input_length=sent_maxlen,
                                               name='word_embedding_layer')
## --------> Deep layers
pos_embedding_layer = Embedding(output_dim=pos_tag_embedding_size,  # 5
                                input_dim=len(SPACY_POS_TAGS),
                                input_length=sent_maxlen,  # 20
                                name='pos_embedding_layer')
latent_layers = stack_latent_layers(num_of_latent_layers)
##--------> 6] Dropout
dropout = Dropout(0.1)
## --------> 7]Prediction
predict_layer = predict_classes()
## --------> 8] Prepare input features, and indicate how to embed them
inputs = [Input((sent_maxlen,), dtype='int32', name='word_inputs'),
          Input((sent_maxlen,), dtype='int32', name='predicate_inputs'),
          Input((sent_maxlen,), dtype='int32', name='postags_inputs')]
## --------> 9] ELMo Embedding and Concat all inputs and run on deep network
from elmo import ELMoEmbedding
import utils
idx2word = utils.get_idx2word()
ELmoembedding1 = ELMoEmbedding(idx2word=idx2word, output_mode="elmo", trainable=True)(inputs[0]) # These two are interchangeable
ELmoembedding2 = ELMoEmbedding(idx2word=idx2word, output_mode="elmo", trainable=True)(inputs[1]) # These two are interchangeable
embeddings = [ELmoembedding1,
              ELmoembedding2,
              pos_embedding_layer(inputs[2])]  # index 2 is 'postags_inputs'; inputs has only 3 entries
con1 = keras.layers.concatenate(embeddings)
## --------> 10]Build model
outputI = predict_layer(dropout(latent_layers(con1)))
model = Model(inputs, outputI)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])
model.summary()
Trials:
Note: I tried using the TF-Hub ELMo module with the Keras code, but the output was always a 2D tensor (even when I changed it to the 'elmo' setting and 'LSTM' instead of 'default'), so I couldn't concatenate it with pos_embedding_layer. I tried reshaping, but I ended up with the same issue: 0 total parameters.
Recommended answer
According to the TF-Hub description (https://tfhub.dev/google/elmo/2), the embeddings of individual words are not trainable. Only the weights of the weighted sum of the embedding and LSTM layers are. So you should get 4 trainable parameters at the ELMo level.
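To make the count of 4 concrete, here is a minimal pure-Python sketch of that weighted sum, assuming the usual ELMo formulation: three frozen layer outputs per token (the token layer and two biLSTM layers) are mixed with softmax-normalized weights and scaled by one scalar. The names `scalar_mix`, `layer_weights` and `gamma` are illustrative only and are not taken from the TF-Hub module, and the 4-dimensional vectors are toy stand-ins for the real 1024-dimensional outputs.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scalar_mix(layers, layer_weights, gamma):
    """Mix the per-layer vectors of one token into a single vector."""
    s = softmax(layer_weights)
    dim = len(layers[0])
    return [gamma * sum(s[j] * layers[j][d] for j in range(len(layers)))
            for d in range(dim)]


# Toy stand-ins for the three frozen layer outputs of one token
layers = [
    [1.0, 0.0, 0.0, 2.0],  # token (character-based) layer
    [0.0, 1.0, 0.0, 2.0],  # first biLSTM layer
    [0.0, 0.0, 1.0, 2.0],  # second biLSTM layer
]
layer_weights = [0.0, 0.0, 0.0]  # 3 trainable scalars (assumed initial value)
gamma = 1.0                      # 1 trainable scalar (assumed initial value)

elmo_vector = scalar_mix(layers, layer_weights, gamma)
num_trainable = len(layer_weights) + 1  # 3 mixing weights + 1 scale = 4
```

Everything inside `layers` stays fixed during training in this picture, which is why the layer reports only a handful of parameters instead of the millions a GloVe lookup table has.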
I was able to get the trainable parameters using the class defined in StrongIO's example on GitHub. That example only provides a class whose output is the default layer, which is a 1024-dimensional vector for each input example (essentially a document/sentence encoder). To access the embeddings of each word (the elmo layer), a few changes are needed, as suggested in this issue:
import tensorflow as tf
import tensorflow_hub as hub
from keras import backend as K
from keras.engine import Layer

class ElmoEmbeddingLayer(Layer):
    def __init__(self, **kwargs):
        self.dimensions = 1024
        self.trainable = True
        super(ElmoEmbeddingLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.elmo = hub.Module('https://tfhub.dev/google/elmo/2', trainable=self.trainable,
                               name="{}_module".format(self.name))
        # Register the module's trainable variables so Keras counts and updates them
        self.trainable_weights += K.tf.trainable_variables(scope="^{}_module/.*".format(self.name))
        super(ElmoEmbeddingLayer, self).build(input_shape)

    def call(self, x, mask=None):
        result = self.elmo(
            K.squeeze(
                K.cast(x, tf.string), axis=1
            ),
            as_dict=True,
            signature='default',
        )['elmo']  # per-word embeddings instead of the pooled 'default' output
        return result

    def compute_output_shape(self, input_shape):
        return (input_shape[0], None, self.dimensions)
You can stack the ElmoEmbeddingLayer with the POS layer.
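Stacking only works when both tensors are 3D, which is why the 2D default output from the question could not be concatenated. The following is a rough pure-Python sketch of the shape rule for concatenation along the last axis. The helper `concat_last_axis_shape` is hypothetical and not part of Keras; the sizes 20 and 5 are taken from the comments in the question's code.

```python
def concat_last_axis_shape(shapes):
    """Output shape of concatenating tensors along the last axis.

    None stands for an unknown dimension, as in a Keras summary.
    """
    rank = len(shapes[0])
    if any(len(s) != rank for s in shapes):
        raise ValueError("all inputs must have the same rank: %r" % (shapes,))
    out = []
    for axis in range(rank - 1):
        # Every axis except the last must agree wherever it is known
        known = {s[axis] for s in shapes if s[axis] is not None}
        if len(known) > 1:
            raise ValueError("mismatch on axis %d: %r" % (axis, shapes))
        out.append(known.pop() if known else None)
    last = [s[-1] for s in shapes]
    out.append(None if None in last else sum(last))
    return tuple(out)


elmo_shape = (None, None, 1024)  # per-word 'elmo' output
pos_shape = (None, 20, 5)        # POS embedding: sent_maxlen=20, size 5
stacked = concat_last_axis_shape([elmo_shape, pos_shape])

# The pooled 2D output has a different rank, so it cannot be stacked
try:
    concat_last_axis_shape([(None, 1024), pos_shape])
    rank_mismatch = False
except ValueError:
    rank_mismatch = True
```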
As a more general example, one can use the ELMo embeddings in a 1D ConvNet model for classification:
from keras.layers import Input, Conv1D, GlobalMaxPooling1D, Dense
from keras.models import Model

elmo_input_layer = Input(shape=(None, ), dtype="string")
elmo_output_layer = ElmoEmbeddingLayer()(elmo_input_layer)
conv_layer = Conv1D(
    filters=100,
    kernel_size=3,
    padding='valid',
    activation='relu',
    strides=1)(elmo_output_layer)
pool_layer = GlobalMaxPooling1D()(conv_layer)
dense_layer = Dense(32)(pool_layer)
output_layer = Dense(1, activation='sigmoid')(dense_layer)
model = Model(
    inputs=elmo_input_layer,
    outputs=output_layer)
model.summary()
The model summary looks like this:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_62 (InputLayer)        (None, None)              0
_________________________________________________________________
elmo_embedding_layer_13 (Elm (None, None, 1024)        4
_________________________________________________________________
conv1d_46 (Conv1D)           (None, None, 100)         307300
_________________________________________________________________
global_max_pooling1d_42 (Glo (None, 100)               0
_________________________________________________________________
dense_53 (Dense)             (None, 32)                3232
_________________________________________________________________
dense_54 (Dense)             (None, 1)                 33
=================================================================
Total params: 310,569
Trainable params: 310,569
Non-trainable params: 0
_________________________________________________________________
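As a sanity check, the counts in this summary can be reproduced by hand. The sketch below uses the standard parameter formulas for Conv1D and Dense layers with bias; the helper names are illustrative, and the 4 ELMo parameters are simply taken from the summary.

```python
def conv1d_params(kernel_size, in_channels, filters):
    """Weights (kernel_size * in_channels * filters) plus one bias per filter."""
    return kernel_size * in_channels * filters + filters


def dense_params(in_units, out_units):
    """Weights (in_units * out_units) plus one bias per output unit."""
    return in_units * out_units + out_units


elmo = 4                              # trainable scalars reported for the ELMo layer
conv = conv1d_params(3, 1024, 100)    # 307300
dense_1 = dense_params(100, 32)       # 3232
dense_2 = dense_params(32, 1)         # 33
total = elmo + conv + dense_1 + dense_2

print(conv, dense_1, dense_2, total)  # 307300 3232 33 310569
```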