Keras: addition layer for embeddings / vectors?
Question
I have 3 word embeddings:
- Embedding #1: [w11, w12, w13, w14]
- Embedding #2: [w21, w22, w23, w24]
- Embedding #3: [w31, w32, w33, w34]
Is there a way to get a fourth embedding by adding all three vectors, with the trainable weights from all of them, like:
- Embedding #4: [w11 + w21 + w31, w12 + w22 + w32, w13 + w23 + w33, w14 + w24 + w34]
Is there a way to do this in a Keras layer?
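Conceptually, the fourth embedding is just an element-wise sum of the three vectors. A minimal sketch with made-up 4-dimensional values:

```python
import numpy as np

# Hypothetical 4-dimensional embeddings, for illustration only
emb1 = np.array([0.1, 0.2, 0.3, 0.4])
emb2 = np.array([0.5, 0.6, 0.7, 0.8])
emb3 = np.array([0.9, 1.0, 1.1, 1.2])

# Element-wise sum: [w11+w21+w31, w12+w22+w32, ...]
emb4 = emb1 + emb2 + emb3
print(emb4)  # [1.5 1.8 2.1 2.4]
```

In a Keras model, gradients flow back through such a sum to all three source embeddings, so all of them stay trainable.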
Problem

I want to learn word embeddings for the Indonesian language. I plan to do this by training a sequence prediction machine using LSTMs.
However, Indonesian grammar differs from English. In particular, in Indonesian you can modify a word using prefixes and suffixes: a noun given a prefix can become a verb, and given a suffix can become an adjective. So many affixes can be attached to one word that a single base word can have 5 or more variations.
For example:
- tani means to farm (verb)
- pe-tani means farmer
- per-tani-an means farm (noun)
- ber-tani means to farm (verb, with a slightly different meaning)
The semantic transformation done by appending a prefix is consistent across words. For example:
- pe-tani is to tani what pe-layan is to layan, what pe-layar is to layar, what pe-tembak is to tembak, and so on.
- per-main-an is to main what per-guru-an is to guru, what per-kira-an is to kira, what per-surat-an is to surat, and so on.
Therefore, I plan to represent the prefixes and suffixes as embeddings, which would be added to the base word's embedding to produce a new embedding. So the meaning of the composite word is derived from the embeddings of the base word and the affixes, rather than stored as a separate embedding. However, I don't know how to do this in a Keras layer, and if it has been asked before, I cannot find it.
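One hypothetical way to prepare data for this scheme: map each composite word to a triple of integer ids (base, prefix, suffix), one id per embedding table, reserving id 0 for "no affix" (whose embedding can learn to be near zero). The vocabularies and the `encode` helper below are illustrative assumptions, not part of any library:

```python
# Toy vocabularies, for illustration only; id 0 means "no affix"
base_vocab = {'tani': 1, 'layan': 2, 'layar': 3}
prefix_vocab = {'': 0, 'pe': 1, 'ber': 2, 'per': 3}
suffix_vocab = {'': 0, 'an': 1}

def encode(prefix, base, suffix):
    """Return the three integer ids to feed the three Embedding inputs."""
    return (base_vocab[base], prefix_vocab[prefix], suffix_vocab[suffix])

print(encode('pe', 'tani', ''))    # (1, 1, 0)  -> pe-tani
print(encode('per', 'tani', 'an')) # (1, 3, 1)  -> per-tani-an
```

Each of the three id streams then feeds its own Embedding layer, and the three embedding outputs are summed.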
Recommended answer
When you say "three word embeddings", I see three Embedding layers, such as:
input1 = Input((sentenceLength,))
input2 = Input((sentenceLength,))
input3 = Input((sentenceLength,))
emb1 = Embedding(...options...)(input1)
emb2 = Embedding(...options...)(input2)
emb3 = Embedding(...options...)(input3)
You can use a simple Add() layer to sum the three:
summed = Add()([emb1,emb2,emb3])
Then you continue your modeling...
#after creating the rest of the layers and getting the desired output:
model = Model([input1,input2,input3],output)
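Putting the fragments above together, here is a minimal runnable sketch of the full model. The vocabulary sizes, embedding dimension, LSTM width, and loss are placeholder assumptions chosen only to make the example self-contained:

```python
from tensorflow.keras.layers import Input, Embedding, Add, LSTM, Dense
from tensorflow.keras.models import Model

# Assumed sizes, for illustration only
sentenceLength = 10
vocabBase, vocabPrefix, vocabSuffix = 1000, 20, 20
embDim = 4

input1 = Input((sentenceLength,))  # base word ids
input2 = Input((sentenceLength,))  # prefix ids
input3 = Input((sentenceLength,))  # suffix ids

emb1 = Embedding(vocabBase, embDim)(input1)
emb2 = Embedding(vocabPrefix, embDim)(input2)
emb3 = Embedding(vocabSuffix, embDim)(input3)

# Element-wise sum of the three embeddings: (batch, sentenceLength, embDim)
summed = Add()([emb1, emb2, emb3])

hidden = LSTM(32)(summed)
output = Dense(vocabBase, activation='softmax')(hidden)

model = Model([input1, input2, input3], output)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
```

Because Add() is differentiable, training updates all three Embedding tables jointly, which is exactly the "trainable weights from all of them" behavior asked about.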
If you're not using Embedding layers, but are instead inputting three vectors directly:
input1 = Input((4,)) #or perhaps (sentenceLength,4)
input2 = Input((4,))
input3 = Input((4,))
added = Add()([input1,input2,input3])
The rest is the same.
If this is not your question, please give more details about where the three "word embeddings" come from, how you intend to select them, etc.