What is the meaning of multiple kernels in keras lstm layer?
Question
On https://keras.io/layers/recurrent/ I see that LSTM layers have a kernel and a recurrent_kernel. What is their meaning? In my understanding, we need weights for the 4 gates of an LSTM cell. However, in the Keras implementation, kernel has a shape of (input_dim, 4*units) and recurrent_kernel has a shape of (units, 4*units). So, are both of them somehow implementing the gates?
Answer
Correct me if I'm wrong, but if you take a look at the LSTM equations:
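(The equations appeared as an image in the original answer; the standard LSTM cell formulation, with the gates in the order Keras uses, is:)

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{candidate cell state} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```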
You have 4 W matrices that transform the input and 4 U matrices that transform the hidden state.
Keras saves these sets of 4 matrices into the kernel and recurrent_kernel weight arrays. From the code that uses them:
# Slice the concatenated weight arrays into one matrix per gate.
# Gate order in Keras: input (i), forget (f), cell (c), output (o).
self.kernel_i = self.kernel[:, :self.units]
self.kernel_f = self.kernel[:, self.units: self.units * 2]
self.kernel_c = self.kernel[:, self.units * 2: self.units * 3]
self.kernel_o = self.kernel[:, self.units * 3:]
self.recurrent_kernel_i = self.recurrent_kernel[:, :self.units]
self.recurrent_kernel_f = self.recurrent_kernel[:, self.units: self.units * 2]
self.recurrent_kernel_c = self.recurrent_kernel[:, self.units * 2: self.units * 3]
self.recurrent_kernel_o = self.recurrent_kernel[:, self.units * 3:]
Apparently the 4 matrices are stored inside the weight arrays concatenated along the second dimension, which explains the weight array shapes.
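To make the layout concrete, here is a minimal NumPy sketch of one LSTM step driven by Keras-style concatenated weights. This is not the actual Keras code; the function name lstm_step and the random weights are illustrative, but the shapes, gate order (i, f, c, o), and slicing match the excerpt above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, kernel, recurrent_kernel, bias):
    """One LSTM step with Keras-style concatenated weights.

    kernel:           (input_dim, 4*units)  -- the 4 W matrices, side by side
    recurrent_kernel: (units, 4*units)      -- the 4 U matrices, side by side
    bias:             (4*units,)
    """
    # One matmul against each concatenated array computes all 4 gates at once.
    z = x @ kernel + h_prev @ recurrent_kernel + bias  # shape (4*units,)
    # Slice into the per-gate pre-activations, in Keras gate order i, f, c, o.
    z_i, z_f, z_c, z_o = np.split(z, 4)
    i = sigmoid(z_i)                        # input gate
    f = sigmoid(z_f)                        # forget gate
    c = f * c_prev + i * np.tanh(z_c)       # new cell state
    o = sigmoid(z_o)                        # output gate
    h = o * np.tanh(c)                      # new hidden state
    return h, c

input_dim, units = 3, 5
rng = np.random.default_rng(0)
kernel = rng.normal(size=(input_dim, 4 * units))
recurrent_kernel = rng.normal(size=(units, 4 * units))
bias = np.zeros(4 * units)

h, c = lstm_step(rng.normal(size=input_dim), np.zeros(units), np.zeros(units),
                 kernel, recurrent_kernel, bias)
print(h.shape, c.shape)  # (5,) (5,)
```

The single concatenated matmul is exactly why Keras stores the weights this way: one (input_dim, 4*units) product is cheaper than four separate (input_dim, units) products.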