What is the meaning of multiple kernels in a Keras LSTM layer?

Problem description

On https://keras.io/layers/recurrent/ I see that LSTM layers have a kernel and a recurrent_kernel. What is their meaning? In my understanding, we need weights for the 4 gates of an LSTM cell. However, in keras implementation, kernel has a shape of (input_dim, 4*units) and recurrent_kernel has a shape of (units, 4*units). So, are both of them somehow implementing the gates?

Recommended answer

Correct me if I'm wrong, but if you take a look at the LSTM equations:
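A standard formulation of the LSTM cell, writing W for the matrices applied to the input x_t and U for those applied to the previous hidden state h_{t-1} (in the same i, f, c, o gate order Keras uses), is:

i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)
f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)
\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
h_t = o_t \odot \tanh(c_t)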

You have 4 W matrices that transform the input and 4 U matrices that transform the hidden state.

Keras saves these sets of 4 matrices into the kernel and recurrent_kernel weight arrays. From the code that uses them:

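# kernel has shape (input_dim, 4 * units): the four W matrices side by side,
# one (input_dim, units) column block per gate, in i, f, c, o order.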
self.kernel_i = self.kernel[:, :self.units]
self.kernel_f = self.kernel[:, self.units: self.units * 2]
self.kernel_c = self.kernel[:, self.units * 2: self.units * 3]
self.kernel_o = self.kernel[:, self.units * 3:]

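# recurrent_kernel has shape (units, 4 * units): the four U matrices, split the same way.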
self.recurrent_kernel_i = self.recurrent_kernel[:, :self.units]
self.recurrent_kernel_f = self.recurrent_kernel[:, self.units: self.units * 2]
self.recurrent_kernel_c = self.recurrent_kernel[:, self.units * 2: self.units * 3]
self.recurrent_kernel_o = self.recurrent_kernel[:, self.units * 3:]

Apparently the 4 matrices are stored inside the weight arrays concatenated along the second dimension, which explains the weight array shapes.
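As a quick sanity check (a minimal sketch assuming TensorFlow 2.x with tf.keras; the units and input_dim values are arbitrary), you can build an LSTM layer and inspect these shapes yourself:

import numpy as np
import tensorflow as tf

units = 3       # number of LSTM units (arbitrary example value)
input_dim = 5   # input feature size (arbitrary example value)

# Run the layer once on dummy data so its weights get created.
layer = tf.keras.layers.LSTM(units)
layer(np.zeros((1, 7, input_dim), dtype="float32"))  # (batch, timesteps, features)

kernel, recurrent_kernel, bias = layer.get_weights()
print(kernel.shape)            # (5, 12)  == (input_dim, 4 * units)
print(recurrent_kernel.shape)  # (3, 12)  == (units, 4 * units)
print(bias.shape)              # (12,)    == (4 * units,)

# Each gate's matrix is a column slice, in i, f, c, o order:
W_i = kernel[:, :units]            # input-gate W, shape (input_dim, units)
U_i = recurrent_kernel[:, :units]  # input-gate U, shape (units, units)
print(W_i.shape, U_i.shape)        # (5, 3) (3, 3)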
