How to interpret weights in an LSTM layer in Keras
Question
I'm currently training a recurrent neural network for weather forecasting, using an LSTM layer. The network itself is pretty simple and looks roughly like this:
from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

model = Sequential()
model.add(LSTM(hidden_neurons, input_shape=(time_steps, feature_count), return_sequences=False))
model.add(Dense(feature_count))
model.add(Activation("linear"))
The weights of the LSTM layer have the following shapes:
for weight in model.get_weights(): # weights from Dense layer omitted
print(weight.shape)
> (feature_count, hidden_neurons)
> (hidden_neurons, hidden_neurons)
> (hidden_neurons,)
> (feature_count, hidden_neurons)
> (hidden_neurons, hidden_neurons)
> (hidden_neurons,)
> (feature_count, hidden_neurons)
> (hidden_neurons, hidden_neurons)
> (hidden_neurons,)
> (feature_count, hidden_neurons)
> (hidden_neurons, hidden_neurons)
> (hidden_neurons,)
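The twelve shapes above fall into four groups of three: an input kernel, a recurrent kernel, and a bias each. As a quick arithmetic check of the total parameter count (using hypothetical example sizes, not the ones from my actual model):

```python
# Sanity check of the LSTM parameter count implied by the twelve arrays above.
# feature_count and hidden_neurons are hypothetical example sizes.
feature_count = 3
hidden_neurons = 8

# Each group has an input kernel, a recurrent kernel, and a bias.
per_group = (feature_count * hidden_neurons    # (feature_count, hidden_neurons)
             + hidden_neurons * hidden_neurons  # (hidden_neurons, hidden_neurons)
             + hidden_neurons)                  # (hidden_neurons,)
total = 4 * per_group
print(total)  # 4 * (3*8 + 8*8 + 8) = 384
```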
In short, it looks like there are four "elements" in this LSTM layer. I'm now wondering how to interpret them:
- Where is the time_steps parameter in this representation? How does it influence the weights?
- I've read that an LSTM consists of several blocks, like an input gate and a forget gate. If those are represented in these weight matrices, which matrix belongs to which gate?
- Is there any way to see what the network has learned? For example, how much does it take from the last time step (t-1 if we want to forecast t) and how much from t-2, etc.? It would be interesting to know if we could read from the weights that the input t-5 is completely irrelevant, for example.
Clarifications and hints would be greatly appreciated.
Answer
If you are using Keras 2.2.0, then when you print
print(model.layers[0].trainable_weights)
you should see three tensors: lstm_1/kernel, lstm_1/recurrent_kernel, lstm_1/bias:0
One of the dimensions of each tensor should be a product of
4 * number_of_units
where number_of_units is your number of neurons. Try:
units = int(int(model.layers[0].trainable_weights[0].shape[1])/4)
print("No units: ", units)
That is because each tensor contains the weights for the four LSTM gates, in this order:
i (input), f (forget), c (cell state) and o (output)
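To see where each slice enters the computation, here is a minimal NumPy sketch of a single step of the standard LSTM cell equations (all array names and sizes here are illustrative, not taken from a real trained model):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

units, features = 4, 3
rng = np.random.default_rng(0)

# Hypothetical gate weights, as sliced from kernel / recurrent_kernel / bias.
W_i, W_f, W_c, W_o = (rng.standard_normal((features, units)) for _ in range(4))
U_i, U_f, U_c, U_o = (rng.standard_normal((units, units)) for _ in range(4))
b_i, b_f, b_c, b_o = (np.zeros(units) for _ in range(4))

x_t = rng.standard_normal(features)  # input at time step t
h_prev = np.zeros(units)             # previous hidden state
c_prev = np.zeros(units)             # previous cell state

i = sigmoid(x_t @ W_i + h_prev @ U_i + b_i)      # input gate
f = sigmoid(x_t @ W_f + h_prev @ U_f + b_f)      # forget gate
c_bar = np.tanh(x_t @ W_c + h_prev @ U_c + b_c)  # candidate cell state
o = sigmoid(x_t @ W_o + h_prev @ U_o + b_o)      # output gate

c_t = f * c_prev + i * c_bar  # new cell state
h_t = o * np.tanh(c_t)        # new hidden state
print(h_t.shape)  # (4,)
```

This also answers where time_steps lives: it is not in the weights at all. The same W, U, and b are reused at every time step; time_steps only determines how many times this cell update is unrolled.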
Therefore, in order to extract the weights you can simply use the slice operator:
W = model.layers[0].get_weights()[0]  # kernel, shape (feature_count, 4 * units)
U = model.layers[0].get_weights()[1]  # recurrent kernel, shape (units, 4 * units)
b = model.layers[0].get_weights()[2]  # bias, shape (4 * units,)

W_i = W[:, :units]
W_f = W[:, units: units * 2]
W_c = W[:, units * 2: units * 3]
W_o = W[:, units * 3:]

U_i = U[:, :units]
U_f = U[:, units: units * 2]
U_c = U[:, units * 2: units * 3]
U_o = U[:, units * 3:]

b_i = b[:units]
b_f = b[units: units * 2]
b_c = b[units * 2: units * 3]
b_o = b[units * 3:]
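As a self-contained check of the slicing logic, the same pattern can be run with NumPy stand-ins in place of real trained weights (the sizes here are purely illustrative):

```python
import numpy as np

features, units = 3, 5

# Stand-ins for the three trained arrays, with Keras 2.2.0 shapes.
kernel = np.zeros((features, 4 * units))
recurrent_kernel = np.zeros((units, 4 * units))
bias = np.zeros(4 * units)

gates = {}
for idx, name in enumerate(["i", "f", "c", "o"]):  # Keras gate order
    cols = slice(idx * units, (idx + 1) * units)
    gates[name] = (kernel[:, cols], recurrent_kernel[:, cols], bias[cols])

W_i, U_i, b_i = gates["i"]
print(W_i.shape, U_i.shape, b_i.shape)  # (3, 5) (5, 5) (5,)
```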
Source: Keras code