How to interpret weights in a LSTM layer in Keras

Question

I'm currently training a recurrent neural network for weather forecasting, using a LSTM layer. The network itself is pretty simple and looks roughly like this:

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

model = Sequential()
model.add(LSTM(hidden_neurons, input_shape=(time_steps, feature_count), return_sequences=False))
model.add(Dense(feature_count))
model.add(Activation("linear"))

The weights of the LSTM layer do have the following shapes:

for weight in model.get_weights(): # weights from Dense layer omitted
    print(weight.shape)

> (feature_count, hidden_neurons)
> (hidden_neurons, hidden_neurons)
> (hidden_neurons,)
> (feature_count, hidden_neurons)
> (hidden_neurons, hidden_neurons)
> (hidden_neurons,)
> (feature_count, hidden_neurons)
> (hidden_neurons, hidden_neurons)
> (hidden_neurons,)
> (feature_count, hidden_neurons)
> (hidden_neurons, hidden_neurons)
> (hidden_neurons,)

In short, it looks like there are four "elements" in this LSTM layer. I'm wondering now how to interpret them:

  • Where is the time_steps parameter in this representation? How does it influence the weights?

  • I've read that an LSTM consists of several blocks, like an input and a forget gate. If those are represented in these weight matrices, which matrix belongs to which gate?

  • Is there any way to see what the network has learned? For example, how much does it take from the last time step (t-1 if we want to forecast t) and how much from t-2, etc.? It would be interesting to know whether we could read from the weights that the input at t-5 is completely irrelevant, for example.

Clarifications and hints would be greatly appreciated.

Answer

If you're using Keras 2.2.0, when you print

print(model.layers[0].trainable_weights)

you should see three tensors: lstm_1/kernel, lstm_1/recurrent_kernel, lstm_1/bias:0 (the twelve separate arrays shown in the question are how older Keras 1.x releases exposed the same weights). One of the dimensions of each tensor should be a product of

4 * number_of_units

where number_of_units is your number of neurons. Try:

units = int(int(model.layers[0].trainable_weights[0].shape[1])/4)
print("No units: ", units)

That is because each tensor contains weights for four LSTM units (in that order):

i (input), f (forget), c (cell state) and o (output)

Therefore, in order to extract the weights, you can simply use the slice operator:

W = model.layers[0].get_weights()[0]  # kernel, shape (feature_count, 4 * units)
U = model.layers[0].get_weights()[1]  # recurrent kernel, shape (units, 4 * units)
b = model.layers[0].get_weights()[2]  # bias, shape (4 * units,)

W_i = W[:, :units]
W_f = W[:, units: units * 2]
W_c = W[:, units * 2: units * 3]
W_o = W[:, units * 3:]

U_i = U[:, :units]
U_f = U[:, units: units * 2]
U_c = U[:, units * 2: units * 3]
U_o = U[:, units * 3:]

b_i = b[:units]
b_f = b[units: units * 2]
b_c = b[units * 2: units * 3]
b_o = b[units * 3:]
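
To make the gate layout concrete, here is a minimal NumPy sketch (not part of the original answer) that replays the LSTM forward pass with the slices extracted above. It assumes activation='tanh' and recurrent_activation='sigmoid'; note that Keras 2.2.0 actually defaults to hard_sigmoid for the recurrent activation, so the numbers there would differ slightly from model.predict. It also shows why time_steps never appears in the weight shapes: the same W, U and b are reused at every step, and only h and c carry information across time.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev):
    # One LSTM step built from the W_*, U_*, b_* slices above
    i = sigmoid(x_t @ W_i + h_prev @ U_i + b_i)                   # input gate
    f = sigmoid(x_t @ W_f + h_prev @ U_f + b_f)                   # forget gate
    c = f * c_prev + i * np.tanh(x_t @ W_c + h_prev @ U_c + b_c)  # new cell state
    o = sigmoid(x_t @ W_o + h_prev @ U_o + b_o)                   # output gate
    h = o * np.tanh(c)                                            # new hidden state
    return h, c

# `sequence` is assumed to be one input sample of shape (time_steps, feature_count)
h, c = np.zeros(units), np.zeros(units)
for x_t in sequence:
    h, c = lstm_step(x_t, h, c)
# With return_sequences=False, `h` now corresponds to the LSTM layer's output.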

Source: the Keras code
