Tensorboard - visualize weights of LSTM

Problem description

I am using several LSTM layers to form a deep recurrent neural network. I would like to monitor the weights of each LSTM layer during training. However, I couldn't find out how to attach summaries of the LSTM layer weights to TensorBoard.

Any suggestions on how this can be done?

Here is the code:

cells = []  # note: the cells appended below are DropoutWrapper objects wrapping the LSTMCells

with tf.name_scope("cell_1"):
    cell1 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    cell1 = tf.contrib.rnn.DropoutWrapper(cell1,
                input_keep_prob=self.input_dropout,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell1)

with tf.name_scope("cell_2"):
    cell2 = tf.contrib.rnn.LSTMCell(self.n_hidden, state_is_tuple=True, initializer=self.initializer)
    cell2 = tf.contrib.rnn.DropoutWrapper(cell2,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell2)

with tf.name_scope("cell_3"):
    cell3 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    # cell has no input dropout since previous cell already has output dropout
    cell3 = tf.contrib.rnn.DropoutWrapper(cell3,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell3)

cell = tf.contrib.rnn.MultiRNNCell(
    cells, state_is_tuple=True)

output, self.final_state = tf.nn.dynamic_rnn(
    cell,
    inputs=self.inputs,
    initial_state=self.init_state)

Recommended answer

tf.contrib.rnn.LSTMCell objects have a property called variables that works for this. There's just one trick: The property returns an empty list until your cell goes through tf.nn.dynamic_rnn. At least this is the case when using a single LSTMCell. I can't speak for MultiRNNCell. So I expect this would work:

output, self.final_state = tf.nn.dynamic_rnn(...)
for one_lstm_cell in cells:
    one_kernel, one_bias = one_lstm_cell.variables
    # I think TensorBoard handles summaries with the same name fine.
    tf.summary.histogram("Kernel", one_kernel)
    tf.summary.histogram("Bias", one_bias)

And then you probably know how to do it from there, but

summary_op = tf.summary.merge_all()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(
        "my/preferred/logdir/train", graph=tf.get_default_graph())
    for step in range(1, training_steps+1):
        ...
        _, step_summary = sess.run([train_op, summary_op])
        train_writer.add_summary(step_summary, step)  # pass the step so points line up on the x-axis

Looking at the TensorFlow documentation for LSTMCell, there's also a weights property. I don't know the difference, if there is any. And, the order of the returned variables isn't documented. I figured it out by printing the resulting list and looking at the variable names.
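
For example, a quick way to check that order yourself (after the graph has been built through tf.nn.dynamic_rnn, and assuming the cells list from the question):

# print the name and shape of each variable to see which is which
for v in cells[0].variables:
    print(v.name, v.shape)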

Now, MultiRNNCell has the same variables property according to its doc and it says it returns all layer variables. I honestly don't know how MultiRNNCell works, so I cannot tell you whether these are variables belonging exclusively to MultiRNNCell or if it includes variables from the cells that go into it. Either way, knowing the property exists should be a nice tip! Hope this helps.
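
If MultiRNNCell.variables does include the nested cells' variables, a hedged sketch for summarizing all of them at once might look like this (cell being the MultiRNNCell from the question):

# speculative: relies on cell.variables aggregating the inner cells' variables
for var in cell.variables:
    # ':' is not a legal summary-tag character, so rewrite names like 'kernel:0'
    tf.summary.histogram(var.name.replace(":", "_"), var)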

Although variables is documented for most (all?) RNN classes, it breaks for DropoutWrapper. The property has been documented since r1.2, but accessing it raises an exception in 1.2 and 1.4 (and apparently 1.3, though that's untested). Specifically,

import tensorflow as tf
from tensorflow.contrib import rnn
...
lstm_cell = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
wrapped_cell = rnn.DropoutWrapper(lstm_cell)
outputs, states = rnn.static_rnn(wrapped_cell, x, dtype=tf.float32)
print("LSTM vars!", lstm_cell.variables)
print("Wrapped vars!", wrapped_cell.variables)

will throw AttributeError: 'DropoutWrapper' object has no attribute 'trainable'. From the traceback (or a long stare at the DropoutWrapper source), I noticed that variables is implemented in DropoutWrapper's super RNNCell's super Layer. Dizzy yet? Indeed, we find the documented variables property here. It returns the (documented) weights property. The weights property returns the (documented) self.trainable_weights + self.non_trainable_weights properties. And finally the root of the problem:

@property
def trainable_weights(self):
    return self._trainable_weights if self.trainable else []

@property
def non_trainable_weights(self):
    if self.trainable:
        return self._non_trainable_weights
    else:
        return self._trainable_weights + self._non_trainable_weights

That is, variables does not work for a DropoutWrapper instance. Neither will trainable_weights or non_trainable_weights, since self.trainable is never defined.

One step deeper, Layer.__init__ defaults self.trainable to True, but DropoutWrapper never calls it. To quote a TensorFlow contributor on Github,

DropoutWrapper does not have variables because it does not itself store any. It wraps a cell that may have variables; but it's not clear what the semantics should be if you access the DropoutWrapper.variables. For example, all keras layers only report back the variables that they own; and so only one layer ever owns any variable. That said, this should probably return [], and the reason it doesn't is that DropoutWrapper never calls super().__init__ in its constructor. That should be an easy fix; PRs welcome.

So for instance, to access the LSTM variables in the above example, lstm_cell.variables suffices.
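
Applied to the question's code, that means keeping handles to the raw LSTMCells before the names get reassigned to their wrappers. A minimal sketch (the names lstm1 and inner_cells are mine):

inner_cells = []

with tf.name_scope("cell_1"):
    lstm1 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True,
                                    initializer=self.initializer)
    inner_cells.append(lstm1)           # keep the unwrapped cell reachable
    cells.append(tf.contrib.rnn.DropoutWrapper(lstm1,
                 input_keep_prob=self.input_dropout,
                 output_keep_prob=self.output_dropout,
                 state_keep_prob=self.recurrent_dropout))

# ... same for the other cells, then after tf.nn.dynamic_rnn:
for i, inner in enumerate(inner_cells):
    kernel, bias = inner.variables      # plain LSTMCell, so this works
    tf.summary.histogram("cell_%d/kernel" % i, kernel)
    tf.summary.histogram("cell_%d/bias" % i, bias)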

To the best of my knowledge, Mike Khan's PR has been incorporated into 1.5. Now, the variables property of the dropout layer returns an empty list.
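
So on 1.5+ a loop over wrapped cells won't crash, it just won't find anything; if you iterate over a mixed list of wrapped and unwrapped cells, you could skip the empties, e.g.:

for c in cells:
    if not c.variables:  # DropoutWrapper yields [] on TF >= 1.5
        continue
    for var in c.variables:
        tf.summary.histogram(var.name.replace(":", "_"), var)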
