TensorBoard - visualize weights of LSTM
Question
I am using several LSTM layers to form a deep recurrent neural network. I would like to monitor the weights of each LSTM layer during training. However, I couldn't find out how to attach summaries of the LSTM layer weights to TensorBoard.
Any suggestions on how this can be done?
The code is as follows:
cells = []
with tf.name_scope("cell_1"):
    cell1 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    cell1 = tf.contrib.rnn.DropoutWrapper(
        cell1,
        input_keep_prob=self.input_dropout,
        output_keep_prob=self.output_dropout,
        state_keep_prob=self.recurrent_dropout)
    cells.append(cell1)
with tf.name_scope("cell_2"):
    cell2 = tf.contrib.rnn.LSTMCell(self.n_hidden, state_is_tuple=True, initializer=self.initializer)
    cell2 = tf.contrib.rnn.DropoutWrapper(
        cell2,
        output_keep_prob=self.output_dropout,
        state_keep_prob=self.recurrent_dropout)
    cells.append(cell2)
with tf.name_scope("cell_3"):
    cell3 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    # cell has no input dropout since previous cell already has output dropout
    cell3 = tf.contrib.rnn.DropoutWrapper(
        cell3,
        output_keep_prob=self.output_dropout,
        state_keep_prob=self.recurrent_dropout)
    cells.append(cell3)

cell = tf.contrib.rnn.MultiRNNCell(
    cells, state_is_tuple=True)

output, self.final_state = tf.nn.dynamic_rnn(
    cell,
    inputs=self.inputs,
    initial_state=self.init_state)
Recommended answer
tf.contrib.rnn.LSTMCell objects have a property called variables that works for this. There's just one trick: the property returns an empty list until your cell goes through tf.nn.dynamic_rnn. At least this is the case when using a single LSTMCell. I can't speak for MultiRNNCell. So I expect this would work:
output, self.final_state = tf.nn.dynamic_rnn(...)
for one_lstm_cell in cells:
    one_kernel, one_bias = one_lstm_cell.variables
    # I think TensorBoard handles summaries with the same name fine.
    tf.summary.histogram("Kernel", one_kernel)
    tf.summary.histogram("Bias", one_bias)
And then you probably know how to do it from there, but
summary_op = tf.summary.merge_all()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(
        "my/preferred/logdir/train", graph=tf.get_default_graph())
    for step in range(1, training_steps+1):
        ...
        _, step_summary = sess.run([train_op, summary_op])
        # Pass the step so TensorBoard can plot the histograms over time.
        train_writer.add_summary(step_summary, global_step=step)
Looking at the TensorFlow documentation I linked above, there's also a weights property. I don't know the difference, if there is any. Also, the order in which variables returns its entries isn't documented. I figured it out by printing the resulting list and looking at the variable names.
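Since the order is undocumented, one option is to select each variable by its name instead of by position, and to build a distinct summary tag per layer. The following is only a minimal sketch of that idea in plain Python: FakeVariable is a stand-in for a TensorFlow variable (only its .name attribute is used), the variable names are hypothetical examples of what printing the list might show, and pick_by_suffix and summary_tag are helper names invented here, not part of TensorFlow.

```python
from collections import namedtuple

# Stand-in for a TensorFlow variable: only the `.name` attribute is used.
# (Assumption for illustration; real code would pass cell.variables.)
FakeVariable = namedtuple("FakeVariable", ["name"])


def pick_by_suffix(variables, suffix):
    """Return the single variable whose name (without ':0') ends in '/<suffix>'."""
    matches = [v for v in variables
               if v.name.split(":")[0].endswith("/" + suffix)]
    if len(matches) != 1:
        # Fail loudly rather than guess, e.g. if a cell has extra variables.
        raise ValueError(
            "expected exactly one %r variable, found %d" % (suffix, len(matches)))
    return matches[0]


def summary_tag(layer_index, suffix):
    """Build a per-layer tag such as 'cell_1/kernel' so histograms stay separate."""
    return "cell_%d/%s" % (layer_index, suffix)


# Hypothetical variable names, deliberately listed bias-first.
example_vars = [
    FakeVariable("rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0"),
    FakeVariable("rnn/multi_rnn_cell/cell_0/lstm_cell/kernel:0"),
]

kernel = pick_by_suffix(example_vars, "kernel")
bias = pick_by_suffix(example_vars, "bias")
print(summary_tag(1, "kernel"), "->", kernel.name)
print(summary_tag(1, "bias"), "->", bias.name)
```

With real cells, the same helpers would presumably be called on one_lstm_cell.variables after tf.nn.dynamic_rnn has run, with the result passed to tf.summary.histogram under the generated tag. Cells that create more than two variables would need additional suffixes.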
Now, MultiRNNCell has the same variables property according to its doc, and it says it returns all layer variables. I honestly don't know how MultiRNNCell works, so I cannot tell you whether these are variables belonging exclusively to MultiRNNCell or whether it includes variables from the cells that go into it. Either way, knowing the property exists should be a nice tip! Hope this helps.
Although variables is documented for most (all?) RNN classes, it does break for DropoutWrapper. The property has been documented since r1.2, but accessing the property causes an exception in 1.2 and 1.4 (and looks like 1.3, but untested). Specifically,
from tensorflow.contrib import rnn
...
lstm_cell = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
wrapped_cell = rnn.DropoutWrapper(lstm_cell)
outputs, states = rnn.static_rnn(wrapped_cell, x, dtype=tf.float32)
print("LSTM vars!", lstm_cell.variables)
print("Wrapped vars!", wrapped_cell.variables)
will throw AttributeError: 'DropoutWrapper' object has no attribute 'trainable'. From the traceback (or a long stare at the DropoutWrapper source), I noticed that variables is implemented in DropoutWrapper's super RNNCell's super Layer. Dizzy yet? Indeed, we find the documented variables property here. It returns the (documented) weights property. The weights property returns the (documented) self.trainable_weights + self.non_trainable_weights properties. And finally the root of the problem:
@property
def trainable_weights(self):
    return self._trainable_weights if self.trainable else []

@property
def non_trainable_weights(self):
    if self.trainable:
        return self._non_trainable_weights
    else:
        return self._trainable_weights + self._non_trainable_weights
That is, variables does not work for a DropoutWrapper instance. Neither will trainable_weights or non_trainable_weights, since self.trainable is not defined.
One step deeper, Layer.__init__ defaults self.trainable to True, but DropoutWrapper never calls it. To quote a TensorFlow contributor on GitHub,
DropoutWrapper does not have variables because it does not itself store any. It wraps a cell that may have variables; but it's not clear what the semantics should be if you access the DropoutWrapper.variables. For example, all keras layers only report back the variables that they own; and so only one layer ever owns any variable. That said, this should probably return [], and the reason it doesn't is that DropoutWrapper never calls super().__init__ in its constructor. That should be an easy fix; PRs welcome.
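The failure mode described above can be reproduced without TensorFlow. The sketch below is a plain-Python stand-in, not the TensorFlow source: the Layer class only mirrors the property chain quoted earlier (variables, weights, trainable_weights, non_trainable_weights), and BrokenWrapper and FixedWrapper are hypothetical names used to contrast a constructor that skips super().__init__ with one that calls it.

```python
class Layer(object):
    """Simplified stand-in for the base layer class (not the TensorFlow source)."""

    def __init__(self):
        self.trainable = True
        self._trainable_weights = []
        self._non_trainable_weights = []

    @property
    def trainable_weights(self):
        return self._trainable_weights if self.trainable else []

    @property
    def non_trainable_weights(self):
        if self.trainable:
            return self._non_trainable_weights
        return self._trainable_weights + self._non_trainable_weights

    @property
    def weights(self):
        return self.trainable_weights + self.non_trainable_weights

    @property
    def variables(self):
        return self.weights


class BrokenWrapper(Layer):
    """Never calls Layer.__init__, so self.trainable is never set."""

    def __init__(self, cell):
        self._cell = cell


class FixedWrapper(Layer):
    """Calls Layer.__init__ first, so the property chain has what it needs."""

    def __init__(self, cell):
        super(FixedWrapper, self).__init__()
        self._cell = cell


try:
    BrokenWrapper(cell=None).variables
    error_message = None
except AttributeError as err:
    error_message = str(err)

print("broken:", error_message)
print("fixed:", FixedWrapper(cell=None).variables)
```

In this stand-in, the fixed wrapper reports an empty list because it owns no variables itself, which matches the semantics suggested in the quote; the wrapped cell's variables still have to be read from the cell directly.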
So for instance, to access the LSTM variables in the above example, lstm_cell.variables suffices.
To the best of my knowledge, Mike Khan's PR has been incorporated into 1.5. Now, the variables property of the dropout layer returns an empty list.