Internal variables in BasicRNNCell


Question

I have the following example code to test BasicRNNCell. I'd like to get its internal matrices so that I can compute output_res and newstate_res with my own code and confirm that I can reproduce them.

The TensorFlow source code says output = new_state = act(W * input + U * state + B). Does anybody know how I can get W and U? (I tried to access cell._kernel, but it is not available.)

$ cat ./main.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:

import tensorflow as tf
import numpy as np

batch_size = 4
vector_size = 3

inputs = tf.placeholder(
        tf.float32
        , [batch_size, vector_size]
        )

num_units = 2
state = tf.zeros([batch_size, num_units], tf.float32)

cell = tf.contrib.rnn.BasicRNNCell(num_units=num_units)
output, newstate = cell(inputs = inputs, state = state)

X = np.zeros([batch_size, vector_size])
#X = np.ones([batch_size, vector_size])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    output_res, newstate_res = sess.run([output, newstate], feed_dict = {inputs: X})
    print(output_res)
    print(newstate_res)

$ ./main.py
[[ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]]
[[ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]]

Recommended answer

Short answer: you are right that cell._kernel is what you are after. Here is some code that gets the kernel (and bias) through the variables property, which most TensorFlow RNN cells have:

import tensorflow as tf
import numpy as np

batch_size = 4
vector_size = 3
inputs = tf.placeholder(tf.float32, [batch_size, vector_size])

num_units = 2
state = tf.zeros([batch_size, num_units], tf.float32)

cell = tf.contrib.rnn.BasicRNNCell(num_units=num_units)
output, newstate = cell(inputs=inputs, state=state)

print("Output of cell.variables is a list of Tensors:")
print(cell.variables)
kernel, bias = cell.variables

X = np.zeros([batch_size, vector_size])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    output_, newstate_, k_, b_ = sess.run(
        [output, newstate, kernel, bias], feed_dict = {inputs: X})
    print("Output:")
    print(output_)
    print("New State == Output:")
    print(newstate_)
    print("\nKernel:")
    print(k_)
    print("\nBias:")
    print(b_)

Output

Output of cell.variables is a list of Tensors:
[<tf.Variable 'basic_rnn_cell/kernel:0' shape=(5, 2) dtype=float32_ref>, 
<tf.Variable 'basic_rnn_cell/bias:0' shape=(2,) dtype=float32_ref>]
Output:
[[ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]]
New State == Output:
[[ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]]

Kernel:
[[ 0.41417515 -0.64997244]
 [-0.40868729 -0.90995187]
 [ 0.62134564 -0.88962835]
 [-0.35878009 -0.25680023]
 [ 0.35606658 -0.83596271]]

Bias:
[ 0.  0.]

Long answer: you also ask how to get W and U. Let me copy the implementation of call and discuss where W and U are.

def call(self, inputs, state):
    """Most basic RNN: output = new_state = act(W * input + U * state + B)."""

    gate_inputs = math_ops.matmul(
        array_ops.concat([inputs, state], 1), self._kernel)
    gate_inputs = nn_ops.bias_add(gate_inputs, self._bias)
    output = self._activation(gate_inputs)
    return output, output

It doesn't look like there is a W and a U, but they are there. Essentially, the first vector_size rows of the kernel are W and the next num_units rows of the kernel are U. It may help to look at the element-wise math.

I'm using m as a generic batch index, v as vector_size, n as num_units, and b as batch_size. Also, [ ; ] denotes concatenation. Since TensorFlow is batch-major, implementations usually multiply by the weight matrix on the right.
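With that notation, the computation in call can be written out roughly as follows. This is a reconstruction from the call implementation quoted above, not the original typeset formula. The symbols K (the kernel), X (the inputs), H (the state) and B (the bias) are names chosen here for illustration.

```latex
% m: batch index (1..b), v: vector_size, n: num_units
% K: kernel of shape (v + n, n); X: inputs (b, v); H: state (b, n); B: bias (n,)
\begin{aligned}
h^{\text{new}}_{m,j}
  &= \operatorname{act}\!\Big(
       \sum_{i=1}^{v} x_{m,i}\, K_{i,j}
     + \sum_{k=1}^{n} h_{m,k}\, K_{v+k,\,j}
     + B_j \Big),
  \qquad j = 1, \dots, n \\[4pt]
[\, X \; ; \; H \,]\, K
  &= X W + H U,
  \qquad W = K_{1:v,\;:}, \quad U = K_{v+1:v+n,\;:}
\end{aligned}
```

The second line is the reason a single kernel is enough: multiplying the concatenated [inputs ; state] by the stacked kernel gives the same result as multiplying inputs by W and state by U separately and adding the two.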

And since this is a very basic RNN, output == new_state. The "history" for the next iteration is simply the output of the current iteration.
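To check the split by hand, here is a minimal NumPy sketch that needs no TensorFlow. It assumes the activation is tanh (the default for BasicRNNCell) and reuses the kernel values printed above. Those values come from random initialization, so your own run will print different numbers. The helper name basic_rnn_step and the chosen input values are illustrative only.

```python
import numpy as np

vector_size, num_units, batch_size = 3, 2, 4

# Kernel and bias copied from the printed output above (random init, so
# these numbers are specific to that run).
kernel = np.array([
    [ 0.41417515, -0.64997244],
    [-0.40868729, -0.90995187],
    [ 0.62134564, -0.88962835],
    [-0.35878009, -0.25680023],
    [ 0.35606658, -0.83596271],
])
bias = np.zeros(num_units)

# The first vector_size rows act on the input, the remaining num_units
# rows act on the state.
W = kernel[:vector_size, :]   # shape (3, 2)
U = kernel[vector_size:, :]   # shape (2, 2)


def basic_rnn_step(x, h, W, U, b):
    """One step of the basic RNN, assuming a tanh activation."""
    return np.tanh(x @ W + h @ U + b)


# Non-zero input and state, so the result is not trivially all zeros.
x = np.ones((batch_size, vector_size))
h = np.full((batch_size, num_units), 0.5)

split_result = basic_rnn_step(x, h, W, U, bias)

# Same computation in the form used by call(): concatenate, then one matmul.
concat_result = np.tanh(np.concatenate([x, h], axis=1) @ kernel + bias)

print(np.allclose(split_result, concat_result))
```

In the original example the inputs, the initial state and the bias are all zero, so the output is tanh(0) = 0 whatever the kernel holds. Feeding a non-zero X, such as the commented-out np.ones line in the question, makes the comparison against the TensorFlow output meaningful.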
