GRU with the same configuration but built in two different ways produces two different outputs in tensorflow


Problem description

I would like to do some sequence prediction in tensorflow using a GRU, so I have created the same model in 2 different ways as follows:

In model 1 I have 2 GRUs, one after the other; that is, new_state1, the final hidden state of the first GRU, acts as the initial state of the second GRU. Therefore, the model outputs new_state1 and new_state2 sequentially. Note that this is not a 2-layer model, but only 1 layer. In the code below, I divided the input and the output into 2 parts, where GRU1 takes the first part and the second GRU takes the second part.

Also, the random_seed is set and fixed for both models so that the results are comparable.

import tensorflow as tf
import numpy as np

cell_size = 32

seq_length = 1000

time_steps1 = 500
time_steps2 = seq_length - time_steps1

x_t = np.arange(1, seq_length + 1)    
x_t_plus_1 = np.arange(2, seq_length + 2)

tf.set_random_seed(123)

m_dtype = tf.float32

input_1 = tf.placeholder(dtype=m_dtype, shape=[None, time_steps1, 1], name="input_1")
input_2 = tf.placeholder(dtype=m_dtype, shape=[None, time_steps2, 1], name="input_2")

labels1 = tf.placeholder(dtype=m_dtype, shape=[None, time_steps1, 1], name="labels_1")
labels2 = tf.placeholder(dtype=m_dtype, shape=[None, time_steps2, 1], name="labels_2")

labels = tf.concat([labels1, labels2], axis=1, name="labels")

initial_state = tf.placeholder(shape=[None, cell_size], dtype=m_dtype, name="initial_state")

def model(input_feat1, input_feat2):
    with tf.variable_scope("GRU"):
        cell1 = tf.nn.rnn_cell.GRUCell(cell_size)
        cell2 = tf.nn.rnn_cell.GRUCell(cell_size)

        with tf.variable_scope("First50"):
            # output1: shape=[1, time_steps1, 32]
            output1, new_state1 = tf.nn.dynamic_rnn(cell1, input_feat1, dtype=m_dtype, initial_state=initial_state)

        with tf.variable_scope("Second50"):
            # output2: shape=[1, time_steps2, 32]
            output2, new_state2 = tf.nn.dynamic_rnn(cell2, input_feat2, dtype=m_dtype, initial_state=new_state1)

        with tf.variable_scope("output"):
            # output shape: [1, time_steps1 + time_steps2, 32] => [1, 1000, 32]
            output = tf.concat([output1, output2], axis=1)

            output = tf.reshape(output, shape=[-1, cell_size])
            output = tf.layers.dense(output, units=1)
            output = tf.reshape(output, shape=[1, time_steps1 + time_steps2, 1])

        with tf.variable_scope("outputs_1_2_reshaped"):
            output1 = tf.slice(input_=output, begin=[0, 0, 0], size=[-1, time_steps1, -1])
            output2 = tf.slice(input_=output, begin=[0, time_steps1, 0], size=[-1, time_steps2, 1])

            print(output.get_shape().as_list(), "1")
            print(output1.get_shape().as_list(), "2")
            print(output2.get_shape().as_list(), "3")

            return output, output1, output2, initial_state, new_state1, new_state2

output, output1, output2, initial_state, new_state1, new_state2 = model(input_1, input_2)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    to_run_list = [new_state1, new_state2]

    in1 = np.reshape(x_t[:time_steps1], newshape=(1, time_steps1, 1))
    in2 = np.reshape(x_t[time_steps1:], newshape=(1, time_steps2, 1))
    l1 = np.reshape(x_t_plus_1[:time_steps1], newshape=(1, time_steps1, 1))
    l2 = np.reshape(x_t_plus_1[time_steps1:], newshape=(1, time_steps2, 1))
    i_s = np.zeros([1, cell_size])

    new_s1, new_s2 = sess.run(to_run_list, feed_dict={input_1: in1,
                                                              input_2: in2,
                                                              labels1: l1,
                                                              labels2: l2,
                                                              initial_state: i_s})

    print(np.shape(new_s1), np.shape(new_s2))

    print(np.mean(new_s1), np.mean(new_s2))
    print(np.sum(new_s1), np.sum(new_s2))

In this model, instead of having 2 different GRUs, I created one, divided the input and labels into 2 different parts as well, and used a for loop to iterate over my input dataset. The final state is then taken and fed back into the same model as the initial state.

Note that in both model 1 and model 2 the very first initial state is zeros.

import tensorflow as tf
import numpy as np

cell_size = 32

seq_length = 1000

time_steps = 500

x_t = np.arange(1, seq_length + 1)    
x_t_plus_1 = np.arange(2, seq_length + 2)

tf.set_random_seed(123)

m_dtype = tf.float32

inputs = tf.placeholder(dtype=m_dtype, shape=[None, time_steps, 1], name="inputs")

labels = tf.placeholder(dtype=m_dtype, shape=[None, time_steps, 1], name="labels")

initial_state = tf.placeholder(shape=[None, cell_size], dtype=m_dtype, name="initial_state")

grads_initial_state = tf.placeholder(dtype=m_dtype, shape=[None, cell_size], name="prev_grads")

this_is_last_batch = tf.placeholder(dtype=tf.bool, name="this_is_last_batch")

def model(input_feat):
    with tf.variable_scope("GRU"):
        cell = tf.nn.rnn_cell.GRUCell(cell_size)

        with tf.variable_scope("cell"):
            # output: shape=[1, time_steps, 32]
            output, new_state = tf.nn.dynamic_rnn(cell, input_feat, dtype=m_dtype, initial_state=initial_state)

        with tf.variable_scope("output"):

            output = tf.reshape(output, shape=[-1, cell_size])
            output = tf.layers.dense(output, units=1)
            output = tf.reshape(output, shape=[1, time_steps, 1])

            print(output.get_shape().as_list(), "1")

            return output, new_state

output, new_state = model(inputs)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    # 1000 // 500 = 2
    num_iterations = seq_length // time_steps
    print("num_iterations:", num_iterations)

    final_states = []
    to_run_list = [new_state]

    for i in range(num_iterations):

        current_xt = x_t[i * time_steps: (i + 1)*time_steps]
        current_xt_plus_1 = x_t_plus_1[i*time_steps: (i + 1)*time_steps]

        in1 = np.reshape(current_xt, newshape=(1, time_steps, 1))
        l1 = np.reshape(current_xt_plus_1, newshape=(1, time_steps, 1))
        i_s = np.zeros([1, cell_size])

        if i == 0:
            new_s = sess.run(new_state, feed_dict={inputs: in1,
                                                   labels: l1,
                                                   initial_state: i_s})
            final_states.append(new_s)
            print("---->", np.mean(final_states[-1]), np.sum(final_states[-1]), i)
        else:
            new_s = sess.run(new_state, feed_dict={inputs: in1,
                                                   labels: l1,
                                                   initial_state: final_states[-1]})
            final_states.append(new_s)
            print("---->", np.mean(final_states[-1]), np.sum(final_states[-1]), i)

Finally, after printing out the statistics of new_state1 and new_state2 in model 1, they were different from new_state after each iteration in model 2.
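
(As an aside, comparing the full arrays rather than their means and sums avoids coincidental matches. Below is a minimal sketch, assuming the final states are saved from each script with np.save and compared in a separate one; the file names are made up for illustration.)

import numpy as np

# In model 1's script, after sess.run: np.save("states_model1.npy", np.stack([new_s1, new_s2]))
# In model 2's script, after the loop: np.save("states_model2.npy", np.stack(final_states))
states1 = np.load("states_model1.npy")
states2 = np.load("states_model2.npy")

# Element-wise comparison of the two models' final states, plus the largest difference.
print(np.allclose(states1, states2), np.max(np.abs(states1 - states2)))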

I would like to know how to fix this problem and why it is happening.

I found that the weight values of the GRU in the two files are different.
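
(One way to confirm that, as a minimal sketch: the dump_weights helper below is not part of the original scripts; calling it right after sess.run(init) in each file prints per-variable statistics for the GRU kernels/biases and the dense layer, so the two files can be compared line by line.)

def dump_weights(sess):
    # Iterate over every trainable variable in the current graph.
    for var in tf.trainable_variables():
        val = sess.run(var)
        print(var.name, val.shape, np.mean(val), np.sum(val))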

Now, how can I reproduce the same results in the 2 different files even after setting the random seed?

Any help is much appreciated!

Answer

So, to reproduce the same results in different files, tf.set_random_seed() is not enough. I figured out that we also need to set the seed for the initializers of the GRU cells as well as the initializers of the weights in the dense layer at the output (this is at least according to my model); so the definition of the cell is now:

cell1 = tf.nn.rnn_cell.GRUCell(cell_size, kernel_initializer=tf.glorot_normal_initializer(seed=123, dtype=m_dtype))

And for the dense layer:

output = tf.layers.dense(output, units=1, kernel_initializer=tf.glorot_uniform_initializer(seed=123, dtype=m_dtype))

Note that any other initializer could be used as long as we set the seed and the dtype for it.
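
For reference, here is a minimal self-contained sketch (assuming TensorFlow 1.x, the same API the question uses) that applies both seeded initializers and prints the initialized weight statistics; running it from two different files should print identical numbers:

import tensorflow as tf
import numpy as np

cell_size = 32
m_dtype = tf.float32

tf.set_random_seed(123)

# Seed the GRU cell's kernel initializer in addition to the graph-level seed.
cell = tf.nn.rnn_cell.GRUCell(
    cell_size,
    kernel_initializer=tf.glorot_normal_initializer(seed=123, dtype=m_dtype))

inputs = tf.placeholder(dtype=m_dtype, shape=[None, 10, 1])
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=m_dtype)

# Seed the dense output layer's kernel initializer as well.
outputs = tf.reshape(outputs, shape=[-1, cell_size])
outputs = tf.layers.dense(
    outputs, units=1,
    kernel_initializer=tf.glorot_uniform_initializer(seed=123, dtype=m_dtype))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for var in tf.trainable_variables():
        val = sess.run(var)
        print(var.name, np.mean(val), np.sum(val))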

