Reusing LSTM variables in TensorFlow

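Question

I'm trying to build an RNN using LSTM. The model is an LSTM followed by two fully connected (DNN) layers and one regression output layer.

I trained the model on my data, and the final training loss was about 0.009. However, when I applied the model to the test data, the loss was about 0.5.

The training loss in the first epoch was also about 0.5, so I think the trained variables are not being used by the test model.

The only difference between the training model and the test model is the batch size: the training batch size is 100 to 200, and the test batch size is 1.

In the main function I create the LSTM instances. The model is built in the LSTM initializer: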

def __init__(self,config,train_model=None):
    self.sess = sess = tf.Session()

    self.num_steps = num_steps = config.num_steps
    self.lstm_size = lstm_size = config.lstm_size
    self.num_features = num_features = config.num_features
    self.num_layers = num_layers = config.num_layers
    self.num_hiddens = num_hiddens = config.num_hiddens
    self.batch_size = batch_size = config.batch_size
    self.train = train = config.train
    self.epoch = config.epoch
    self.learning_rate = learning_rate = config.learning_rate

    with tf.variable_scope('model') as scope:        
        self.lstm_cell = lstm_cell = tf.nn.rnn_cell.LSTMCell(lstm_size,initializer = tf.contrib.layers.xavier_initializer(uniform=False))
        self.cell = cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_layers)

    with tf.name_scope('placeholders'):
        self.x = tf.placeholder(tf.float32,[self.batch_size,num_steps,num_features],
                                name='input-x')
        self.y = tf.placeholder(tf.float32, [self.batch_size,num_features],name='input-y')
        self.init_state = cell.zero_state(self.batch_size,tf.float32)
    with tf.variable_scope('model'):
        self.W1 = tf.Variable(tf.truncated_normal([lstm_size*num_steps,num_hiddens],stddev=0.1),name='W1')
        self.b1 = tf.Variable(tf.truncated_normal([num_hiddens],stddev=0.1),name='b1')
        self.W2 = tf.Variable(tf.truncated_normal([num_hiddens,num_hiddens],stddev=0.1),name='W2')
        self.b2 = tf.Variable(tf.truncated_normal([num_hiddens],stddev=0.1),name='b2')
        self.W3 = tf.Variable(tf.truncated_normal([num_hiddens,num_features],stddev=0.1),name='W3')
        self.b3 = tf.Variable(tf.truncated_normal([num_features],stddev=0.1),name='b3')


    self.output, self.loss = self.inference()
    tf.initialize_all_variables().run(session=sess)                
    tf.initialize_variables([self.b2]).run(session=sess)

    if train_model == None:
        self.train_step = tf.train.GradientDescentOptimizer(self.learning_rate).minimize(self.loss)

Using the LSTM initializer above, the LSTM instances below are created:

with tf.variable_scope("model",reuse=None):
    train_model = LSTM(main_config)
with tf.variable_scope("model", reuse=True):
    predict_model = LSTM(predict_config)

After creating the two LSTM instances, I trained train_model and then fed the test set to predict_model.

Why are the variables not reused?

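Answer

The problem is that, if you are reusing a scope, you should create your variables with tf.get_variable() instead of tf.Variable(). tf.Variable() always creates a new variable and ignores the reuse flag of the enclosing scope, so train_model and predict_model each end up with their own copies of W1 to b3.

Take a look at the TensorFlow tutorial on sharing variables; it explains this in more detail.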
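The difference between the two calls can be sketched without TensorFlow at all. The block below is a pure-Python toy model, not TensorFlow's implementation: ToyVariableStore and its methods are hypothetical names made up for illustration, and the renaming rule is simplified. It only shows why an "always create" call leaves the test model with untrained weights, while a "look up by name" call shares them.

```python
# Toy model only: a pure-Python stand-in for name-based variable sharing.
# None of these names are TensorFlow APIs; they only mimic the idea.


class ToyVariableStore:
    """Holds variables keyed by full name, like a graph-wide registry."""

    def __init__(self):
        self.variables = {}

    def variable(self, name, value):
        """Mimics tf.Variable(): always creates a new entry, renaming on clashes."""
        unique, suffix = name, 0
        while unique in self.variables:
            suffix += 1
            unique = "{}_{}".format(name, suffix)
        self.variables[unique] = [value]  # one-element list acts as a mutable box
        return self.variables[unique]

    def get_variable(self, name, value, reuse):
        """Mimics tf.get_variable(): looks the name up when reuse is requested."""
        if reuse:
            if name not in self.variables:
                raise ValueError("Variable %s does not exist" % name)
            return self.variables[name]
        if name in self.variables:
            raise ValueError("Variable %s already exists" % name)
        self.variables[name] = [value]
        return self.variables[name]


store = ToyVariableStore()

# Two "models" built with the always-create call: two independent weights.
train_w = store.variable("model/W1", 1.0)
test_w = store.variable("model/W1", 1.0)
train_w[0] = 2.0            # "training" updates only the first copy
assert test_w[0] == 1.0     # the test model still holds its initial value

# Two "models" built with the lookup call: one shared weight.
train_w2 = store.get_variable("model/W2", 1.0, reuse=False)
test_w2 = store.get_variable("model/W2", 1.0, reuse=True)
train_w2[0] = 2.0
assert test_w2[0] == 2.0    # the test model sees the trained value

print(sorted(store.variables))
```

In the toy store the first pair ends up as two entries and the second pair as one, which mirrors what the question observed: the test model was evaluating freshly initialized weights.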

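Also, you don't need a session in the constructor: you don't have to initialize variables while you are defining the model. Initialize them once, when you are about to train the model. Keep in mind that variable values live in a session, so the training and prediction models should be run in the same session.

The code to reuse the variables is the following: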

def __init__(self,config,train_model=None):
    self.num_steps = num_steps = config.num_steps
    self.lstm_size = lstm_size = config.lstm_size
    self.num_features = num_features = config.num_features
    self.num_layers = num_layers = config.num_layers
    self.num_hiddens = num_hiddens = config.num_hiddens
    self.batch_size = batch_size = config.batch_size
    self.train = train = config.train
    self.epoch = config.epoch
    self.learning_rate = learning_rate = config.learning_rate

    with tf.variable_scope('model') as scope:        
        self.lstm_cell = lstm_cell = tf.nn.rnn_cell.LSTMCell(lstm_size,initializer = tf.contrib.layers.xavier_initializer(uniform=False))
        self.cell = cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_layers)

    with tf.name_scope('placeholders'):
        self.x = tf.placeholder(tf.float32,[self.batch_size,num_steps,num_features],
                                name='input-x')
        self.y = tf.placeholder(tf.float32, [self.batch_size,num_features],name='input-y')
        self.init_state = cell.zero_state(self.batch_size,tf.float32)
    with tf.variable_scope('model'):
        self.W1 = tf.get_variable(initializer=tf.truncated_normal([lstm_size*num_steps,num_hiddens],stddev=0.1),name='W1')
        self.b1 = tf.get_variable(initializer=tf.truncated_normal([num_hiddens],stddev=0.1),name='b1')
        self.W2 = tf.get_variable(initializer=tf.truncated_normal([num_hiddens,num_hiddens],stddev=0.1),name='W2')
        self.b2 = tf.get_variable(initializer=tf.truncated_normal([num_hiddens],stddev=0.1),name='b2')
        self.W3 = tf.get_variable(initializer=tf.truncated_normal([num_hiddens,num_features],stddev=0.1),name='W3')
        self.b3 = tf.get_variable(initializer=tf.truncated_normal([num_features],stddev=0.1),name='b3')


    self.output, self.loss = self.inference()

    if train_model is None:
        self.train_step = tf.train.GradientDescentOptimizer(self.learning_rate).minimize(self.loss)

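To see which variables exist after you create train_model and predict_model, use the following code (in later TensorFlow 1.x releases, tf.all_variables() is deprecated in favor of tf.global_variables()):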

for v in tf.all_variables():
    print(v.name)
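When the printed list is long, a small helper can flag weights that were created twice. This is a rough sketch under one assumption: that unshared copies show up under a uniquified scope such as model_1, which may vary with the TensorFlow version. The function name find_duplicated_variables and the sample names are hypothetical, and the heuristic will also group scopes that are legitimately numbered (for example layer_1 and layer_2).

```python
import re
from collections import defaultdict


def find_duplicated_variables(names):
    """Group variable names that differ only by a scope suffix such as '_1'."""
    groups = defaultdict(list)
    for name in names:
        parts = name.split("/")
        # Strip a trailing '_<number>' from each scope component: 'model_1' -> 'model'.
        scopes = [re.sub(r"_\d+$", "", part) for part in parts[:-1]]
        groups["/".join(scopes + [parts[-1]])].append(name)
    # Keep only base names that occur more than once.
    return {base: found for base, found in groups.items() if len(found) > 1}


# Hypothetical listing, made up for illustration (not real program output).
names = [
    "model/model/W1:0",
    "model_1/model/W1:0",
    "model/model/b1:0",
    "model_1/model/b1:0",
    "model/model/W2:0",
]
duplicates = find_duplicated_variables(names)
for base, found in sorted(duplicates.items()):
    print(base, "->", found)
```

To apply it to a real graph, pass in the names from the loop above, for example `[v.name for v in tf.all_variables()]`. An empty result suggests that each weight exists only once and is therefore shared.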

