Building multiple models in the same graph


Question

I am trying to build two similar models that predict different output types. One chooses between two categories and the other between six. Their inputs are the same, and both are LSTM RNNs.

I have separated training and prediction into separate functions in each model's file, model1.py and model2.py.

I made the mistake of giving the variables in both models the same names, so when I call predict1 and predict2 from model1 and model2 respectively, I get the following namespace error: ValueError: Variable W already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

Here W is the name of the weight matrix.

Is there a good way to run these predictions from the same place? I have tried renaming the variables involved but still get the following error. It doesn't seem possible to name an lstm_cell when it is created, is it?

ValueError: Variable RNN/BasicLSTMCell/Linear/Matrix already exists

After adding scopes around model1pred and model2pred in the predictions file, I get the following error when calling model1pred() and then model2pred():

tensorflow.python.framework.errors.NotFoundError: Tensor name "model1/model1/BasicLSTMCell/Linear/Matrix" not found in checkpoint files ./variables/model1.chk
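
The doubled model1/model1/ prefix is consistent with two scopes of the same name being nested, one around the call in the predictions file and one inside the model code. A Saver looks variables up by name, so a prefix the checkpoint was not saved with cannot be found. The following is a minimal sketch of that naming behaviour only. It assumes a TensorFlow release that exposes the graph-mode API as tf.compat.v1 (the question's code predates it), and the variable name and shape are placeholders chosen for illustration.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TF 1.x graph-mode API available under tf.compat.v1

with tf.Graph().as_default():
    # A single scope, as the model code on its own would create.
    with tf1.variable_scope("model1"):
        single = tf1.get_variable("Matrix", shape=[4, 4])

with tf.Graph().as_default():
    # An outer scope in the caller plus the same scope inside the model code.
    with tf1.variable_scope("model1"):
        with tf1.variable_scope("model1"):
            nested = tf1.get_variable("Matrix", shape=[4, 4])

# The nested variable carries the scope prefix twice.
print(single.name)
print(nested.name)
```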

The code is included here. The code in model2.py is omitted, but it is equivalent to model1.py except that n_classes=2 and the scope inside the dynamicRNN function and inside pred is set to 'model2'.

SOLUTION: The problem was that the graph the saver was trying to restore included variables from the first pred() execution. Wrapping each call to a pred function in its own graph solved the issue and removed the need for variable scoping.

In the file that collects the predictions:

def model1pred(test_x, test_seqlen):
    from model1 import pred
    with tf.Graph().as_default():
        return pred(test_x, test_seqlen)

def model2pred(test_x, test_seqlen):
    from model2 import pred
    with tf.Graph().as_default():
        return pred(test_x, test_seqlen)

# Import test_x, test_seqlen

probs1, preds1 = model1pred(test_x, test_seqlen)
probs2, cpreds2 = model2pred(test_x, test_seqlen)
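
To see why separate graphs avoid the clash, here is a minimal sketch. It assumes a TensorFlow release that provides tf.compat.v1 (the question's code targets an older API), and build is a hypothetical stand-in for a pred function that creates a variable named W in whichever graph is currently the default.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TF 1.x graph-mode API available under tf.compat.v1


def build(n_classes):
    # Hypothetical stand-in for pred(): creates a variable named 'W'
    # in the current default graph.
    return tf1.get_variable("W", shape=[100, n_classes])


# Same graph: the second 'W' collides with the first.
with tf.Graph().as_default():
    build(6)
    try:
        build(2)
        clashed = False
    except ValueError:
        clashed = True

# Separate graphs: each 'W' lives in its own graph, so nothing collides.
with tf.Graph().as_default():
    w1 = build(6)

with tf.Graph().as_default():
    w2 = build(2)

print(clashed, w1.name, w2.name)
```

Because each graph keeps its own set of variables, both models can keep their original variable names, which is what lets each saver restore from its own checkpoint unchanged.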

In model1.py:

def dynamicRNN(x, seqlen, weights, biases):
    n_steps = 10
    n_input = 14
    n_classes = 6
    n_hidden = 100

    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (batch_size, n_steps, n_input)
    # Required shape: 'n_steps' tensors list of shape (batch_size, n_input)

    # Permuting batch_size and n_steps
    x = tf.transpose(x, [1, 0, 2])
    # Reshaping to (n_steps*batch_size, n_input)
    x = tf.reshape(x, [-1,n_input])
    # Split to get a list of 'n_steps' tensors of shape (batch_size, n_input)
    x = tf.split(0, n_steps, x)

    # Define a lstm cell with tensorflow
    lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)

    # Get lstm cell output, providing 'sequence_length' will perform dynamic calculation.
    outputs, states = tf.nn.rnn(lstm_cell, x, dtype=tf.float32, sequence_length=seqlen)

    # When performing dynamic calculation, we must retrieve the last
    # dynamically computed output, i.e, if a sequence length is 10, we need
    # to retrieve the 10th output.
    # However TensorFlow doesn't support advanced indexing yet, so we build
    # a custom op that for each sample in batch size, get its length and
    # get the corresponding relevant output.

    # 'outputs' is a list of output at every timestep, we pack them in a Tensor
    # and change back dimension to [batch_size, n_step, n_hidden]
    outputs = tf.pack(outputs)
    outputs = tf.transpose(outputs, [1, 0, 2])

    # Hack to build the indexing and retrieve the right output.
    batch_size = tf.shape(outputs)[0]
    # Start indices for each sample
    index = tf.range(0, batch_size) * n_steps + (seqlen - 1)
    # Indexing
    outputs = tf.gather(tf.reshape(outputs, [-1, n_hidden]), index)

    # Linear activation, using outputs computed above
    return tf.matmul(outputs, weights['out']) + biases['out']

def pred(test_x, test_seqlen):
    with tf.Session() as sess:
        n_steps = 10
        n_input = 14
        n_classes = 6
        n_hidden = 100
        weights = {'out': tf.Variable(tf.random_normal([n_hidden, n_classes]), name='W1')}
        biases = {'out': tf.Variable(tf.random_normal([n_classes]), name='b1')}
        x = tf.placeholder("float", [None, n_steps, n_input])
        y = tf.placeholder("float", [None, n_classes])
        seqlen = tf.placeholder(tf.int32, [None])

        pred = dynamicRNN(x, seqlen, weights, biases)
        saver = tf.train.Saver(tf.all_variables())
        y_p = tf.argmax(pred, 1)

        init = tf.initialize_all_variables()
        sess.run(init)

        saver.restore(sess,'./variables/model1.chk')
        y_prob, y_pred = sess.run([pred, y_p], feed_dict={x: test_x, seqlen: test_seqlen})
        y_prob = np.array([softmax(x) for x in y_prob])
        return y_prob, y_pred
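
The softmax helper called in pred is not defined in the code shown. A minimal sketch of what such a helper might look like, assuming it turns one row of raw scores into probabilities (this implementation is an illustration, not the asker's actual function):

```python
import numpy as np


def softmax(logits):
    # Hypothetical helper: converts one row of raw scores to probabilities.
    logits = np.asarray(logits, dtype=np.float64)
    # Subtracting the maximum avoids overflow in exp without changing the result.
    exps = np.exp(logits - np.max(logits))
    return exps / np.sum(exps)


row = softmax([2.0, 1.0, 0.1])
print(row, row.sum())
```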


Recommended answer

You can do this by adding with tf.variable_scope(): blocks around the two pieces of model construction code. This gives each model's variable names a different prefix, which avoids the clash.

For example (using the model1pred() and model2pred() functions defined in your question):

with tf.variable_scope('model1'):
  # Variables created in here will be named 'model1/W', etc.
  probs1, preds1 = model1pred(test_x, test_seqlen)

with tf.variable_scope('model2'):
  # Variables created in here will be named 'model2/W', etc.
  probs2, cpreds2 = model2pred(test_x, test_seqlen)
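
The prefixing can be checked directly. The following is a minimal sketch, assuming a TensorFlow release that exposes the graph-mode API as tf.compat.v1 and creating the variables with tf.compat.v1.get_variable. The shapes are placeholders that mirror the question's n_hidden and n_classes values.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TF 1.x graph-mode API available under tf.compat.v1

with tf.Graph().as_default():
    with tf1.variable_scope("model1"):
        w_six = tf1.get_variable("W", shape=[100, 6])

    with tf1.variable_scope("model2"):
        # Same short name 'W', but a different scope prefix, so no clash.
        w_two = tf1.get_variable("W", shape=[100, 2])

print(w_six.name)
print(w_two.name)
```

Note that a checkpoint stores variables under their full names, so the scopes used when restoring have to match the ones used when the checkpoint was saved.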

For more details, see the in-depth HOWTO on variable sharing in TensorFlow.
