About names of variable scope in TensorFlow


Question

Recently I have been trying to learn TensorFlow, and I do not understand exactly how variable scopes work. In particular, I have the following problem:

import tensorflow as tf
from tensorflow.models.rnn import rnn_cell
from tensorflow.models.rnn import rnn

inputs = [tf.placeholder(tf.float32,shape=[10,10]) for _ in range(5)]
cell = rnn_cell.BasicLSTMCell(10)
outpts, states = rnn.rnn(cell, inputs, dtype=tf.float32)

print outpts[2].name
# ==> u'RNN/BasicLSTMCell_2/mul_2:0'

Where does the '_2' in 'BasicLSTMCell_2' come from? How does it work when later using tf.get_variable(reuse=True) to get the same variable again?

Edit: I think I found a related problem:

def creating(s):
    with tf.variable_scope('test'):
        with tf.variable_scope('inner'):
            a=tf.get_variable(s,[1])
    return a

def creating_mod(s):
    with tf.variable_scope('test'):
        with tf.variable_scope('inner'):
            a=tf.Variable(0.0, name=s)
    return a

tf.ops.reset_default_graph()
a=creating('a')
b=creating_mod('b')
c=creating('c')
d=creating_mod('d')

print a.name, '\n', b.name,'\n', c.name,'\n', d.name

The output is

test/inner/a:0 
test_1/inner/b:0 
test/inner/c:0 
test_3/inner/d:0

I'm confused...

Recommended answer

The "_2" in "BasicLSTMCell_2" relates to the name scope in which the op outpts[2] was created. Every time you create a new name scope (with tf.name_scope()) or variable scope (with tf.variable_scope()), a name based on the given string is added to the current name scope, with a numeric suffix appended if needed to make it unique. The call to rnn.rnn(...) has the following pseudocode (simplified and using public API methods for clarity):

outputs = []
with tf.variable_scope("RNN"):
  for timestep, input_t in enumerate(inputs):
    if timestep > 0:
      tf.get_variable_scope().reuse_variables()
    with tf.variable_scope("BasicLSTMCell"):
      outputs.append(...)
return outputs

If you look at the names of the tensors in outpts, you'll see that they are the following:

>>> print [o.name for o in outpts]
[u'RNN/BasicLSTMCell/mul_2:0',
 u'RNN/BasicLSTMCell_1/mul_2:0',
 u'RNN/BasicLSTMCell_2/mul_2:0',
 u'RNN/BasicLSTMCell_3/mul_2:0',
 u'RNN/BasicLSTMCell_4/mul_2:0']

When you enter a new name scope (by entering a with tf.name_scope("..."): or with tf.variable_scope("..."): block), TensorFlow creates a new, unique name for the scope. The first time the "BasicLSTMCell" scope is entered, TensorFlow uses that name verbatim, because it hasn't been used before (in the "RNN/" scope). The next time, TensorFlow appends "_1" to the scope to make it unique, and so on up to "RNN/BasicLSTMCell_4".
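To make the suffixing rule concrete, here is a minimal pure-Python sketch of the behavior described above. It is a toy model, not TensorFlow's implementation: the class NameScopeRegistry and its method unique_scope are hypothetical names invented for illustration, and the sketch ignores edge cases such as a user explicitly naming a scope "BasicLSTMCell_1".

```python
class NameScopeRegistry:
    """Toy model of scope-name uniquification (hypothetical, not TensorFlow code)."""

    def __init__(self):
        # Maps a full scope name to the number of times it has been requested.
        self._used = {}

    def unique_scope(self, parent, name):
        """Return `parent/name` on first use, then `parent/name_1`, `parent/name_2`, ..."""
        base = f"{parent}/{name}" if parent else name
        count = self._used.get(base, 0)
        self._used[base] = count + 1
        return base if count == 0 else f"{base}_{count}"


registry = NameScopeRegistry()
# Entering the same scope name once per timestep, as the rnn.rnn(...) loop does.
scopes = [registry.unique_scope("RNN", "BasicLSTMCell") for _ in range(5)]
for scope in scopes:
    print(scope)
```

Under these assumptions the five printed scopes match the prefixes of the tensor names listed earlier, from "RNN/BasicLSTMCell" through "RNN/BasicLSTMCell_4".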

The main difference between variable scopes and name scopes is that a variable scope also has a set of name-to-tf.Variable bindings. By calling tf.get_variable_scope().reuse_variables() after timestep 0, we instruct TensorFlow to reuse rather than create variables for the "RNN/" scope (and its children). This ensures that the weights are correctly shared between the multiple RNN cells.
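The name-to-variable bindings can be sketched the same way. The following is a toy model in plain Python, not TensorFlow's implementation: ToyVariableStore, its get_variable signature, and the variable name "weights" are all hypothetical. It only mirrors the rule that a lookup with reuse off creates a new binding (and fails if one already exists), while a lookup with reuse on returns the existing binding (and fails if none exists).

```python
class ToyVariableStore:
    """Toy model of a variable scope's name-to-variable bindings (hypothetical)."""

    def __init__(self):
        self._vars = {}  # full variable name -> variable object

    def get_variable(self, scope, name, reuse):
        full_name = f"{scope}/{name}"
        if reuse:
            # Reuse mode: the variable must already exist.
            if full_name not in self._vars:
                raise ValueError(f"Variable {full_name} does not exist")
            return self._vars[full_name]
        # Create mode: the variable must not exist yet.
        if full_name in self._vars:
            raise ValueError(f"Variable {full_name} already exists")
        self._vars[full_name] = [0.0]  # stand-in for a real tf.Variable
        return self._vars[full_name]


store = ToyVariableStore()
weights = []
for timestep in range(5):
    # Mirrors the pseudocode: reuse is switched on after timestep 0.
    w = store.get_variable("RNN/BasicLSTMCell", "weights", reuse=timestep > 0)
    weights.append(w)

# Every timestep received the very same object, so the weights are shared.
print(all(w is weights[0] for w in weights))
```

Note that in this sketch the variable is always looked up under "RNN/BasicLSTMCell", without a numeric suffix. That reflects the point of the answer: the suffix belongs to the name scope of the ops, while the variable bindings are what get reused across timesteps.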
