“冻结"tensorflow 中的一些变量/范围:stop_gradient 与传递变量以最小化 [英] "freeze" some variables/scopes in tensorflow: stop_gradient vs passing variables to minimize


Question

I am trying to implement an adversarial NN, which requires 'freezing' one or the other part of the graph during alternating training minibatches. I.e. there are two sub-networks: G and D.

G( Z ) ->  Xz
D( X ) ->  Y

where the loss function of G depends on D[G(Z)] and D[X].

First I need to train the parameters in D with all G parameters fixed, and then train the parameters in G with the parameters in D fixed. The loss function in the first case is the negative of the loss function in the second case, and each update has to apply only to the parameters of either the first or the second subnetwork.

I saw that tensorflow has a tf.stop_gradient function. For the purpose of training the D (downstream) subnetwork I can use this function to block the gradient flow to

 Z -> [ G ] -> tf.stop_gradient(Xz) -> [ D ] -> Y

The documentation for tf.stop_gradient is very succinct, with no in-line example (and the seq2seq.py example is too long and not that easy to read), but it looks like it must be called during graph creation. Does this imply that if I want to block/unblock gradient flow in alternating batches, I need to re-create and re-initialize the graph model?
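
For reference, the pipeline sketched above is wired up once at graph-construction time; a minimal self-contained sketch (the single matmul layers, the shapes, and the loss are placeholder assumptions, not the asker's actual G and D):

import tensorflow as tf

Z = tf.placeholder(tf.float32, [None, 100])
with tf.variable_scope("G"):
    Xz = tf.matmul(Z, tf.get_variable("w", [100, 784]))                  # G(Z) -> Xz
with tf.variable_scope("D"):
    Y = tf.matmul(tf.stop_gradient(Xz), tf.get_variable("w", [784, 1]))  # gradients stop here and never reach G

# Minimizing a loss built from Y touches only D's variables; G's variables get
# no gradient through the stopped tensor and are simply skipped by the optimizer.
d_train_op = tf.train.GradientDescentOptimizer(0.01).minimize(tf.reduce_mean(Y))

Since the stopped and un-stopped versions of Xz can coexist in the same graph, a sketch like this would not need to be rebuilt between minibatches, but it only covers the D-update; the answer below handles both updates with var_list instead.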

Also it seems that one cannot block the gradient flowing through the G (upstream) network by means of tf.stop_gradient, right?

As an alternative, I saw that one can pass a list of variables to the optimizer call as opt_op = opt.minimize(cost, <list of variables>), which would be an easy solution if one could get all the variables in the scope of each subnetwork. Can one get a <list of variables> for a tf.scope?

Answer

The easiest way to achieve this, as you mention in your question, is to create two optimizer operations using separate calls to opt.minimize(cost, ...). By default, the optimizer will use all of the variables in tf.trainable_variables(). If you want to filter the variables to a particular scope, you can use the optional scope argument to tf.get_collection() as follows:

optimizer = tf.train.AdagradOptimizer(0.01)

first_train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                     "scope/prefix/for/first/vars")
first_train_op = optimizer.minimize(cost, var_list=first_train_vars)

second_train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                      "scope/prefix/for/second/vars")                     
second_train_op = optimizer.minimize(cost, var_list=second_train_vars)
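
Putting this together, a self-contained sketch of the alternating updates (TensorFlow 1.x graph mode, as in the answer; the one-layer stand-ins for G and D, the sign-flipped loss, and the random data are illustrative assumptions, not part of the answer):

import numpy as np
import tensorflow as tf

Z = tf.placeholder(tf.float32, [None, 100])
X = tf.placeholder(tf.float32, [None, 784])

def discriminator(x):                                    # stand-in for D
    with tf.variable_scope("D", reuse=tf.AUTO_REUSE):
        return tf.matmul(x, tf.get_variable("w", [784, 1]))

with tf.variable_scope("G"):                             # stand-in for G
    Xz = tf.matmul(Z, tf.get_variable("w", [100, 784]))

# The two objectives differ only in sign, as described in the question.
d_loss = tf.reduce_mean(discriminator(Xz)) - tf.reduce_mean(discriminator(X))
g_loss = -d_loss

optimizer = tf.train.AdagradOptimizer(0.01)
d_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "D")
d_train_op = optimizer.minimize(d_loss, var_list=d_vars)     # updates only D's weights
g_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "G")
g_train_op = optimizer.minimize(g_loss, var_list=g_vars)     # updates only G's weights

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        z = np.random.randn(64, 100).astype(np.float32)
        x = np.random.randn(64, 784).astype(np.float32)      # stand-in for real data
        sess.run(d_train_op, feed_dict={Z: z, X: x})          # G is effectively frozen on this step
        sess.run(g_train_op, feed_dict={Z: z, X: x})          # D is effectively frozen on this step

Feeding the same minibatch to both ops is just for brevity; how many times each op runs per minibatch is up to the training schedule.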
