How to freeze specific nodes in a tensorflow variable while training?

Question

Currently I am having trouble making a few elements of a variable non-trainable. That is, given a variable such as x,

x = tf.Variable(tf.zeros([2, 2]))

I wish to train only x[0,0] and x[1,1] while keeping x[0,1] and x[1,0] fixed during training.

Currently TensorFlow does provide the option to make any variable non-trainable by using trainable=False or tf.stop_gradient(). However, these methods make all elements of x non-trainable. My question is: how do I obtain this selectivity?

Answer

There is no selective lack of updates for now; however, you can achieve this effect indirectly by explicitly specifying the variables that should be updated. Both .minimize and all the gradient functions accept a list of variables to optimize over, so just build a list that omits some of them, for example:

v1 = tf.Variable( ... )  # we want to freeze it in one op
v2 = tf.Variable( ... )  # we want to freeze it in another op
v3 = tf.Variable( ... )  # we always want to train this one
loss = ...
optimizer = tf.train.GradientDescentOptimizer(0.1)

op1 = optimizer.minimize(loss,
      var_list=[v for v in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES) if v is not v1])

op2 = optimizer.minimize(loss,
      var_list=[v for v in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES) if v is not v2])

Now you can call these ops whenever you want to train with respect to a given subset of the variables. Note that this might require two separate optimizers if you are using Adam or another method that gathers statistics (and you will end up with separate statistics per optimizer!). However, if there is just one set of frozen variables per training run, everything is straightforward with var_list.
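For instance, a training loop could simply alternate the two ops. A minimal sketch, assuming TF 1.x graph mode, the ops defined above, and a loss that needs no feed_dict (with Adam you would instead create two optimizer instances, each keeping its own slot statistics):

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        if step % 2 == 0:
            sess.run(op1)  # updates every trainable variable except v1
        else:
            sess.run(op2)  # updates every trainable variable except v2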

However, there is no way to freeze training of a subset of a single variable; TensorFlow always treats a variable as one unit. To achieve this you have to specify the computation differently. One way is to:

  • create a binary mask M with a 1 wherever updates to X should be stopped

  • create a separate variable X' that is non-trainable, and tf.assign the value of X to it

  • output X'*M + (1-M)*X

For example:

x = tf.Variable( ... )
xp = tf.Variable( ..., trainable=False)
m = tf.constant( ... )  # mask: 1 where updates should be stopped
cp = tf.assign(xp, x)   # copy the current value of x into xp
with tf.control_dependencies([cp]):
  x_frozen = m*xp + (1-m)*x

Then you just use x_frozen instead of x. Note that we need the control dependency because tf.assign can execute asynchronously, and here we want to make sure xp always holds the most up-to-date value of x.
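Applying this recipe to the 2x2 variable from the question, here is a minimal self-contained sketch (assuming TF 1.x; the loss is made up purely for illustration). The mask zeroes the gradient flowing into the off-diagonal entries, so only x[0,0] and x[1,1] are trained:

import tensorflow as tf

x = tf.Variable(tf.zeros([2, 2]))                    # trainable source variable
xp = tf.Variable(tf.zeros([2, 2]), trainable=False)  # frozen shadow copy
m = tf.constant([[0., 1.], [1., 0.]])                # 1 = freeze, 0 = train

cp = tf.assign(xp, x)                                # keep xp in sync with x
with tf.control_dependencies([cp]):
  x_frozen = m*xp + (1-m)*x                          # gradient reaches x only where m == 0

loss = tf.reduce_sum(tf.square(x_frozen - 1.0))      # illustrative loss: pull entries toward 1
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_op)
    print(sess.run(x_frozen))  # diagonal approaches 1.0; off-diagonal stays at 0.0

Since xp is not trainable and m kills the gradient on the masked entries, the off-diagonal elements never receive updates.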
