Is there a way to automate transfer learning with Tensorflow?


Problem description

I am using Tensorflow to build and train several neural networks. These networks do supervised learning on related tasks (natural language processing).

What all of my neural networks have in common is that they share some early layers (some of them share 2 additional ones on top of those).

I would like to be able to take the trained weights of the common layers from one architecture and use them to initialize another architecture.

The way I do it at the moment is to write a separate (ad hoc) piece of code every time I want to transfer weights. This clutters my project and is time-consuming.

Does anyone know of a way to automate this weight transfer, e.g. by automatically detecting the common layers and then initializing the corresponding weights?

Solution


You can create a tf.train.Saver specifically for the set of variables of interest, and you would be able to restore those in another graph, as long as they have the same name. You could use a collection to store those variables and then create the saver for the collection:

TRANSFERABLE_VARIABLES = "transferable_variable"  # key of the collection holding the shared variables
# ...
my_var = tf.get_variable(...)
tf.add_to_collection(TRANSFERABLE_VARIABLES, my_var)  # register each shared-layer variable
# ...
saver = tf.train.Saver(tf.get_collection(TRANSFERABLE_VARIABLES), ...)

This should allow you to call save in one graph and restore in the other to transfer the weights.
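For instance, here is a minimal sketch of that save/restore round trip, assuming both graphs register their shared variables (with matching names) in the TRANSFERABLE_VARIABLES collection as above; the checkpoint path is just a placeholder, and model1_graph/model2_graph refer to the same graph-building setup used in the in-memory example below:

with model1_graph.as_default(), tf.Session() as sess:
    saver = tf.train.Saver(tf.get_collection(TRANSFERABLE_VARIABLES))
    sess.run(tf.global_variables_initializer())
    # Train model 1...
    saver.save(sess, "/tmp/shared_layers.ckpt")  # write only the shared weights

with model2_graph.as_default(), tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize the non-shared variables
    saver = tf.train.Saver(tf.get_collection(TRANSFERABLE_VARIABLES))
    saver.restore(sess, "/tmp/shared_layers.ckpt")  # overwrite the shared layers with the trained weights
    # Continue training model 2...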

If you want to avoid writing anything to disk, then I don't think there is any alternative to manually copying the values. However, this too can be automated to a fair extent by using a collection and the exact same construction process:

model1_graph = create_model1()
model2_graph = create_model2()

with model1_graph.as_default(), tf.Session() as sess:
    # Train...
    # Retrieve learned weights
    transferable_weights = sess.run(tf.get_collection(TRANSFERABLE_VARIABLES))

with model2_graph.as_default(), tf.Session() as sess:
    # Load weights from the other model
    for var, weight in zip(tf.get_collection(TRANSFERABLE_VARIABLES),
                           transferable_weights):
        var.load(weight, sess)
    # Continue training...

Again, this will only work if the construction of the common layers is the same, because the order of the variables in the collection should be the same for both graphs.
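If you cannot guarantee an identical construction order, a small variation (not part of the original answer, just a hedged sketch) is to key the transferred values by variable name instead of by position, which only requires the shared variables to have matching names in both graphs:

with model1_graph.as_default(), tf.Session() as sess:
    # Train...
    transferable_vars = tf.get_collection(TRANSFERABLE_VARIABLES)
    # Map each variable's name to its learned value
    name_to_weight = dict(zip([v.name for v in transferable_vars],
                              sess.run(transferable_vars)))

with model2_graph.as_default(), tf.Session() as sess:
    for var in tf.get_collection(TRANSFERABLE_VARIABLES):
        if var.name in name_to_weight:  # skip variables without a counterpart
            var.load(name_to_weight[var.name], sess)
    # Continue training...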

Update:

If you want to make sure that the restored variables are not used for training, you have a few possibilities, although they may all require more changes in your code. A trainable variable is just a variable that is included in the collection tf.GraphKeys.TRAINABLE_VARIABLES, so you can simply pass trainable=False when you create the transferred variables in the second graph, and the restoration process will work the same. If you want to be more dynamic and do it automatically, it is more or less possible, but keep this in mind: the list of variables to be trained must be known before creating the optimizer and cannot be changed afterwards (without creating a new optimizer). Knowing this, I don't think there is any solution that avoids passing a list with the names of the transferable variables from the first graph. E.g.:

with model1_graph.as_default():
    transferable_names = [v.name for v in tf.get_collection(TRANSFERABLE_VARIABLES)]

Then, in the construction process of the second graph, after the model is defined and just before creating the optimizer you can do something like this:

train_vars = [v for v in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
              if v.name not in transferable_names]
# Assuming that `model2_graph` is the current default graph
tf.get_default_graph().clear_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
for v in train_vars:
    tf.add_to_collection(tf.GraphKeys.TRAINABLE_VARIABLES, v)
# Create the optimizer...

Another option is not to modify the collection tf.GraphKeys.TRAINABLE_VARIABLES at all, and instead pass the list of variables you want to be optimized (train_vars in the example) as the var_list parameter of the optimizer's minimize method. In principle I personally like this less, because I think the contents of the collections should match their semantic purpose (after all, other parts of the code may use the same collection for other purposes), but it depends on the case I guess.
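A minimal sketch of that second option, assuming `loss` is the loss tensor of the second model and `train_vars` was computed as above (the choice of optimizer here is arbitrary):

optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)
# Only the variables in `train_vars` receive gradient updates;
# the transferred layers keep their restored values.
train_op = optimizer.minimize(loss, var_list=train_vars)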
