Loading pre-trained word2vec to initialise embedding_lookup in the Estimator model_fn


Problem Description

I am solving a text classification problem. I defined my classifier using the Estimator class with my own model_fn. I would like to use Google's pre-trained word2vec embeddings as initial values and then further optimise them for the task at hand.

I saw this post: Using a pre-trained word embedding (word2vec or Glove) in TensorFlow
which explains how to go about it in 'raw' TensorFlow code. However, I would really like to use the Estimator class.

As an extension, I would then like to train this code on Cloud ML Engine. Is there a good way of passing in the fairly large file with the initial values?

Let's say we have something like:

def build_model_fn():
    def _model_fn(features, labels, mode, params):
        input_layer = features['feat']  # shape=[-1, params["sequence_length"]]
        # ... what goes here to initialize W?

        embedded = tf.nn.embedding_lookup(W, input_layer)
        ...
        return predictions
    return _model_fn

estimator = tf.contrib.learn.Estimator(
    model_fn=build_model_fn(),
    model_dir=MODEL_DIR,
    params=params)
estimator.fit(input_fn=read_data, max_steps=2500)

Recommended Answer

Embeddings are typically large enough that the only viable approach is using them to initialize a tf.Variable in your graph. This will allow you to take advantage of parameter servers in distributed training, etc.

For this (and anything else), I would recommend you use the new "core" estimator, tf.estimator.Estimator, as this will make things much easier.

From the answer in the link you provided, and knowing that we want a variable rather than a constant, we can take either approach:

(2) Initialize the variable using a feed dict, or
(3) Load the variable from a checkpoint

I'll cover option (3) first since it's much easier and better:

In your model_fn, simply initialize a variable using the value returned by a tf.contrib.framework.load_variable call. This requires:

  1. That you have a valid TF checkpoint with your embeddings
  2. You know the fully qualified name of the embeddings variable within the checkpoint (the sketch below shows one way to find it).
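
If you don't know that fully qualified name up front, you can list every variable stored in the checkpoint and spot the embeddings by shape. A minimal sketch, reusing the example bucket path from the snippet below:

import tensorflow as tf

# Print each variable's fully qualified name and shape stored in the
# checkpoint, so you can find the embeddings variable.
for name, shape in tf.contrib.framework.list_variables(
    'gs://my-bucket/word2vec_checkpoints/'):
  print(name, shape)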

The code is quite simple:

def model_fn(features, labels, mode, params):
  embeddings = tf.Variable(tf.contrib.framework.load_variable(
      'gs://my-bucket/word2vec_checkpoints/',
      'a/fully/qualified/scope/embeddings'
  ))
  ...
  return tf.estimator.EstimatorSpec(...)

However, this approach won't work for you if your embeddings weren't produced by another TF model, hence option (2).
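
That said, if your vectors exist only as a numpy array, one workaround (my addition, not part of the original answer) is to write them into a TF checkpoint yourself so that option (3) still applies. A minimal sketch, with a hypothetical input file and output path:

import numpy as np
import tensorflow as tf

# Hypothetical: a [vocab_size, embedding_size] array of pre-trained vectors.
embedding_matrix = np.load('word2vec.npy').astype(np.float32)

embeddings = tf.Variable(embedding_matrix, name='embeddings')
saver = tf.train.Saver([embeddings])
with tf.Session() as sess:
  sess.run(embeddings.initializer)
  # 'embeddings' is then the fully qualified name to pass to load_variable.
  saver.save(sess, '/tmp/word2vec_checkpoints/model.ckpt')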

For (2), we need to use tf.train.Scaffold, which is essentially a configuration object that holds all the options for starting a tf.Session (which the Estimator intentionally hides for lots of reasons).

You may specify a Scaffold in the tf.estimator.EstimatorSpec you return from your model_fn.

We create a placeholder in our model_fn, make it the initial value of our embedding variable, and then pass an init_feed_dict via the Scaffold, e.g.

def model_fn(features, labels, mode, params):
  embed_ph = tf.placeholder(
      shape=[params["vocab_size"], params["embedding_size"]],
      dtype=tf.float32)
  embeddings = tf.Variable(embed_ph)
  # Define your model
  return tf.estimator.EstimatorSpec(
      ...,  # normal EstimatorSpec args
      scaffold=tf.train.Scaffold(
          init_feed_dict={embed_ph: my_embedding_numpy_array})
  )

What's happening here is that the init_feed_dict will populate the values of the embed_ph placeholder at runtime, which in turn allows the initializer op of embeddings (an assignment from the placeholder) to run.
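
As for obtaining my_embedding_numpy_array in the first place: one option (again my assumption, not part of the answer) is to load Google's released GoogleNews vectors with gensim and align them to your task vocabulary, falling back to small random vectors for out-of-vocabulary words:

import numpy as np
from gensim.models import KeyedVectors

kv = KeyedVectors.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin', binary=True)

vocab = ['the', 'cat', 'sat']  # hypothetical: your vocabulary, index == word id
embedding_size = 300
my_embedding_numpy_array = np.stack([
    kv[word] if word in kv
    else np.random.uniform(-0.05, 0.05, embedding_size).astype(np.float32)
    for word in vocab
])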
