Loading pre-trained word2vec to initialise embedding_lookup in the Estimator model_fn
Problem description
I am solving a text classification problem. I defined my classifier using the Estimator class with my own model_fn. I would like to use Google's pre-trained word2vec embeddings as initial values and then further optimise them for the task at hand.
I saw this post: Using a pre-trained word embedding (word2vec or Glove) in TensorFlow, which explains how to go about it in 'raw' TensorFlow code. However, I would really like to use the Estimator class.
As an extension, I would then like to train this code on Cloud ML Engine. Is there a good way of passing in the fairly large file with the initial values?
Let's say we have something like:
def build_model_fn():
    def _model_fn(features, labels, mode, params):
        input_layer = features['feat']  # shape=[-1, params["sequence_length"]]
        # ... what goes here to initialize W
        embedded = tf.nn.embedding_lookup(W, input_layer)
        ...
        return predictions
    return _model_fn

estimator = tf.contrib.learn.Estimator(
    model_fn=build_model_fn(),
    model_dir=MODEL_DIR,
    params=params)
estimator.fit(input_fn=read_data, max_steps=2500)
Recommended answer
Embeddings are typically large enough that the only viable approach is to use them to initialize a tf.Variable in your graph. This will allow you to take advantage of parameter servers in distributed training, etc.
For this (and anything else), I would recommend you use the new "core" estimator, tf.estimator.Estimator, as this will make things much easier.
From the answer in the link you provided, and knowing that we want a variable rather than a constant, we can take either of these approaches:
(2) Initialize the variable using a feed dict, or (3) Load the variable from a checkpoint
I'll cover option (3) first, since it's much easier and better:
In your model_fn, simply initialize a variable using the Tensor returned by a tf.contrib.framework.load_variable call. This requires:
- That you have a valid TF checkpoint with your embeddings
- That you know the fully qualified name of the embeddings variable within the checkpoint (see the snippet after this list if you need to look it up)
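If you are not sure of that name, one way to find it is to list everything stored in the checkpoint. This is a minimal sketch, assuming the same (placeholder) GCS path as in the code below; tf.contrib.framework.list_variables is the companion of the load_variable call used in this answer:

import tensorflow as tf

# Print (name, shape) for every variable stored in the checkpoint,
# so the fully qualified name of the embeddings variable can be found.
for name, shape in tf.contrib.framework.list_variables(
        'gs://my-bucket/word2vec_checkpoints/'):
    print(name, shape)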
The model_fn code itself is quite simple:
def model_fn(mode, features, labels, hparams):
    # Initialize the variable directly from the Tensor stored in the checkpoint.
    embeddings = tf.Variable(tf.contrib.framework.load_variable(
        'gs://my-bucket/word2vec_checkpoints/',
        'a/fully/qualified/scope/embeddings'
    ))
    ....
    return tf.estimator.EstimatorSpec(...)
However, this approach won't work for you if your embeddings weren't produced by another TF model, hence option (2).
For (2), we need to use tf.train.Scaffold, which is essentially a configuration object that holds all the options for starting a tf.Session (which Estimator intentionally hides, for many reasons).
You may specify a Scaffold in the tf.estimator.EstimatorSpec you return from your model_fn.
We create a placeholder in our model_fn, make it the initial value of our embedding variable, and then pass an init_feed_dict via the Scaffold. For example:
def model_fn(mode, features, labels, hparams):
    # Placeholder that will carry the pre-trained embedding matrix at init time.
    embed_ph = tf.placeholder(
        shape=[hparams.vocab_size, hparams.embedding_size],
        dtype=tf.float32)
    embeddings = tf.Variable(embed_ph)
    # Define your model
    return tf.estimator.EstimatorSpec(
        ...,  # normal EstimatorSpec args
        scaffold=tf.train.Scaffold(
            init_feed_dict={embed_ph: my_embedding_numpy_array}))
What's happening here is that the init_feed_dict will populate the value of the embed_ph placeholder at runtime, which in turn allows the initializer op of embeddings (an assignment from the placeholder) to run.
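For completeness, here is one minimal sketch of how my_embedding_numpy_array could be built from Google's pre-trained GoogleNews vectors. This is an illustration under stated assumptions, not part of the original answer: it assumes gensim is installed, a local copy of the .bin file, and a word_to_index vocabulary mapping of your own; out-of-vocabulary words keep small random values.

import numpy as np
from gensim.models import KeyedVectors

# Hypothetical vocabulary mapping; replace with your real one.
word_to_index = {'the': 0, 'cat': 1, 'sat': 2}
vocab_size = len(word_to_index)

# Load the pre-trained vectors (path is a placeholder).
w2v = KeyedVectors.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin', binary=True)

embedding_size = w2v.vector_size  # 300 for the GoogleNews vectors
my_embedding_numpy_array = np.random.uniform(
    -0.05, 0.05, size=(vocab_size, embedding_size)).astype(np.float32)

# Copy over vectors for words present in the pre-trained model; words
# missing from it keep their small random initial values.
for word, idx in word_to_index.items():
    if word in w2v:
        my_embedding_numpy_array[idx] = w2v[word]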