Training word2vec in TensorFlow, importing to Gensim

Question

I am training a word2vec model from the tensorflow tutorial.

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/word2vec/word2vec_basic.py

After training I get the embedding matrix. I would like to save this and import it as a trained model in gensim.

To load a model in gensim, the command is:

model = Word2Vec.load_word2vec_format(fn, binary=True)

But how do I generate the fn file from Tensorflow?

Thanks

Recommended answer

One way is to save the file in the non-binary (plain-text) Word2Vec format, which essentially looks like this:

num_words vector_size  # this is the header
label0 x00 x01 ... x0N
label1 x10 x11 ... x1N
...

Example:

2 3
word0 -0.000737 -0.002106 0.001851
word1 -0.000878 -0.002106 0.002834
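
A minimal sketch of how such a file could be written from the trained embeddings. The helper name `save_word2vec_text` is hypothetical, and it assumes you have the embedding matrix as a 2-D array (one row per word) plus an index-to-word mapping. In the tutorial script these appear to be `final_embeddings` and `reverse_dictionary`, but check the names in your version.

```python
def save_word2vec_text(path, embeddings, index_to_word):
    """Write embeddings in the non-binary word2vec text format.

    embeddings: sequence of rows (a 2-D NumPy array also works),
                one vector per word.
    index_to_word: mapping (dict or list) from row index to word label.
    Both arguments are assumptions about what your training code produces.
    """
    rows = [list(row) for row in embeddings]
    num_words = len(rows)
    vector_size = len(rows[0]) if rows else 0

    with open(path, "w", encoding="utf-8") as f:
        # Header line: number of words, then vector dimensionality.
        f.write(f"{num_words} {vector_size}\n")
        for i, row in enumerate(rows):
            if len(row) != vector_size:
                raise ValueError(f"row {i} has {len(row)} values, expected {vector_size}")
            word = str(index_to_word[i])
            # Fields are whitespace-separated, so a label containing
            # whitespace would corrupt the line.
            if not word or any(ch.isspace() for ch in word):
                raise ValueError(f"label for row {i} is empty or contains whitespace")
            values = " ".join(f"{float(x):.6f}" for x in row)
            f.write(f"{word} {values}\n")
```

Under those assumptions, a call would look like `save_word2vec_text("vectors.txt", final_embeddings, reverse_dictionary)` after training finishes.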

Save the file and then load with kwarg binary=False:

model = Word2Vec.load_word2vec_format(filename, binary=False)

print(model['word0'])

Update

The newer way to load the model is:

from gensim.models.keyedvectors import KeyedVectors

model = KeyedVectors.load_word2vec_format(model_path, binary=False)
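
If loading fails, it may help to check that the file really matches the text format described above before looking elsewhere. The following is a rough sketch of such a check in plain Python. The function name `check_word2vec_text` is hypothetical and not part of gensim. It only verifies that the header agrees with the number of lines and the number of values per line.

```python
def check_word2vec_text(path):
    """Return (num_words, vector_size) if the file is consistent.

    Raises ValueError when the header, the line count, or a vector
    does not match the plain-text word2vec layout.
    """
    with open(path, encoding="utf-8") as f:
        header = f.readline().split()
        if len(header) != 2:
            raise ValueError("header must be 'num_words vector_size'")
        num_words, vector_size = int(header[0]), int(header[1])

        count = 0
        for line_no, line in enumerate(f, start=2):
            parts = line.split()
            if not parts:
                continue  # tolerate blank lines, e.g. at the end of the file
            if len(parts) != vector_size + 1:
                raise ValueError(
                    f"line {line_no}: expected a label and {vector_size} values"
                )
            for value in parts[1:]:
                float(value)  # raises ValueError if a component is not numeric
            count += 1

    if count != num_words:
        raise ValueError(f"header says {num_words} words, file has {count}")
    return num_words, vector_size
```

For the two-word example above, this check would return `(2, 3)`.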
