Why are multiple model files created in gensim word2vec?

Views: 62 | Published: 2020/11/13 6:08:10 | Tags: python, word2vec, gensim, word-embedding


Question

When I create a word2vec model (skip-gram with negative sampling), I get three files as output:

word2vec (File)
word2vec.syn1neg.npy (NPY file)
word2vec.wv.syn0.npy (NPY file)

I am wondering why this happens, because in my earlier word2vec test examples I only got a single model file (no .npy files).

Please help me.

Recommended answer

Models with larger internal vector arrays can't be saved via Python 'pickle' to a single file. So, beyond a certain size threshold, the gensim save() method stores the subsidiary arrays in separate files, using the more efficient raw format of numpy arrays (.npy format). Your earlier test models were small enough to stay under that threshold, which is why they produced only one file.

You still load() the model by specifying only the root model filename. When the subsidiary arrays are needed, the loading code finds the side files, as long as they are kept beside the root file. So when moving a model elsewhere, be sure to keep all files with the same root filename together.
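One way to keep the files together when moving a model is to copy everything that shares the root filename. The helper below is a hypothetical sketch using only the standard library; `copy_model_files` is not part of gensim, and the demo uses empty placeholder files named after the pattern of gensim side files rather than a real model.

```python
import shutil
import tempfile
from pathlib import Path


def copy_model_files(root_file, dest_dir):
    """Copy a root model file and every side file named '<root>.<suffix>'."""
    root_file = Path(root_file)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(root_file.parent.iterdir()):
        is_root = path.name == root_file.name
        is_side_file = path.name.startswith(root_file.name + ".")
        if path.is_file() and (is_root or is_side_file):
            shutil.copy2(path, dest_dir / path.name)
            copied.append(path.name)
    return copied


# Demo with empty placeholder files (not real model data).
src = Path(tempfile.mkdtemp())
dst = Path(tempfile.mkdtemp()) / "moved"
for name in ["word2vec", "word2vec.syn1neg.npy", "word2vec.wv.syn0.npy", "unrelated.txt"]:
    (src / name).touch()

copied_names = copy_model_files(src / "word2vec", dst)
print(copied_names)
```

After such a copy, load() would be pointed at the root file in the new directory, and the side files would be found next to it.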
