Calculate perplexity of word2vec model
Question
I trained a Gensim word2vec model on 500K sentences (around 60K words), and I want to calculate its perplexity.
- What is the best way to do this?
- For 60K words, how can I check whether the amount of data is appropriate?

Thanks
Recommended answer
If you want to calculate the perplexity, you first have to retrieve the loss. On the gensim.models.word2vec.Word2Vec constructor, pass the compute_loss=True parameter; this way, gensim will store the loss for you while training. Once trained, you can call the get_latest_training_loss() method to retrieve the loss.
Since the loss is the cross-entropy loss of the skip-gram model, 2 to the power of the loss will give you the perplexity (2**loss).
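A sketch of that conversion, with one caveat: gensim's reported loss is summed over all training examples, so exponentiating the raw total would overflow. The normalization by word count below is an assumption not stated in the answer, and the numbers are hypothetical, for illustration only:

```python
import math

# Hypothetical values for illustration only.
total_loss = 1_200_000.0  # e.g. value from model.get_latest_training_loss()
num_words = 500_000       # number of word occurrences trained on

# Per-word cross-entropy (assuming the loss is in bits, i.e. base 2).
avg_loss = total_loss / num_words

# Perplexity = 2 to the power of the per-word cross-entropy.
perplexity = 2 ** avg_loss
print(perplexity)
```

If the loss were measured in nats (natural log) instead, you would use math.exp(avg_loss) rather than 2 ** avg_loss.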