Calculating topic distribution of an unseen document on GenSim

Views: 87 · Published: 2020/4/30 8:39:15 · Tags: python, nlp, gensim, lda


Question

I am trying to use the LDA module of GenSim to do the following task:

"Train an LDA model with one big document and keep track of 10 latent topics. Given a new, unseen document, predict the probability distribution over the 10 latent topics."

As per the tutorial here: http://radimrehurek.com/gensim/tut2.html, this seems possible for a document in the corpus, but I am wondering whether it would also be possible for an unseen document.

Thanks!

Recommended answer

From the documentation you posted, it looks like you can train your model like this:

>>> model = models.LdaModel(corpus, id2word=dictionary, num_topics=100)

And then, from this page, it looks like you can apply your model to "an unseen document" like this:

>>> doc_lda = model[doc_bow]

Here doc_bow is a bag-of-words vector generated by the doc2bow tool.
