How to determine the number of topics for LDA?


Question

I am new to LDA and want to use it in my work, but I have run into a problem.

To get the best performance, I want to estimate the best number of topics. After reading "Finding Scientific Topics", I understand that I can first calculate log P(w|z) and then use the harmonic mean of a series of P(w|z) values to estimate P(w|T).

My question is: what does "a series of" mean here?

Recommended answer

Unfortunately, there is no hard science that yields a definitive answer to your question. To the best of my knowledge, the hierarchical Dirichlet process (HDP) is quite possibly the best way to arrive at the optimal number of topics.

If you are looking for a deeper analysis, this paper on HDP reports the advantages of HDP in determining the number of groups.
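On the "a series of" part of the question: as I read "Finding Scientific Topics", the series is the set of topic assignments z sampled from the Gibbs sampler (after burn-in), each of which gives one value of log P(w|z, T). The harmonic mean of those values is then used as the estimate of P(w|T). Below is a minimal sketch of that last step only, assuming you already have the per-sample log-likelihoods; the function name `log_harmonic_mean` and the numbers in the usage example are hypothetical, not taken from the paper. It works in log space because the raw likelihoods are far too small to represent directly.

```python
import math

def log_harmonic_mean(log_likelihoods):
    """Log of the harmonic mean of likelihoods, given their logarithms.

    log_likelihoods: one log P(w|z, T) value per Gibbs sample (assumed input).
    """
    if not log_likelihoods:
        raise ValueError("need at least one sample")
    # Harmonic mean H = S / sum_s(1 / p_s), so
    # log H = log S - logsumexp(-log p_s).
    neg = [-ll for ll in log_likelihoods]
    m = max(neg)  # subtract the max before exponentiating to avoid overflow
    lse = m + math.log(sum(math.exp(x - m) for x in neg))
    return math.log(len(neg)) - lse

# Hypothetical usage: made-up log-likelihoods for two candidate topic counts.
samples_by_T = {
    50: [-10250.0, -10248.5, -10251.2],
    100: [-10110.3, -10112.8, -10109.9],
}
estimates = {T: log_harmonic_mean(lls) for T, lls in samples_by_T.items()}
best_T = max(estimates, key=estimates.get)  # T with the highest estimated log P(w|T)
```

You would repeat this for each candidate T and compare the resulting estimates. Note that the harmonic mean is dominated by the smallest likelihoods in the series, so the estimate can vary quite a bit between runs.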
