How to compute the log-likelihood of the LDA model in vowpal wabbit
Question
I am a typical, everyday R user. In R there is the very helpful lda.collapsed.gibbs.sampler function in the lda package. It uses a collapsed Gibbs sampler to fit a latent Dirichlet allocation (LDA) model and returns point estimates of the latent parameters using the state at the last iteration of Gibbs sampling.
This function also has a great parameter, compute.log.likelihood, which, when set to TRUE, causes the sampler to compute the log-likelihood of the words (to within a constant factor) after each sweep over the variables. This is useful for assessing convergence and for comparing different LDA models (computed for different numbers of topics).
I am interested in whether there is such an option in vowpal_wabbit's LDA model.
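To make the quantity in question concrete, here is a minimal Python sketch of a corpus log-likelihood computed from point estimates. It assumes you already have normalized per-document topic proportions (theta) and per-topic word distributions (phi) from some fitted LDA model. The function names and the toy data are hypothetical, and this is the plain plug-in likelihood, not necessarily the exact expression that lda.collapsed.gibbs.sampler evaluates internally.

```python
import math


def corpus_log_likelihood(doc_word_counts, theta, phi):
    """Plug-in log-likelihood of the words given point estimates.

    doc_word_counts: list of {word_id: count} dicts, one per document.
    theta: theta[d][k] = estimated proportion of topic k in document d.
    phi:   phi[k][w]   = estimated probability of word w under topic k.
    (All three structures are assumptions made for this sketch.)
    """
    num_topics = len(phi)
    total = 0.0
    for d, counts in enumerate(doc_word_counts):
        for w, n in counts.items():
            # Probability of word w in document d, marginalized over topics.
            p = sum(theta[d][k] * phi[k][w] for k in range(num_topics))
            total += n * math.log(p)
    return total


def perplexity(doc_word_counts, theta, phi):
    """exp(-log-likelihood per token); lower means a better fit."""
    n_tokens = sum(sum(c.values()) for c in doc_word_counts)
    return math.exp(-corpus_log_likelihood(doc_word_counts, theta, phi) / n_tokens)


# Toy data: 2 documents, 2 topics, vocabulary of 3 words.
docs = [{0: 2, 1: 1}, {2: 3}]
theta = [[0.9, 0.1], [0.2, 0.8]]
phi = [[0.5, 0.4, 0.1], [0.1, 0.1, 0.8]]

if __name__ == "__main__":
    print("log-likelihood:", corpus_log_likelihood(docs, theta, phi))
    print("perplexity:", perplexity(docs, theta, phi))
```

Tracking this value (or the perplexity) across passes, or across models fitted with different numbers of topics, gives the kind of convergence and model-comparison signal described above. For a fair comparison it would usually be computed on held-out documents.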
Recommended answer
When running vw -h --lda 1, the help offers the following parameters. The metrics parameter is off by default. It is used to compute the topic coherence, which is implemented here. Try enabling this functionality by passing --metrics 1.
Latent Dirichlet Allocation:
  --lda arg                           Run lda with <int> topics
  --lda_alpha arg (=0.100000001)      Prior on sparsity of per-document topic weights
  --lda_rho arg (=0.100000001)        Prior on sparsity of topic distributions
  --lda_D arg (=10000)                Number of documents
  --lda_epsilon arg (=0.00100000005)  Loop convergence threshold
  --minibatch arg (=1)                Minibatch size, for LDA
  --math-mode arg (=0)                Math mode: simd, accuracy, fast-approx
  --metrics arg (=0)                  Compute metrics
Alternatively, jump directly to the source code of the vw utility.
A helpful presentation showcasing most parameters can be found here.
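The invocation suggested in this answer can be sketched as a small Python helper that assembles the command line. The flag spellings are taken from the help output quoted above, and -d is vw's usual option for the input data file. Other vw versions may spell or parse these options differently, so check vw -h --lda 1 on your installation. The helper name build_vw_lda_command, the file name docs.vw, and the default values are hypothetical.

```python
import shlex


def build_vw_lda_command(data_path, num_topics, num_docs,
                         alpha=0.1, rho=0.1, minibatch=1, metrics=True):
    """Assemble a vw LDA command line as a list of arguments.

    Only flags that appear in the quoted help output are used, plus -d
    for the data file. Nothing is executed here.
    """
    cmd = [
        "vw", "-d", data_path,
        "--lda", str(num_topics),
        "--lda_alpha", str(alpha),
        "--lda_rho", str(rho),
        "--lda_D", str(num_docs),
        "--minibatch", str(minibatch),
    ]
    if metrics:
        # Off by default according to the help text; "1" turns it on.
        cmd += ["--metrics", "1"]
    return cmd


if __name__ == "__main__":
    # docs.vw is a placeholder path, not a file from the original post.
    cmd = build_vw_lda_command("docs.vw", num_topics=10, num_docs=10000)
    print(shlex.join(cmd))
```

The printed string can be pasted into a shell, or the list can be handed to subprocess.run once vw is installed and the data file exists.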