R Supervised Latent Dirichlet Allocation Package


Question

I'm using an LDA package for R. Specifically, I am trying to do supervised latent Dirichlet allocation (sLDA). The package provides an slda.em function. What confuses me is that it asks for alpha, eta, and variance parameters. As far as I understand, these parameters are unknowns in the model. So my question is: did the author of the package mean these to be initial guesses for the parameters? If so, there doesn't seem to be a way of accessing the fitted values from the result of running slda.em.

Aside from coding the extra EM steps into the algorithm, is there a suggested way to guess reasonable values for these parameters?

Recommended answer

Since you are trying to build a supervised model, the typical approach is to use cross-validation to choose the model parameters. You hold out some of the data as a test set, train a model on the remaining data, and evaluate its performance on the held-out part, repeating k times so that each fold serves as the test set once. You then repeat the whole procedure with different parameter values and keep the setting that gives the best model performance.
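The procedure described above can be sketched in base R. This is only an illustration under stated assumptions: `cv_score`, `fit_fn`, and `predict_fn` are hypothetical names, the data are simulated, and the stand-in model is a one-parameter penalized regression rather than sLDA. For sLDA, `fit_fn` and `predict_fn` would wrap the package's own fitting and prediction calls.

```r
# k-fold cross-validation for one candidate parameter setting (base R only).
# fit_fn and predict_fn are hypothetical placeholders for the real model calls.
cv_score <- function(x, y, params, fit_fn, predict_fn, k = 5) {
  # Assign every observation to one of k folds at random.
  folds <- sample(rep(seq_len(k), length.out = length(y)))
  fold_mse <- vapply(seq_len(k), function(i) {
    test <- folds == i
    model <- fit_fn(x[!test], y[!test], params)  # train on the other k - 1 folds
    pred <- predict_fn(model, x[test])           # predict the held-out fold
    mean((y[test] - pred)^2)
  }, numeric(1))
  mean(fold_mse)
}

# Stand-in model (an assumption for illustration): slope through the origin,
# shrunk by a penalty `lambda`.
fit_fn <- function(x, y, params) sum(x * y) / (sum(x^2) + params$lambda)
predict_fn <- function(model, x) model * x

# Simulated data so the sketch runs on its own.
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100, sd = 0.5)

# Score each candidate value and keep the one with the lowest CV error.
grid <- data.frame(lambda = c(0, 1, 10, 100))
grid$mse <- vapply(seq_len(nrow(grid)), function(r) {
  set.seed(42)  # reuse the same folds so candidates are compared fairly
  cv_score(x, y, as.list(grid[r, , drop = FALSE]), fit_fn, predict_fn)
}, numeric(1))

best <- grid[which.min(grid$mse), ]
print(grid)
print(best)
```

Resetting the seed before each candidate keeps the fold assignment identical across settings, so differences in the score come from the parameters rather than from the random split.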

In the specific case of slda, I would run demo(slda) to see the author's own usage. When you run the demo, you'll see that he sets alpha=1.0, eta=0.1, and variance=0.25. I'd suggest using these as your starting point, and then using cross-validation to find better values if you need to improve model performance.
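As a rough sketch of that starting point, the snippet below builds a small candidate grid around the demo values and, if the lda package is installed, runs one baseline fit. The ranges in the grid are arbitrary assumptions, not recommendations from the package. The slda.em call mirrors the demo as far as I recall it (10 topics, the bundled poliblog data, ratings scaled by 100), so check it against demo(slda) in your installed version before relying on it.

```r
# Candidate settings bracketing the demo values alpha = 1.0, eta = 0.1,
# variance = 0.25. The surrounding values are assumptions for illustration.
grid <- expand.grid(alpha    = c(0.1, 1.0, 10),
                    eta      = c(0.01, 0.1, 1),
                    variance = c(0.1, 0.25, 0.5))
print(nrow(grid))  # number of candidate settings to cross-validate

# Baseline fit with the demo settings; skipped when lda is not installed.
if (requireNamespace("lda", quietly = TRUE)) {
  data(poliblog.documents, poliblog.vocab, poliblog.ratings, package = "lda")

  num.topics <- 10
  # Random starting regression coefficients, one per topic, as in the demo.
  params <- sample(c(-1, 1), num.topics, replace = TRUE)

  fit <- lda::slda.em(documents = poliblog.documents,
                      K = num.topics,
                      vocab = poliblog.vocab,
                      num.e.iterations = 10,
                      num.m.iterations = 4,
                      alpha = 1.0, eta = 0.1,
                      poliblog.ratings / 100,  # response values
                      params,
                      variance = 0.25,
                      lambda = 1.0,
                      logistic = FALSE,
                      method = "sLDA")

  # Fitted per-topic regression coefficients from the returned model.
  print(coef(summary(fit$model)))
}
```

Each row of `grid` would then be scored with k-fold cross-validation, refitting with that row's alpha, eta, and variance in place of the demo values.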
