Why isn't Stanford Topic Modeling Toolbox producing the lda-output directory?

Question

I tried to run this code from GitHub (following the 1-2-3 steps), which identifies 30 topics in Sarah Palin's 14,500 emails. The topics discovered by the author are here. However, Stanford Topic Modeling Toolbox is not producing the lda-output directory for me. It produced a directory named lda-86a58136-30-2b1a90a6, but the summary.txt in that folder shows only the initial assignment of topics, not the final one. Any idea how to produce the lda-output directory with the final summary of the discovered topics? Thanks in advance!

Recommended answer

Have you tried the instructions posted here?

Note that, as far as I can see, the original investigator trained the model on Sarah Palin's emails and then used that trained model to analyze the same emails. While I am not an LDA expert, this typically smacks of "finding what you have".

In most disciplines, training would be done over a known set of items that experts had already classified according to some discriminant. That is, training would consist of feeding in a set of data on known, likely topics from other sources, and the LDA library would then be used to determine distance from the topics in the "learned" database.

In any event, good luck.

If you encounter a specific issue, please post the error and the steps you took to arrive at it. Few people invest the time to reproduce an issue (a typical prerequisite for correcting it) without direction, or without even being able to tell whether the issue they hit is similar to yours.
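
To make such a report easier to put together, here is a minimal Python sketch (not part of the toolbox) that lists every summary.txt found under the generated directory, so you can post exactly what was written. The function names `find_summaries` and `describe` are made up for illustration. The idea that the toolbox may write more than one summary.txt, for example in per-iteration subfolders, is only an assumption that this script would help confirm or rule out.

```python
import os


def find_summaries(model_dir):
    """Return every summary.txt under model_dir as a sorted list of relative paths."""
    found = []
    for root, _dirs, files in os.walk(model_dir):
        if "summary.txt" in files:
            full_path = os.path.join(root, "summary.txt")
            found.append(os.path.relpath(full_path, model_dir))
    return sorted(found)


def describe(model_dir):
    """Build a short report (relative path and size) that can be pasted into a post."""
    lines = []
    for rel in find_summaries(model_dir):
        size = os.path.getsize(os.path.join(model_dir, rel))
        lines.append(f"{rel}\t{size} bytes")
    return "\n".join(lines)


if __name__ == "__main__":
    # Directory name taken from the question; adjust it to your own run.
    print(describe("lda-86a58136-30-2b1a90a6"))
```

If the report shows only a single summary.txt, say so in your post. If it shows several, compare their contents before concluding that the final topics were never written.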
