How to generate word clouds from LDA models in Python?

Question

I am doing some topic modeling on newspaper articles, and have implemented LDA using gensim in Python3. Now I want to create a word cloud for each topic, using the top 20 words for each topic. I know I can print the words, and save the LDA model, but is there any way to just save the top words for each topic which I can further use for generating word clouds?

I tried to google it, but could not find anything relevant. Any help is appreciated.

Recommended answer

You can get the top n words for each topic of an LDA model using Gensim's built-in show_topic method.

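from gensim import models

lda = models.LdaModel.load('lda.model')

# Open the file once, outside the loop, so each topic is kept rather than overwritten
with open('output_file.txt', 'w', encoding='utf-8') as outfile:
    for i in range(lda.num_topics):
        outfile.write('Topic #{}: \n'.format(i + 1))
        # show_topic returns (word, probability) pairs for topic i
        for word, prob in lda.show_topic(i, topn=20):
            outfile.write('{}\n'.format(word))
        outfile.write('\n')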

This will write a file with a format similar to this:

Topic #69: 
pet
dental
tooth
adopt
animal
puppy
rescue
dentist
adoption
animal
shelter
pet
dentistry
vet
paw
pup
patient
mix
foster
owner

Topic #70: 
periscope
disneyland
disney
snapchat
brandon
britney
periscope
periscope
replay
britneyspear
buffaloexchange
britneyspear
https
meerkat
blab
periscope
kxci
toni
disneyland
location

You may need to adapt this to your needs, e.g. build a list of the top 20 words per topic instead of writing them to a text file.
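As a rough sketch of that adaptation, the words can be collected into a dictionary and passed to a word cloud renderer. The helper names `top_words_per_topic` and `save_topic_clouds` are hypothetical, and the rendering step assumes the third-party `wordcloud` package is installed; neither is part of the original answer. The only model features relied on are `num_topics` and `show_topic`, as used above.

```python
def top_words_per_topic(lda, topn=20):
    """Return {topic_id: {word: weight}} for every topic in the model.

    `lda` is assumed to expose `num_topics` and `show_topic(topic_id, topn=...)`
    returning (word, probability) pairs, as gensim's LdaModel does.
    """
    return {
        topic_id: {
            word: float(weight)
            for word, weight in lda.show_topic(topic_id, topn=topn)
        }
        for topic_id in range(lda.num_topics)
    }


def save_topic_clouds(topic_words, prefix="topic"):
    """Render one PNG per topic (hypothetical helper; needs `pip install wordcloud`)."""
    from wordcloud import WordCloud  # imported lazily: third-party package

    for topic_id, weights in topic_words.items():
        cloud = WordCloud(width=800, height=400, background_color="white")
        # Word size is scaled by the topic probability of each word
        cloud.generate_from_frequencies(weights)
        cloud.to_file("{}_{}.png".format(prefix, topic_id + 1))
```

With a loaded model, something like `save_topic_clouds(top_words_per_topic(lda))` should write one image per topic. Using the probabilities as frequencies, rather than raw text, keeps the relative importance of the words within each topic.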

The answer to the question "How do I print lda topic model and the word cloud of each of the topics" gives a good explanation of how to create word clouds from raw text.
