Plotting word frequencies with NLTK
Question
I have a file containing various words, and I want to count the frequency of each word in the document and plot it. However, my plot is not showing any results. The x-axis must contain the words and the y-axis the frequencies. I am using NLTK, NumPy and Matplotlib.
Here's my code; maybe I did something wrong:
def graph():
    f = open("file.txt", "r")
    inputfile = f.read()
    words = nltk.tokenize.word_tokenize(inputfile)
    count = set(words)
    dic = nltk.FreqDist(words)
    FreqDist(f).plot(50, cumulative=False)
    f.close()
Thanks in advance for your help.
Recommended answer
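import nltk

def graph():
    # Read the whole file; the with-block closes it afterwards
    with open("file.txt", "r") as f:
        inputfile = f.read()
    # Split the text into word tokens and count each one
    tokens = nltk.tokenize.word_tokenize(inputfile)
    fd = nltk.FreqDist(tokens)
    # Plot the 30 most frequent tokens (words on x, counts on y)
    fd.plot(30, cumulative=False)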
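A likely reason the question's version drew nothing is that it built the distribution from the file handle f rather than from the token list. Once f.read() has run, the handle sits at end of file, so iterating it yields no items and the resulting counts are empty. The sketch below reproduces that effect using only the standard library: collections.Counter stands in for nltk.FreqDist (which subclasses it), io.StringIO stands in for the opened file, and str.split() is a crude substitute for word_tokenize. The helper name count_after_read and the sample text are made up for illustration.

```python
import io
from collections import Counter

def count_after_read(handle):
    """Mimic the question: read everything, then count from the handle itself."""
    text = handle.read()                  # moves the read position to the end
    from_handle = Counter(handle)         # iterating an exhausted handle yields nothing
    from_tokens = Counter(text.split())   # counting the tokens is what was intended
    return from_handle, from_tokens

# StringIO behaves like an opened text file for this purpose
handle = io.StringIO("the cat and the hat\nthe end\n")
from_handle, from_tokens = count_after_read(handle)

print(len(from_handle))             # 0
print(from_tokens.most_common(1))   # [('the', 3)]
```

Passing the token list to the frequency distribution, as the answer does, avoids the problem.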
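You can adjust the graph by changing the arguments passed to plot(): the first argument sets how many of the most frequent words are shown, and cumulative switches between per-word counts and a running total.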
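To get a feel for what those two arguments change, here is a rough standard-library approximation of the points such a plot would show. It is not NLTK's implementation; plot_points is a hypothetical helper, and Counter again stands in for FreqDist.

```python
from collections import Counter
from itertools import accumulate

def plot_points(tokens, n, cumulative=False):
    """Return (words, values) for the n most frequent tokens."""
    top = Counter(tokens).most_common(n)   # n most frequent (word, count) pairs
    words = [word for word, _ in top]      # x-axis labels
    values = [count for _, count in top]   # y-axis values
    if cumulative:
        values = list(accumulate(values))  # running total of the counts
    return words, values

tokens = "a b a c a b d".split()
print(plot_points(tokens, 2))                   # (['a', 'b'], [3, 2])
print(plot_points(tokens, 2, cumulative=True))  # (['a', 'b'], [3, 5])
```

A smaller n keeps the x-axis readable when the document has many distinct words.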