Plotting word frequency with NLTK

Question

I have a file containing various words, and I want to count the frequency of each word in the document and plot it. However, my plot does not show any results. The x-axis should contain the words and the y-axis the frequencies. I am using NLTK, NumPy and Matplotlib.

Here's my code; maybe I did something wrong:

def graph():
    f = open("file.txt", "r")
    inputfile = f.read()
    words = nltk.tokenize.word_tokenize(inputfile)
    count = set(words)
    dic = nltk.FreqDist(words)
    FreqDist(f).plot(50, cumulative=False)
    f.close()

Thanks in advance for your help.

Recommended answer

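import nltk

def graph():
    with open("file.txt", "r") as f:
        inputfile = f.read()
    # word_tokenize requires NLTK's Punkt tokenizer data to be installed
    tokens = nltk.tokenize.word_tokenize(inputfile)
    # Build the distribution from the tokens (the question passed the file object f)
    fd = nltk.FreqDist(tokens)
    # Plot the 30 most frequent tokens; this step requires Matplotlib
    fd.plot(30, cumulative=False)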

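You can adjust the graph by changing the arguments passed to plot(): the first argument is the number of most frequent tokens to show, and cumulative=True plots running totals instead of individual counts.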
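If plot() does not give enough control over the chart, the same kind of counts can be drawn by hand with Matplotlib. The following is only a minimal sketch under stated assumptions, not NLTK's own implementation: collections.Counter stands in for FreqDist, a lowercase whitespace split stands in for word_tokenize, and the function names top_words and plot_top_words are hypothetical.

```python
from collections import Counter


def top_words(text, n=30):
    """Return the n most frequent words as (word, count) pairs."""
    # Assumption: lowercase + whitespace split stands in for nltk's word_tokenize
    tokens = text.lower().split()
    return Counter(tokens).most_common(n)


def plot_top_words(pairs, title="Word frequency"):
    """Draw a bar chart with words on the x-axis and counts on the y-axis."""
    # Imported here so that the counting step works without Matplotlib installed
    import matplotlib.pyplot as plt

    words = [word for word, _ in pairs]
    counts = [count for _, count in pairs]
    fig, ax = plt.subplots()
    ax.bar(words, counts)
    ax.set_xlabel("Word")
    ax.set_ylabel("Frequency")
    ax.set_title(title)
    # Rotate the word labels so that long words do not overlap
    ax.tick_params(axis="x", labelrotation=90)
    fig.tight_layout()
    plt.show()
```

To use it on the file from the question, read file.txt into a string and call plot_top_words(top_words(text)). Because the chart is built directly with Matplotlib, the labels, rotation and title can be changed freely.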
