Word frequency in Python not working

Question

I am trying to count the frequencies of words in a text file using Python.

I am using the following code:

openfile=open("total data", "r")

linecount=0
for line in openfile:
    if line.strip():
        linecount+=1

count={}

while linecount>0:
    line=openfile.readline().split()
    for word in line:
        if word in count:
            count[word]+=1
        else:
            count[word]=1
    linecount-=1

print count

But I get an empty dictionary: "print count" gives {} as output.

I also tried using:

from collections import defaultdict
.
.
count=defaultdict(int)
.
.
     if word in count:
          count[word]=count.get(word,0)+1

But I'm getting an empty dictionary again. I don't understand what I am doing wrong. Could someone please point it out?

Recommended answer

The loop "for line in openfile:" moves the file pointer to the end of the file, so the later calls to openfile.readline() have nothing left to read and return empty strings. If you want to read the data again, either move the pointer back to the start of the file with openfile.seek(0) or re-open the file.
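The rewind approach can be sketched as follows. This is a minimal illustration, not the asker's exact script: it is written for Python 3, the function name count_words_two_pass and the sample text are made up for the demo, and a temporary file stands in for "total data".

```python
import os
import tempfile


def count_words_two_pass(path):
    """Count non-blank lines, rewind, then count words in a second pass."""
    counts = {}
    with open(path, "r") as openfile:
        # First pass: iterating consumes the file, leaving the pointer at EOF.
        linecount = sum(1 for line in openfile if line.strip())

        # Without this rewind, the second pass would see no lines at all.
        openfile.seek(0)

        # Second pass: tally each whitespace-separated word.
        for line in openfile:
            for word in line.split():
                counts[word] = counts.get(word, 0) + 1
    return linecount, counts


# Hypothetical sample data written to a temporary file for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("the cat sat\n\nthe dog sat\n")
    sample_path = tmp.name

linecount, counts = count_words_two_pass(sample_path)
os.remove(sample_path)
print(linecount, counts)
```

Iterating over the file directly in the second pass, rather than calling readline() linecount times, also avoids stopping early when the file contains blank lines, since linecount only counts the non-blank ones.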

To get the word frequencies, it is better to use collections.Counter:

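from collections import Counter
with open("total data", "r") as openfile:
    c = Counter()
    for line in openfile:
        # split() with no arguments splits on any run of whitespace
        words = line.split()
        # update() adds one to the count of every word in the list
        c.update(words)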
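Once the loop has finished, the Counter can be queried like a dictionary. The sketch below is only an illustration in Python 3: the list named lines is hypothetical sample data standing in for the file contents.

```python
from collections import Counter

# Hypothetical in-memory lines standing in for the contents of the file.
lines = ["the cat sat", "", "the dog sat on the mat"]

c = Counter()
for line in lines:
    c.update(line.split())

print(c["the"])          # count for a single word
print(c["missing"])      # absent words count as 0 instead of raising KeyError
print(c.most_common(2))  # the two most frequent words with their counts
```

Because a missing key counts as zero, no "if word in count" check is needed before incrementing, which is the kind of check that kept the defaultdict attempt in the question from ever adding a word.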
