Word frequency in Python not working
Question
I am trying to count the frequencies of words in a text file using Python.
I am using the following code:
openfile=open("total data", "r")
linecount=0
for line in openfile:
    if line.strip():
        linecount+=1
count={}
while linecount>0:
    line=openfile.readline().split()
    for word in line:
        if word in count:
            count[word]+=1
        else:
            count[word]=1
    linecount-=1
print count
But I get an empty dictionary: "print count" gives {} as output.
I also tried using:
from collections import defaultdict
.
.
count=defaultdict(int)
.
.
if word in count:
    count[word]=count.get(word,0)+1
But I'm getting an empty dictionary again. I don't understand what I am doing wrong. Could someone please point it out?
Recommended answer
This loop, for line in openfile:, moves the file pointer to the end of the file, so the later openfile.readline() calls return empty strings and nothing is ever added to count.
So, if you want to read the data again, either move the pointer back to the start of the file with openfile.seek(0) or re-open the file.
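As a rough sketch of the seek(0) fix applied to the two-pass approach from the question: the helper name count_words and the temporary sample file are hypothetical, added only for illustration, and the code assumes Python 3 rather than the Python 2 print statement used in the question.

```python
import os
import tempfile


def count_words(path):
    """Count non-blank lines in a first pass, then word frequencies in a second pass."""
    count = {}
    with open(path, "r") as openfile:
        # First pass: iterating consumes the file and leaves the pointer at the end.
        linecount = sum(1 for line in openfile if line.strip())

        # Rewind so the second pass starts from the beginning again.
        openfile.seek(0)

        # Second pass: tally each whitespace-separated word.
        for line in openfile:
            for word in line.split():
                count[word] = count.get(word, 0) + 1
    return linecount, count


# Hypothetical sample data written to a temporary file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a b a\n\nc a\n")
    sample_path = tmp.name

linecount, count = count_words(sample_path)
os.remove(sample_path)
print(linecount, count)
```

Iterating over the file in the second pass, instead of calling readline() linecount times, also sidesteps a subtle issue in the question's loop: linecount only counts non-blank lines, so a file containing blank lines would stop being read before its last lines were reached.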
A better way to get the word frequencies is to use collections.Counter:
from collections import Counter
with open("total data", "r") as openfile:
    c = Counter()
    for line in openfile:
        words = line.split()
        c.update(words)
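Once the Counter is filled, the frequencies can be read back from it. The sketch below is only illustrative: it assumes Python 3 and uses a hypothetical in-memory string in place of the "total data" file, so it runs without any file on disk.

```python
import io
from collections import Counter

# Hypothetical in-memory text standing in for the "total data" file.
sample = io.StringIO("the cat sat\nthe cat ran\nthe end\n")

c = Counter()
for line in sample:
    c.update(line.split())

# Individual counts behave like dictionary lookups; missing words give 0.
the_count = c["the"]
dog_count = c["dog"]
print(the_count, dog_count)

# most_common(n) returns the n most frequent (word, count) pairs.
top_two = c.most_common(2)
print(top_two)
```

Because a Counter returns 0 for words it has not seen, there is no need for the "if word in count" check used in the question.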