Counting word frequency and making a dictionary from it

Views: 110 | Published: 2020/5/5 13:26:10 | Tags: python dictionary count readlines

Question

I want to take every word from a text file and count the word frequency in a dictionary.

Example: 'this is the textfile, and it is used to take words and count'

d = {'this': 1, 'is': 2, 'the': 1, ...}

I haven't gotten very far, and I can't see how to complete it. My code so far:

import sys

argv = sys.argv[1]
data = open(argv)
words = data.read()
data.close()
wordfreq = {}
for i in words:
    #there should be a counter and somehow it must fill the dict.

Recommended answer

If you don't want to use collections.Counter, you can write your own counting loop:

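import sys

filename = sys.argv[1]
# Read the whole file and split it into words on whitespace.
with open(filename) as fp:
    words = fp.read().split()

# Characters to strip from both ends of each word (and so on; extend as needed).
unwanted_chars = ".,-_"
wordfreq = {}
for raw_word in words:
    word = raw_word.strip(unwanted_chars)
    # First time we see a word, start its count at zero.
    if word not in wordfreq:
        wordfreq[word] = 0
    wordfreq[word] += 1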
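
For comparison, the collections.Counter route mentioned above might look like the sketch below. The helper name count_words and the default punctuation set are assumptions made for illustration; they are not part of the original answer.

```python
from collections import Counter

def count_words(text, unwanted_chars=".,-_"):
    """Count words in text, stripping unwanted_chars from both ends of each word."""
    stripped = (raw_word.strip(unwanted_chars) for raw_word in text.split())
    # Skip tokens that consisted only of punctuation.
    return Counter(word for word in stripped if word)

sample = "this is the textfile, and it is used to take words and count"
wordfreq = count_words(sample)
print(wordfreq["is"], wordfreq["and"], wordfreq["textfile"])  # 2 2 1
```

Counter is a dict subclass, so the result can be used wherever the hand-built wordfreq dictionary would be, and wordfreq.most_common(3) returns the three most frequent words.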

For finer control over what counts as a word, look at regular expressions.
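
As a rough sketch of the regular-expression approach, the snippet below pulls words out with re.findall instead of split and strip. The pattern [a-z']+ (lowercase ASCII letters and apostrophes) and the function name count_words_regex are assumptions chosen for illustration; a real text may need a different definition of a word.

```python
import re

def count_words_regex(text):
    """Count words, where a word is assumed to be a run of letters or apostrophes."""
    wordfreq = {}
    # Lowercase first so "This" and "this" are counted together.
    for word in re.findall(r"[a-z']+", text.lower()):
        wordfreq[word] = wordfreq.get(word, 0) + 1
    return wordfreq

sample = "This is the textfile, and it is used to take words and count."
freq = count_words_regex(sample)
print(freq["this"], freq["is"], freq["count"])  # 1 2 1
```

Because the pattern only matches the characters that belong to a word, punctuation anywhere in the text is dropped without maintaining a list of unwanted characters.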
