Count frequency of letters in a text file


Question

In Python, how can I iterate through a text file and count the number of occurrences of each letter? I realise I could just use a 'for x in file' statement to go through it and then set up 26 or so if/elif statements, but surely there is a better way to do it?

Thanks.

Solution

Use collections.Counter():

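from collections import Counter

c = Counter()
with open(file) as f:        # file: path to the text file
    for line in f:           # one line at a time, so large files are fine
        c += Counter(line)   # tallies every character, including spaces and '\n'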
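
The loop above tallies every character it sees, so spaces, digits, punctuation and newline characters all end up in the result. Below is a minimal sketch of one way to restrict the tally to letters, assuming case should be ignored. The function name `count_letters`, its `path` parameter and the UTF-8 encoding are illustrative assumptions, not part of the original answer.

```python
from collections import Counter


def count_letters(path):
    """Count alphabetic characters in the file at `path`, ignoring case."""
    counts = Counter()
    # The encoding is an assumption; adjust it to match the actual file.
    with open(path, encoding="utf-8") as f:
        for line in f:
            # update() adds to the existing tallies in place, without
            # building a temporary Counter for each line.
            counts.update(ch for ch in line.lower() if ch.isalpha())
    return counts
```

Note that `str.isalpha()` also accepts non-ASCII letters such as `é`; if only `a` to `z` should be counted, a membership test against `string.ascii_lowercase` could be used instead.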

If the file is not too large, you can read all of it into memory as a single string and convert it into a Counter object in one line of code:

with open(file) as f:
    c = Counter(f.read())

Example:

>>> c = Counter()
>>> c += Counter('aaabbbcccddd eee fff ggg')
>>> c
Counter({'a': 3, ' ': 3, 'c': 3, 'b': 3, 'e': 3, 'd': 3, 'g': 3, 'f': 3})
>>> c += Counter('aaabbbccc')
>>> c
Counter({'a': 6, 'c': 6, 'b': 6, ' ': 3, 'e': 3, 'd': 3, 'g': 3, 'f': 3})
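
Once the counts are in a Counter, `most_common()` lists them from most to least frequent, and dividing each count by the total gives relative frequencies. Here is a small sketch using a made-up sample string; the helper name `relative_frequencies` is illustrative only.

```python
from collections import Counter


def relative_frequencies(counts):
    """Map each key to its share of the total count (between 0.0 and 1.0)."""
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing was counted, so there are no shares to report
    return {key: n / total for key, n in counts.items()}


c = Counter("aaabbc")
print(c.most_common(2))         # [('a', 3), ('b', 2)]
print(relative_frequencies(c))  # each character's share of the total
```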

Alternatively, use the count() method of strings:

from string import ascii_lowercase  # ascii_lowercase == 'abcdefghijklmnopqrstuvwxyz'

with open(file) as f:        # file: path to the text file
    text = f.read()
dic = {}
for x in ascii_lowercase:
    dic[x] = text.count(x)   # case-sensitive: uppercase letters are not counted
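
The `count()` approach scans the whole text once per letter (26 passes), whereas a Counter needs a single pass, though for small files the difference is unlikely to matter. Below is a sketch of the same idea written as a dict comprehension, with the text lower-cased first so that uppercase letters are included. The helper name `count_with_str_count` and the sample string are assumptions for illustration.

```python
from collections import Counter
from string import ascii_lowercase


def count_with_str_count(text):
    """Count a-z in `text`, ignoring case; letters that never occur map to 0."""
    lowered = text.lower()
    return {letter: lowered.count(letter) for letter in ascii_lowercase}


sample = "The quick brown fox"
by_count = count_with_str_count(sample)
by_counter = Counter(sample.lower())

# Both approaches agree on every ASCII letter; a Counter returns 0 for
# keys it has never seen, so missing letters compare equal as well.
assert all(by_count[ch] == by_counter[ch] for ch in ascii_lowercase)
```

Unlike the Counter, this dictionary always has exactly 26 keys, which can be convenient when a fixed-size table is wanted.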
