Python Counter from txt file


Question

I would like to initialize a collections.Counter object from a text file of word frequency counts. That is, I have a file "counts.txt":

rank  wordform         abs     r        mod
   1  the           225300    29   223066.9
   2  and           157486    29   156214.4
   3  to            134478    29   134044.8
...
 999  fallen           345    29      326.6
1000  supper           368    27      325.8

I would like a Counter object wordCounts such that I can call

>>> print(wordCounts.most_common(3))
[('the', 225300), ('and', 157486), ('to', 134478)]

What is the most efficient, Pythonic way to do this?

Recommended answer

Here are two versions. The first reads your counts.txt as a regular text file. The second treats it as a CSV file (which is roughly what it looks like).

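from collections import Counter

with open('counts.txt') as f:
    # Split each line on runs of whitespace into its columns
    lines = [line.strip().split() for line in f]

# Skip the header row; column 1 is the wordform, column 2 the absolute count
wordCounts = Counter({line[1]: int(line[2]) for line in lines[1:]})
print(wordCounts.most_common(3))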
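
The version above assumes every line after the header has at least three columns. If the file might contain blank lines or placeholder rows (such as a literal "..." line), a slightly more defensive variant could look like the sketch below. The function name load_counts and the in-memory sample are my own illustration, not part of the original answer, and the sketch assumes the count column holds plain non-negative integers.

```python
from collections import Counter


def load_counts(lines):
    """Build a Counter from whitespace-separated rows: rank, wordform, abs, ..."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip the header, blank lines, and placeholder rows such as "..."
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        # Add rather than assign, so a repeated wordform is summed
        counts[fields[1]] += int(fields[2])
    return counts


# In-memory stand-in for counts.txt (hypothetical sample mirroring the question)
sample = """\
rank  wordform         abs     r        mod
   1  the           225300    29   223066.9
   2  and           157486    29   156214.4
   3  to            134478    29   134044.8
...
 999  fallen           345    29      326.6
1000  supper           368    27      325.8
"""

wordCounts = load_counts(sample.splitlines())
print(wordCounts.most_common(3))
# [('the', 225300), ('and', 157486), ('to', 134478)]
```

With a real file, the same function accepts the open file object directly, for example load_counts(f) inside a with open('counts.txt') as f: block, since it only iterates over lines.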

If your data file turns out to be delimited by some consistent character or string, you could use a csv.DictReader object to parse the file.

Shown below is how it could be done if your file were tab-delimited.

The data file (edited by me to be tab-delimited):

rank    wordform    abs r   mod
1   the 225300  29  223066.9
2   and 157486  29  156214.4
3   to  134478  29  134044.8
999 fallen  345 29  326.6
1000    supper  368 27  325.8

Code

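from csv import DictReader
from collections import Counter

# newline='' is what the csv module documentation recommends for file objects
with open('counts.txt', newline='') as f:
    reader = DictReader(f, delimiter='\t')
    # Each row is a dict keyed by the header names
    wordCounts = Counter({row['wordform']: int(row['abs']) for row in reader})

print(wordCounts.most_common(3))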
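
To try the DictReader approach without creating a file, the same parsing can be run against an in-memory string. This is only a sketch: the variable names tsv_text and tsv_counts are made up for illustration, and the sample rows are copied from the tab-delimited data above.

```python
import io
from collections import Counter
from csv import DictReader

# Tab-delimited sample held in memory; stands in for the edited counts.txt
tsv_text = (
    "rank\twordform\tabs\tr\tmod\n"
    "1\tthe\t225300\t29\t223066.9\n"
    "2\tand\t157486\t29\t156214.4\n"
    "3\tto\t134478\t29\t134044.8\n"
    "999\tfallen\t345\t29\t326.6\n"
    "1000\tsupper\t368\t27\t325.8\n"
)

# DictReader takes any iterable of lines, so a StringIO works like a file
reader = DictReader(io.StringIO(tsv_text), delimiter="\t")
tsv_counts = Counter({row["wordform"]: int(row["abs"]) for row in reader})
print(tsv_counts.most_common(3))
# [('the', 225300), ('and', 157486), ('to', 134478)]
```

Because each row is keyed by header name rather than column position, this variant keeps working if the columns are reordered, as long as the wordform and abs headers are present.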
