Python Counter: counting words, not letters
Problem description
I'm trying to create a program that reads in a text file and finds the number of individual words. I have worked out most of it, but I am stuck on getting the counter to pick out words rather than letters, which is what it currently does.
import collections

with open("file.txt", "r") as myfile:
    data = myfile.read()
[i.split(" ") for i in data]
x = collections.Counter(data)
print(x)
My aim was to split the text on whitespace, so that each word would become an item in a list. This, however, did not work.
Result:
Counter({' ': 1062, 'e': 678, 't': 544, 'o': 448, 'n': 435, 'a': 405, 'i': 401, 'r': 398, 's': 329, 'c': 268, 'm': 230, 'h': 216, 'u': 212, 'd': 190, 'l': 161, 'p': 148, 'f': 107, 'g': 75, 'y': 68, '\n': 65, ',': 61, 'b': 55, 'w': 55, 'v': 55, '.': 53, 'N': 32, 'A': 20, 'T': 19, '"': 18, ')': 17, '(': 17, 'C': 17, 'k': 16, "'": 16, 'I': 16, 'x': 15, '-': 14, 'E': 13, 'q': 12, 'V': 10, 'U': 9, ';': 7, '1': 6, 'j': 5, '4': 5, 'P': 5, 'D': 5, '9': 5, 'L': 4, 'z': 4, 'W': 4, 'O': 3, 'F': 3, '5': 3, 'J': 2, '3': 2, 'S': 2, 'R': 2, '0': 1, ':': 1, 'H': 1, '2': 1, '/': 1, 'B': 1, 'M': 1, '7': 1})
Recommended answer
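Your list comprehension is never assigned to anything, so its result is discarded and data is left unchanged. It also iterates over data one character at a time, so it would not produce words even if you kept the result. Counter(data) therefore still counts the individual characters of the string.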
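To see why the original code counts letters, here is a minimal sketch. The sample string is hypothetical and stands in for the contents of file.txt; the variable names pieces and char_counts are only for illustration.

```python
from collections import Counter

# Hypothetical sample text standing in for the contents of file.txt.
data = "the cat and the hat"

# Iterating over a string yields one character at a time, so each i
# is a single character and i.split(" ") is a tiny list per character.
pieces = [i.split(" ") for i in data]
print(pieces[:4])  # [['t'], ['h'], ['e'], ['', '']]

# The comprehension built a new list; data itself is untouched,
# so Counter(data) still counts characters, spaces included.
char_counts = Counter(data)
print(char_counts["t"])  # 4
print(char_counts[" "])  # 4
```

In the question the comprehension's result was not even stored, so the line had no visible effect at all.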
Pass the split text to collections.Counter() instead:
x = collections.Counter(data.split())
I used str.split() without arguments so that the text is split on runs of whitespace of any length, and so that newlines are treated as separators too. Your Counter() output, for example, contains 65 newline characters that should not be there.
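The difference between the two forms of str.split() can be checked on a small string. This is only an illustrative sketch; the sample text is made up and deliberately contains a double space, a newline, and a trailing space.

```python
# Hypothetical sample: double space, a newline, and a trailing space.
text = "one  two\nthree "

# Splitting on a single space keeps empty strings and leaves the newline in place.
by_space = text.split(" ")
print(by_space)  # ['one', '', 'two\nthree', '']

# Splitting with no argument treats any run of whitespace as one separator
# and drops leading and trailing whitespace.
by_whitespace = text.split()
print(by_whitespace)  # ['one', 'two', 'three']
```

Feeding by_space into a Counter would count the empty string and 'two\nthree' as words, which is why the no-argument form is the safer choice here.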
In context, and a little more compact:
from collections import Counter

with open("file.txt") as myfile:
    x = Counter(myfile.read().split())

print(x)
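Since the stated goal was to find the number of individual words, the resulting Counter can be queried for that directly. The sketch below assumes a short hypothetical string in place of the file contents; distinct_words and total_words are illustrative names, not part of the original answer.

```python
from collections import Counter

# Hypothetical sample text standing in for myfile.read().
text = "the cat and the hat and the bat"
x = Counter(text.split())

# A Counter is a dict subclass, so len() gives the number of distinct words.
distinct_words = len(x)
print(distinct_words)  # 5

# Summing the counts gives the total number of words in the text.
total_words = sum(x.values())
print(total_words)  # 8

# most_common(n) returns the n most frequent words with their counts.
print(x.most_common(2))  # [('the', 3), ('and', 2)]
```

Note that this treats "The" and "the", or "hat" and "hat,", as different words; whether to normalise case or strip punctuation depends on what you want to count.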