Trying to output the x most common words in a text file
Problem description
I'm trying to write a program that reads a text file and outputs a list of the most common words (30, as the code is written now) along with their counts. Something like:
word1 count1
word2 count2
word3 count3
... ...
... ...
wordn countn
The list should be in order of count1 > count2 > count3 > ... > countn. This is what I have so far, but I cannot get the sorted function to do what I want. The error I get now is:
TypeError: list indices must be integers, not tuple
I'm new to Python. Any help would be appreciated. Thank you.
def count_func(dictionary_list):
    return dictionary_list[1]

def print_top(filename):
    word_list = {}
    with open(filename, 'r') as input_file:
        count = 0
        #best
        for line in input_file:
            for word in line.split():
                word = word.lower()
                if word not in word_list:
                    word_list[word] = 1
                else:
                    word_list[word] += 1

    #sorted_x = sorted(word_list.items(), key=operator.itemgetter(1))
    #  items = sorted(word_count.items(), key=get_count, reverse=True)
    word_list = sorted(word_list.items(), key=lambda x: x[1])
    for word in word_list:
        if (count > 30):#19
            break
        print "%s: %s" % (word, word_list[word])
        count += 1

# This basic command line argument parsing code is provided and
# calls the print_words() and print_top() functions which you must define.
def main():
    if len(sys.argv) != 3:
        print 'usage: ./wordcount.py {--count | --topcount} file'
        sys.exit(1)

    option = sys.argv[1]
    filename = sys.argv[2]
    if option == '--count':
        print_words(filename)
    elif option == '--topcount':
        print_top(filename)
    else:
        print 'unknown option: ' + option
        sys.exit(1)

if __name__ == '__main__':
    main()
Recommended answer
Some unsolicited advice: Don't make so many functions until everything is working as one big block of code. Refactor into functions after it works. You don't even need a main section for a script this small.
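As for the TypeError itself: sorted(word_list.items(), ...) returns a list of (word, count) tuples, so inside the loop each `word` is a tuple, and `word_list[word]` tries to index a list with a tuple. The sort is also ascending, so the least common words come first. Here is a minimal sketch of one possible fix, not necessarily how the original code was meant to end up. The function name `top_words` and the parameter `n` are made up for illustration, and the code assumes Python 3 syntax (print as a function), unlike the Python 2 code in the question.

```python
def top_words(lines, n=30):
    """Return up to n (word, count) pairs, most frequent first.

    `lines` is any iterable of strings, e.g. an open file object.
    Hypothetical helper for illustration; not part of the original post.
    """
    counts = {}
    for line in lines:
        for word in line.split():
            word = word.lower()
            counts[word] = counts.get(word, 0) + 1
    # sorted() over .items() yields a list of (word, count) tuples;
    # reverse=True puts the highest counts first.
    ranked = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


if __name__ == '__main__':
    sample = ["the cat and the hat", "The end"]
    # Unpack each tuple instead of indexing the sorted list with it.
    for word, count in top_words(sample, n=2):
        print("%s: %s" % (word, count))
```

To use it on a file, something like `with open(filename) as f: top_words(f)` should work. Slicing with `[:n]` also sidesteps the off-by-one in `count > 30`, which would print 31 entries rather than 30.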