Counting occurrences of items in an array in Python
Question
The purpose of this program is to read in a file, change all the words into individual tokens, and place these tokens into an array. The program then removes all punctuation and changes all letters to lowercase. Then the program should count how many times each command line argument occurs in the array, and print the result. My program is able to successfully create an array of depunctuated, lowercase tokens. My problem now is how to loop through the array and count the occurrences of a particular word, and how I should call these functions in the main function. My depunctuate function works as written.
Here is my program:
import sys
from scanner import *

def main():
    print("the name of the program is",sys.argv[0])
    for i in range(1,len(sys.argv),1):
        print(" argument",i,"is", sys.argv[i])
    tokens = readTokens("text.txt")
    cleanTokens = depunctuateTokens(tokens)
    words = [token.lower() for token in cleanTokens]
    count = find(words)
    print(words)
    print(count)

def readTokens(s):
    arr=[]
    s=Scanner("text.txt")
    token=s.readtoken()
    while (token != ""):
        arr.append(token)
        token=s.readtoken()
    s.close()
    return arr

def depunctuateTokens(arr):
    result=[]
    for i in range(0,len(arr),1):
        string=arr[i]
        cleaned=""
        punctuation="""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~"""
        for i in range(0,len(string),1):
            if string[i] not in punctuation:
                cleaned += string[i]
        result.append(cleaned)
    return result

def find(tokens,words):
    return occurences(tokens,words)>0

def occurences(tokens,words):
    count = 0
    for i in range(0,len(words),1):
        if (words[i] == tokens):
            count += 1
        return count

main()
Recommended answer
Your existing function isn't too far off:
def occurences(tokens,words):
    count = 0
    for i in range(0,len(words),1):
        if (words[i] == tokens):
            count += 1
        return count
The first problem is that you've indented the `return count` inside the `for` loop. That means it will `return` on the first pass through the loop, so it will only ever process the first word: it returns 1 if the first word matches and 0 otherwise. Just unindent that `return` and that problem goes away.
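To make the effect of the indentation concrete, here is a minimal sketch that puts the early-return version next to the unindented one. The function names `occurences_early_return` and `occurences_fixed`, the parameter name `target`, and the sample list are all hypothetical, chosen only for this illustration; passing a single string as the first argument is also an assumption.

```python
def occurences_early_return(target, words):
    """Version from the question: the return sits inside the for loop."""
    count = 0
    for i in range(0, len(words), 1):
        if words[i] == target:
            count += 1
        return count  # runs on the first pass, so only words[0] is checked


def occurences_fixed(target, words):
    """Same logic with the return moved out of the loop."""
    count = 0
    for i in range(0, len(words), 1):
        if words[i] == target:
            count += 1
    return count  # runs once, after every word has been checked


sample = ["cat", "the", "saw", "the", "dog"]  # illustrative data
print(occurences_early_return("the", sample))  # 0, only "cat" was examined
print(occurences_fixed("the", sample))         # 2
```

Note that the early-return version also falls off the end of the function when `words` is empty, so it returns `None` rather than 0 in that case.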
The second problem is that, judging by the names of the parameters, you're expecting both `tokens` and `words` to be lists of strings. So a single word `words[i]` is never going to match a whole list of tokens. Maybe you wanted to test whether that word matches any of the tokens in the list, instead of whether it matches the list itself? In that case, you'd write:
if words[i] in tokens:
Finally, while your `find` function seems to call `occurences` properly (well, you spelled "occurrences" wrong, but you did so consistently, so that's OK), you don't actually call `find` properly, so you'll never get here. Your call looks like this:
count = find(words)
… but your definition looks like this:
def find(tokens,words):
You have to pass something to that `tokens` parameter. I'm not sure what to pass, but you're the one who designed and wrote this code; what did you write the function for?
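The mismatch between the call and the definition can be reproduced in isolation. The sketch below reuses the question's two functions (with the `return` already moved out of the loop) and a made-up word list; treating the first argument as a single word in the last two calls is an assumption, not something the question states.

```python
def occurences(tokens, words):
    count = 0
    for i in range(0, len(words), 1):
        if words[i] == tokens:
            count += 1
    return count


def find(tokens, words):
    return occurences(tokens, words) > 0


words = ["the", "cat", "saw", "the", "dog"]  # illustrative data

try:
    find(words)  # one argument only, as in the question's main()
except TypeError as error:
    # Python refuses the call because the second parameter is missing.
    message = str(error)
    print("TypeError:", message)

# With both arguments supplied the call goes through.
print(find("the", words))   # True
print(find("bird", words))  # False
```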
I suspect that what you're really looking for is a count for each token. In that case, with your design, both `find` and `occurences` should take a single `token`, not a list of `tokens`, as an argument. That means you don't want the `in` expression above; you just want to rename the parameter. You also have no use for `find`; you can call `occurences` directly, in a loop like this:
for word in words:
    count = occurences(word, words)
    print('{}: {}'.format(word, count))
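Since the question asks for a count per command-line argument rather than per word in the file, the same loop can run over `sys.argv[1:]` instead. The following is only a sketch under a few assumptions: the helper `report` is a hypothetical name, the hard-coded `words` list stands in for the cleaned token list built in `main()`, and lowercasing each argument before comparing is my guess at the intended behaviour, because the tokens themselves are lowercased.

```python
import sys


def occurences(token, words):
    """Count how many entries of words equal token."""
    count = 0
    for word in words:
        if word == token:
            count += 1
    return count


def report(targets, words):
    """Return one (target, count) pair per requested word (hypothetical helper)."""
    return [(target, occurences(target.lower(), words)) for target in targets]


if __name__ == "__main__":
    # Stand-in for the depunctuated, lowercased tokens read from the file.
    words = ["the", "cat", "saw", "the", "dog"]
    for target, count in report(sys.argv[1:], words):
        print('{}: {}'.format(target, count))
```

Run with arguments such as `The bird`, this would print one line per argument, each followed by its count in the list.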
And, just as your other two functions reproduce functionality that is already built in (`str.translate` and `lower`), this one does too: `list.count`. If you were supposed to write it yourself for learning purposes, that's fine, but if that's not part of the assignment, just use the built-in method.
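For comparison, here is a short sketch of the built-in route, assuming Python 3 (where `str.maketrans` accepts a third argument listing characters to delete). The helper name `clean_tokens` and the sample data are made up for illustration; `string.punctuation` contains the same characters as the punctuation string in the question.

```python
import string


def clean_tokens(tokens):
    """Strip ASCII punctuation and lowercase each token (hypothetical helper)."""
    # The third argument to maketrans lists the characters to delete.
    table = str.maketrans("", "", string.punctuation)
    return [token.translate(table).lower() for token in tokens]


raw = ["The", "cat,", "saw", "THE", "dog!"]  # illustrative data
words = clean_tokens(raw)
print(words)               # ['the', 'cat', 'saw', 'the', 'dog']
print(words.count("the"))  # 2
print(words.count("bird")) # 0
```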