Counting occurrences of items in an array in Python

Problem description

The purpose of this program is to read in a file, split the text into individual tokens, and place these tokens into an array. The program then removes all punctuation and changes all letters to lowercase. Finally, it should count how many times each command line argument occurs in the array and print the result. My program already builds an array of depunctuated, lowercase tokens. My problem now is how to loop through the array and count the occurrences of a particular word, and how I should call these functions from the main function. My depunctuate function works as written.

Here is my program:

import sys
from scanner import *

def main():
    print("the name of the program is",sys.argv[0])
    for i in range(1,len(sys.argv),1):
        print("   argument",i,"is", sys.argv[i])
    tokens = readTokens("text.txt")
    cleanTokens = depunctuateTokens(tokens)
    words = [token.lower() for token in cleanTokens]
    count = find(words)
    print(words)
    print(count)
def readTokens(s):
    arr=[]
    s=Scanner("text.txt")
    token=s.readtoken()
    while (token != ""):
        arr.append(token)
        token=s.readtoken()
    s.close()
    return arr

def depunctuateTokens(arr):
    result=[]
    for i in range(0,len(arr),1):
        string=arr[i]
        cleaned=""
        punctuation="""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~"""
        for i in range(0,len(string),1):
            if string[i] not in punctuation:
                cleaned += string[i]
        result.append(cleaned)
    return result

def find(tokens,words):
    return occurences(tokens,words)>0

def occurences(tokens,words):
    count = 0
    for i in range(0,len(words),1):
        if (words[i] == tokens):
            count += 1
        return count

main()

Recommended answer

Your existing function isn't too far off:

def occurences(tokens,words):
    count = 0
    for i in range(0,len(words),1):
        if (words[i] == tokens):
            count += 1
        return count


The first problem is that you've indented the return count inside the for loop. That means it will return each time through the loop, which means it will only ever process the first word. So, it will return 1 if the first word matches, 0 otherwise. Just unindent that return and that problem goes away.
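
A minimal sketch of the difference, using hypothetical function names (count_with_early_return, count_all) and a made-up word list. Both versions compare against a single target string, so the only thing that differs is the indentation of the return:

```python
def count_with_early_return(target, words):
    count = 0
    for i in range(len(words)):
        if words[i] == target:
            count += 1
        return count  # indented inside the loop: runs at the end of the first iteration


def count_all(target, words):
    count = 0
    for i in range(len(words)):
        if words[i] == target:
            count += 1
    return count  # unindented: runs once, after the whole list has been scanned


words = ["the", "cat", "and", "the", "hat"]  # made-up sample data
print(count_with_early_return("the", words))  # 1
print(count_all("the", words))                # 2
```

Note that the early-return version also returns None for an empty list, because the loop body never runs and the function falls off the end.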

The second problem is that, judging by the names of the parameters, you're expecting both tokens and words to be lists of strings. So, a single word words[i] is never going to match a whole list of tokens. Maybe you wanted to test whether that word matches any of the tokens in the list, instead of whether it matches the list? In that case, you'd write:

if words[i] in tokens:
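
To illustrate with made-up data (both lists below are assumptions, not taken from the question): a string never compares equal to a list, whereas in tests whether the string is one of the list's elements.

```python
tokens = ["the", "cat"]                        # hypothetical list of search terms
words = ["the", "dog", "saw", "the", "cat"]    # hypothetical cleaned word list

print(words[0] == tokens)   # False: a str is never equal to a list
print(words[0] in tokens)   # True: "the" is one of the tokens

# Count how many words match any of the tokens.
matches = sum(1 for w in words if w in tokens)
print(matches)              # 3
```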


Finally, while your find function seems to call occurences properly (well, you spelled occurrences wrong, but you did so consistently, so that's OK), you don't actually call find properly, so you'll never get here. Your call looks like this:

count = find(words)

… but your definition looks like this:

def find(tokens,words):

You have to pass something to that tokens parameter. I'm not sure what to pass, but you're the one who designed and wrote this code: what did you write the function for?
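
As a small demonstration of why the call never reaches the counting code, here is a sketch with a stand-in body for find (the body and the sample list are assumptions for illustration only). Python binds the single argument to the first parameter, tokens, and then raises a TypeError because words is missing:

```python
def find(tokens, words):
    # Stand-in body for illustration; the question's version calls occurences.
    return words.count(tokens) > 0


words = ["the", "cat"]  # made-up sample data

try:
    find(words)  # only one argument for a two-parameter function
except TypeError as exc:
    print("TypeError:", exc)

print(find("cat", words))  # True: both parameters supplied
```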

I suspect that what you're really looking for is a count for each token. In that case, with your design, both find and occurences should take a single token as an argument, not a list of tokens. You then don't want the in expression above; you just want to rename the parameter from tokens to token. You also have no use for find, and can call occurences directly, in a loop like this:

for word in words:
    count = occurences(word, words)
    print('{}: {}'.format(word, count))
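
Putting the pieces together, here is one possible sketch. It assumes, going by the question's stated goal, that the words to count are the command line arguments rather than every entry of words. The report helper, the lowercasing of each argument, and the sample list are assumptions added for illustration; occurences keeps the question's spelling.

```python
import sys


def occurences(token, words):
    """Count how many entries of words equal the single string token."""
    count = 0
    for word in words:
        if word == token:
            count += 1
    return count  # outside the loop, so every word is examined


def report(targets, words):
    """Print one 'word: count' line per target and return the counts."""
    counts = {}
    for target in targets:
        # Lowercase the target because the word list was lowercased too.
        counts[target] = occurences(target.lower(), words)
        print('{}: {}'.format(target, counts[target]))
    return counts


if __name__ == '__main__':
    # Stand-in for the cleaned, lowercased list built in main().
    words = ['the', 'cat', 'saw', 'the', 'hat']
    report(sys.argv[1:], words)
```

Iterating over sys.argv[1:] rather than over words avoids printing the same word once per occurrence.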

And, just as your other two functions reproduce functionality that is already built in (str.translate and str.lower), this one does too: list.count. If you were supposed to write it yourself for learning purposes, that's fine, but if that's not part of the assignment, just use the built-in method.
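
A short sketch of those built-ins on made-up tokens. The sample list is an assumption, and collections.Counter is an addition not mentioned in the answer, included as one option when counts for many words are needed at once:

```python
import string
from collections import Counter

raw = ["The", "cat,", "saw", "the", "hat!"]  # made-up tokens

# str.translate with a table that deletes every ASCII punctuation character.
strip_punct = str.maketrans("", "", string.punctuation)
words = [token.translate(strip_punct).lower() for token in raw]
print(words)                # ['the', 'cat', 'saw', 'the', 'hat']

# list.count does the same job as a hand-written counting loop.
print(words.count("the"))   # 2

# Counter tallies every word in one pass; missing words count as 0.
counts = Counter(words)
print(counts["the"], counts["dog"])  # 2 0
```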
