Counting the most frequent word in a list

Problem description

I am trying to build a program that, when a user enters a name, returns the number of people with that name. If they type 'most', it should return the name used the most. I have the part that counts an individual name working, but I am not sure how to make the program find and count the most frequently appearing name.

import re
from collections import Counter

data = ('Billy Bob', 'Misty', 'Leroy', 'Leroy', 'Leroy', 'Billy Bob', 'Betty Sue',
        'Billy Bob', 'Betty Sue', 'Misty', 'Betty Sue', 'Betty Sue',
        'Misty', 'Betty Sue', 'Horace', 'Misty', 'Betty Sue', 'Misty',
        'Leroy', 'Betty Sue', 'Misty', 'Doug', 'Misty', 'Wilma', 'Jesse',
        'Misty', 'Billy Bob', 'Betty Sue', 'Betty Sue', 'Leroy', 'Misty',
        'Leroy', 'Jesse Jr', 'Betty Sue', 'Betty Sue', 'Misty', 'Misty',
        'Misty', 'Betty Sue', 'Misty', 'Misty', 'Misty', 'Leroy', 'Leroy',
        'Bailey', 'Peggy', 'Leroy', 'Billy Bob', 'Leroy', 'Leroy', 'Misty',
        'Paris', 'Leroy', 'Leroy', 'Misty Mae', 'Leroy', 'Misty', 'Leroy',
        'Bart', 'Big Daddy', 'Betty Sue', 'Billy Bob', 'Betty Sue',
        'LeeAnne', 'Billy Bob', 'Leroy', 'Betty Sue', 'Leroy', 'Betty Sue',
        'Misty', 'Rowdy', 'Billy Bob', 'Ricky', 'Misty', 'Billy Bob',
        'Billy Bob', 'Billy Bob', 'EvaSue', 'Mark', 'Betty Sue', 'Leroy',
        'Betty Sue', 'Billy Bob', 'Leroy', 'Leroy', 'Billy Bob', 'Billy Bob',
        'Billy Bob', 'Billy Bob', 'Billy Bob', 'Misty', 'Rob', 'Betty Sue',
        'SuelySue', 'Billy Bob', 'Misty', 'Betty Sue', 'Misty', 'Billy Bob',
        'Betty Sue', 'Leroy', 'Misty', 'Billy Bob', 'Misty', 'Misty',
        'Billy Bob', 'Billy Bob', 'Billy Bob', 'Billy Bob', 'Leroy', 'Jesse Jr Jr',
        'Billy Bob', 'Grady', 'Leroy', 'Billy Bob', 'Leroy', 'Billy Bob',
        'Betty Sue', 'Billy Bob', 'Misty', 'Louise', 'Leroy', 'Betty Sue',
        'Leroy', 'Betty Sue', 'Leroy', 'Betty Sue', 'Betty Sue',
        'Billy Bob', 'Leroy', 'Jenny Jae', 'Misty', 'Betty Sue', 'Billy Bob',
        'Leroy', 'Billy Bob', 'Jesse', 'Misty', 'Misty', 'Leroy',
        'Betty Sue', 'BJ', 'Misty', 'Leroy', 'Boris', 'Misty', 'Billy Bob', 'Pegs',
        'Misty', 'Leslie', 'James', 'Melvin', 'Misty', 'Betty Sue',
        'Mary Beth', 'Billy Bob', 'Betty Sue', 'Billy Bob', 'Misty', 'Betty Sue',
        'Leroy', 'Billy Bob', 'Billy Bob', 'BethAnne', 'Leroy', 'Betty Sue',
        'Bett', 'Billy Bob', 'Misty', 'Misty', 'Billy Bob', 'Leroy',
        'Billy Bob', 'Billy Bob', 'Misty', 'Billy Bob', 'Raina', 'Betty Sue',
        'Misty', 'Misty', 'Misty', 'Betty Sue', 'Mikey', 'Betty Sue',
        'Billy Bob', 'Misty', 'Betty Sue', 'Leroy', 'Betty Sue', 'Billy Bob',
        'Betty Sue', 'Billy Bob', 'Betty Sue', 'Louise Jr', 'Billy Bob',
        'Misty', 'Leroy', 'Leroy', 'Billy Bob', 'Billy Bob', 'Misty',
        'Leroy', 'Leroy', 'Leroy', 'Billy Bob', 'Betty Sue', 'Billy Bob',
        'Betty Sue', 'Misty', 'Betty Sue', 'Betty Sue', 'Misty',
        'Betty Sue', 'Horace', 'Misty', 'Betty Sue', 'Misty', 'Leroy', 'Betty Sue',
        'Misty', 'Doug', 'Misty', 'Wilma', 'Jesse', 'Misty', 'Billy Bob',
        'Betty Sue', 'Betty Sue', 'Leroy', 'Misty', 'Leroy', 'Jesse Jr',
        'Betty Sue', 'Betty Sue', 'Misty', 'Misty', 'Misty', 'Betty Sue',
        'Misty', 'Misty', 'Misty', 'Leroy', 'Leroy', 'Bailey', 'Peggy',
        'Leroy', 'Billy Bob', 'Leroy', 'Leroy', 'Misty', 'Paris', 'Leroy',
        'Leroy', 'Misty Mae', 'Leroy', 'Misty', 'Leroy', 'Bart',
        'Big Daddy', 'Betty Sue', 'Billy Bob', 'Betty Sue', 'LeeAnne',
        'Billy Bob', 'Leroy', 'Betty Sue', 'Leroy', 'Betty Sue', 'Misty', 'Rowdy',
        'Billy Bob', 'Ricky', 'Misty', 'Billy Bob', 'Billy Bob',
        'Billy Bob', 'EvaSue', 'Mark', 'Betty Sue', 'Leroy', 'Betty Sue',
        'Billy Bob', 'Leroy', 'Leroy', 'Billy Bob', 'Billy Bob', 'Billy Bob',
        'Billy Bob', 'Billy Bob', 'Misty', 'Rob', 'Betty Sue', 'SuelySue',
        'Billy Bob', 'Misty', 'Betty Sue', 'Misty', 'Billy Bob',
        'Betty Sue', 'Leroy', 'Misty', 'Billy Bob', 'Misty', 'Misty', 'Billy Bob',
        'Billy Bob', 'Billy Bob', 'Billy Bob', 'Leroy', 'Jesse Jr Jr',
        'Billy Bob', 'Grady', 'Leroy', 'Billy Bob', 'Leroy', 'Billy Bob',
        'Betty Sue', 'Billy Bob', 'Misty', 'Louise', 'Leroy', 'Betty Sue',
        'Leroy', 'Betty Sue', 'Leroy', 'Betty Sue', 'Betty Sue',
        'Billy Bob', 'Leroy', 'Jenny Jae', 'Misty', 'Betty Sue', 'Billy Bob',
        'Leroy', 'Billy Bob', 'Jesse', 'Misty', 'Misty', 'Leroy',
        'Betty Sue', 'BJ', 'Misty', 'Leroy', 'Boris', 'Misty', 'Billy Bob', 'Pegs',
        'Misty', 'Leslie', 'James', 'Melvin', 'Misty', 'Betty Sue',
        'Mary Beth', 'Billy Bob', 'Betty Sue', 'Billy Bob', 'Misty', 'Betty Sue',
        'Leroy', 'Billy Bob', 'Billy Bob', 'BethAnne', 'Leroy', 'Betty Sue',
        'Bett', 'Billy Bob', 'Misty', 'Misty', 'Billy Bob', 'Leroy',
        'Billy Bob', 'Billy Bob', 'Misty', 'Billy Bob', 'Raina', 'Betty Sue',
        'Misty', 'Misty', 'Misty', 'Betty Sue', 'Mikey', 'Betty Sue',
        'Billy Bob', 'Misty', 'Betty Sue', 'Leroy', 'Betty Sue', 'Billy Bob',
        'Betty Sue', 'Billy Bob', 'Betty Sue', 'Louise Jr', 'Billy Bob',
        'Misty', 'Leroy', 'Leroy', 'Billy Bob', 'Billy Bob', 'Betty Sue')

print('''Welcome to the White Valley Name Counter. Enter a name, or "most" to see what name is the most used in this great city!''')

print()
keepgoing = 'y'

while keepgoing == 'y':
    count = 0
    search = input("What name do you want to search for in White Valley database? ")
    print()
    data_list = list(data)
    if search != "most":
        print("There are {} people named {}".format(data_list.count(search),search))
        print()
    elif search == "most":
        print("{} is the most common. There are {} of them".format(
                data_list.count.most_common(data_list), search))
    keepgoing = input('''Want to search another name ("y" for yes)? ''')
    print()

I am trying to make the output look like this:

Welcome to the White Valley Name Counter. Enter a name, or "most" to see what name is the most used in this great city!

What name do you want to search for in White Valley database? john

There are 0 people named john

Want to search another name ("y" for yes)? y

What name do you want to search for in White Valley database? Betty Sue

There are 79 people named Betty Sue

Want to search another name ("y" for yes)? y

What name do you want to search for in White Valley database? most

Billy Bob is most common. There are 93 of them

Want to search another name ("y" for yes)? n

Recommended answer

Counting the number of times a user-specified name occurs is pretty easy!

Let's write a little function to handle that and return the result.

names = ("billy", "bob", "pete", "bob", "pete", "bob")


def count_my_name(name):
    # tuple.count returns how many entries compare equal to `name`.
    return "The name %s occurs %s times." % (name, names.count(name))

If we print the result for the name pete, we get the following:

The name pete occurs 2 times.

Now, for finding the most common name in the list, we can write another neat little function to handle that and return the result for us.

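names = ("billy", "bob", "pete", "bob", "pete", "bob")


def get_most_common_name():
    # A set drops duplicates, so each distinct name is visited once.
    all_names = set(names)
    # Highest occurrence count among the distinct names.
    most_common = max([names.count(i) for i in all_names])
    for name in all_names:
        # Return the first name whose count equals that maximum.
        if names.count(name) == most_common:
            return "The most common name is %s occurring a total of %s times." % (name, most_common)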

This tells us that the most common name is bob, which occurs a total of 3 times.
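
One detail worth noting: the function returns as soon as it finds a name whose count equals the maximum, so when several names are tied only one of them is reported, and which one depends on the set's iteration order. Below is a minimal sketch that reports every tied name instead. The function get_most_common_names and its names parameter are hypothetical and not part of the answer.

```python
def get_most_common_names(names):
    """Return (sorted list of names tied for the top count, that count)."""
    all_names = set(names)
    # Highest occurrence count among the distinct names.
    most_common = max(names.count(name) for name in all_names)
    # Keep every name that reaches the maximum, sorted for a stable result.
    tied = sorted(name for name in all_names if names.count(name) == most_common)
    return tied, most_common


print(get_most_common_names(("billy", "bob", "pete", "bob", "pete", "bob")))
# (['bob'], 3)
print(get_most_common_names(("ann", "bob", "ann", "bob", "cy")))
# (['ann', 'bob'], 2)
```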

Okay, now some explanation of the second function: what are we actually doing here?

First we take our tuple named names. It has names in it, but some of them are duplicates, and we don't want to iterate over the same name multiple times. So we create a new variable called all_names and build a set from that tuple.

Sets are useful here since they remove any duplicates for us.
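
To illustrate that point, here is a small sketch using the same names tuple. Sets are unordered, so the example sorts the result only to display it predictably.

```python
names = ("billy", "bob", "pete", "bob", "pete", "bob")

# Building a set keeps one copy of each distinct name.
all_names = set(names)

print(len(names))         # 6 entries in the tuple
print(len(all_names))     # 3 distinct names
print(sorted(all_names))  # ['billy', 'bob', 'pete']
```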

Now we can find the highest number of times any name occurs in names using:

most_common = max([names.count(i) for i in all_names])

This gives us the count for the name that occurs most often in our tuple, which is 3.
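
To see where that 3 comes from, the sketch below spells out the per-name counts that feed into max. The dictionary name counts is only illustrative and does not appear in the answer's function.

```python
names = ("billy", "bob", "pete", "bob", "pete", "bob")
all_names = set(names)

# One count per distinct name; `counts` is an illustrative helper variable.
counts = {name: names.count(name) for name in all_names}
most_common = max(counts.values())

print(counts["billy"], counts["bob"], counts["pete"])  # 1 3 2
print(most_common)  # 3
```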

Now we simply iterate over our set all_names and count how many times each name occurs in names.

If a name occurs in names as many times as our most_common value, we have found the name that occurs the most.
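
The question already imports collections.Counter, so as an alternative to the approach above (not the answer's own method), the same result can probably be reached with Counter.most_common. This is a minimal sketch: most_common_name is a hypothetical helper, and it assumes the input is not empty, since most_common(1) returns an empty list in that case.

```python
from collections import Counter

names = ("billy", "bob", "pete", "bob", "pete", "bob")


def most_common_name(values):
    """Return (name, count) for the most frequent entry in a non-empty sequence."""
    counts = Counter(values)  # tallies every entry in a single pass
    # most_common(1) returns a one-element list of (value, count) pairs.
    name, count = counts.most_common(1)[0]
    return name, count


name, count = most_common_name(names)
print("{} is the most common. There are {} of them".format(name, count))
# bob is the most common. There are 3 of them

# A Counter returns 0 for missing keys, which suits the single-name lookup too.
print(Counter(names)["john"])  # 0
```

Because the Counter is built once, this avoids calling names.count repeatedly, which matters more as the data grows toward the size of the tuple in the question.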

Hope this helps!
