Counting Avg Number of Words Per Sentence


Problem description

I'm having a bit of trouble trying to count the number of words per sentence. For my case, I'm assuming sentences only end with either "!", "?", or "."

I have a list that looks like this:

["Hey", "!", "How", "are", "you", "?", "I", "would", "like", "a", "sandwich", "."]

For the example above, the calculation would be (1 + 3 + 5) / 3, i.e. an average of 3 words per sentence. I'm having a hard time achieving this, though! Any ideas?

Solution

A simple solution:

mylist = ["Hey", "!", "How", "are", "you", "?", "I", "would", "like", "a", "sandwich", "."]
terminals = set([".", "?", "!"]) # sets are efficient for "membership" tests
terminal_count = 0

for item in mylist:
    if item in terminals: # here is our membership test
        terminal_count += 1

avg = (len(mylist) - terminal_count)  / float(terminal_count)

This assumes you only care about getting the average, not the individual counts per sentence.
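If you do want the individual counts as well, one possible sketch is to keep a running word count and flush it each time a terminator shows up. The function name `words_per_sentence` is made up for this illustration, and it assumes that trailing words with no closing terminator should be ignored and that two terminators in a row (e.g. "?", "!") count as an empty sentence of 0 words; adjust if your data needs something else.

```python
def words_per_sentence(tokens, terminals=frozenset({".", "?", "!"})):
    """Return the word count of each sentence that ends with a terminator."""
    counts = []
    current = 0  # words seen since the last terminator
    for token in tokens:
        if token in terminals:
            counts.append(current)  # sentence finished: record its length
            current = 0
        else:
            current += 1
    # Assumption: words after the last terminator are dropped, not counted.
    return counts


mylist = ["Hey", "!", "How", "are", "you", "?", "I", "would", "like", "a", "sandwich", "."]
counts = words_per_sentence(mylist)
avg = sum(counts) / float(len(counts))
```

For the example list this gives one count per sentence, and the average computed from those counts matches the one from the simpler approach.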

If you'd like to get a little fancy, you can replace the for loop with something like this:

terminal_count = sum(1 for item in mylist if item in terminals)
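One thing to watch for with either version: the final division raises ZeroDivisionError if the list contains no terminator at all. A minimal sketch of a wrapper that guards against that is below. The function name `average_words_per_sentence` is hypothetical, and returning 0.0 for input with no terminators is an assumption made here, not something the question specifies.

```python
def average_words_per_sentence(tokens, terminals=frozenset({".", "?", "!"})):
    """Average number of non-terminator tokens per terminator."""
    terminal_count = sum(1 for item in tokens if item in terminals)
    if terminal_count == 0:
        # Assumption: report 0.0 instead of raising when no sentence ends.
        return 0.0
    return (len(tokens) - terminal_count) / float(terminal_count)


mylist = ["Hey", "!", "How", "are", "you", "?", "I", "would", "like", "a", "sandwich", "."]
avg = average_words_per_sentence(mylist)
```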
