How to return the count of words from a list of words that appear in a list of lists?


Problem description

I have a very large list of strings like this:

list_strings = ['storm', 'squall', 'overcloud',...,'cloud_up', 'cloud_over', 'plague', 'blight', 'fog_up', 'haze']

And a large list of lists like this:

lis_of_lis = [['the storm was good blight'],['this is overcloud stormicide'],...,['there was a plague']]

How can I return a list that holds, for each sub-list of lis_of_lis, the count of words from list_strings that appear in it? For the example above, the desired output is [2,1,1].

For example:

['storm', 'squall', 'overcloud',...,'cloud_up', 'cloud_over', 'plague', 'blight', 'fog_up', 'haze']

['the storm was good blight']

The count is 2, since storm and blight appear in the first sublist of lis_of_lis.

['storm', 'squall', 'overcloud',...,'cloud_up', 'cloud_over', 'plague', 'blight', 'fog_up', 'haze']

['this is overcloud stormicide']

The count is 1, since overcloud appears in the second sublist of lis_of_lis. stormicide is not counted, because it does not appear in the first list (list_strings).

['storm', 'squall', 'overcloud',...,'cloud_up', 'cloud_over', 'plague', 'blight', 'fog_up', 'haze']

['there was a plague']

The count is 1, since plague appears in the third sublist of lis_of_lis.

Hence the desired output is [2,1,1].

The problem with all the answers so far is that they count substrings inside words rather than whole words.

Recommended answer

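result = []
for sentence in lis_of_lis:
    result.append(0)  # start a new counter for this sentence
    for word in list_strings:
        # each sub-list holds one string; `in` performs a substring test
        if word in sentence[0]:
            result[-1] += 1
print(result)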

Or a shorter version:

result = [sum(1 for word in list_strings if word in sentence[0]) for sentence in lis_of_lis]

This will return [2,2,1] for your example.
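The extra count for the second sentence comes from Python's `in` operator, which tests for substrings when both operands are strings. A minimal sketch of that behaviour, using shortened versions of the lists from the question (the elided `...` entries are simply left out here, which is an assumption made for illustration):

```python
# Shortened sample data; the "..." entries from the question are omitted.
list_strings = ['storm', 'squall', 'overcloud', 'plague', 'blight']
lis_of_lis = [['the storm was good blight'],
              ['this is overcloud stormicide'],
              ['there was a plague']]

# `in` on two strings is a substring test, so 'storm' matches inside 'stormicide'.
print('storm' in 'this is overcloud stormicide')  # True

substring_counts = [sum(1 for word in list_strings if word in sentence[0])
                    for sentence in lis_of_lis]
print(substring_counts)  # [2, 2, 1]
```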

If you want only whole words, add spaces before and after the words / sentences:

result = []
for sentence in lis_of_lis:
    result.append(0)
    for word in list_strings:
        # pad both sides with spaces so that only whole words match
        if ' ' + word + ' ' in ' ' + sentence[0] + ' ':
            result[-1] += 1
print(result)

Or a shorter version:

result = [sum(1 for word in list_strings if ' '+word+' ' in ' '+sentence[0]+' ')  for sentence in lis_of_lis]

This will return [2,1,1] for your example.
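Since list_strings is described as very large, the space-padding check still tests every word against every sentence. One possible alternative, sketched here as an assumption and not part of the original answer, is to split each sentence on whitespace and intersect the tokens with a set built from list_strings. The helper name `count_whole_words` is hypothetical, the sample data is shortened, and the sketch assumes list_strings contains no duplicates and no entries with spaces; like the loops above, it does not strip punctuation.

```python
def count_whole_words(words, sentences):
    """Count, per sentence, how many distinct entries of `words` occur as whole tokens."""
    word_set = set(words)  # average O(1) membership instead of scanning the list
    # Each sub-list holds a single sentence string at index 0, as in the question.
    return [len(word_set & set(sentence[0].split())) for sentence in sentences]


# Shortened sample data; the "..." entries from the question are omitted.
list_strings = ['storm', 'squall', 'overcloud', 'plague', 'blight']
lis_of_lis = [['the storm was good blight'],
              ['this is overcloud stormicide'],
              ['there was a plague']]

result = count_whole_words(list_strings, lis_of_lis)
print(result)  # [2, 1, 1]
```

With this approach the work per sentence depends on the number of tokens in the sentence rather than on the length of list_strings.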
