Count occurrences of all items of a list in a tuple


Question

I have a tuple (1,5,2,3,4,5,6,7,3,2,2,4,3) and a list [1,2,3], and I want to count how often the items of the list occur in the tuple (so the result should be 7).

I could loop over the list, count each item in the tuple, and then sum the results, but I bet there is a better way in Python.

This is not a duplicate of "How to count the occurrences of a list item?", because I explicitly said that I am not just asking for list.count(item_of_list) (which would have to be called in a loop) but for a better method.
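For reference, the loop-based baseline that the question describes could be sketched as follows. The function name count_with_loop is made up for illustration and does not appear in the original post.

```python
def count_with_loop(items, tup):
    """Sum tup.count(item) over every item in the list.

    Each tup.count call scans the whole tuple, so the total work is
    proportional to len(items) * len(tup).
    """
    return sum(tup.count(item) for item in items)


t = (1, 5, 2, 3, 4, 5, 6, 7, 3, 2, 2, 4, 3)
a = [1, 2, 3]

result = count_with_loop(a, t)
print(result)  # 7
```

This is the approach the asker wants to improve on. It is simple, but it rescans the tuple once per list item.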

Recommended answer

Since the question is tagged NumPy, here's a NumPy solution:

In [846]: import numpy as np

In [847]: t = (1,5,2,3,4,5,6,7,3,2,2,4,3)

In [848]: a = [1,2,3]

In [849]: np.in1d(t,a).sum()
Out[849]: 7

# Alternatively with np.count_nonzero for summing booleans
In [850]: np.count_nonzero(np.in1d(t,a))
Out[850]: 7
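A note on newer NumPy releases: to the best of my knowledge, np.in1d is deprecated as of NumPy 2.0 in favor of np.isin, which performs the same membership test. A minimal sketch of the same count using np.isin, with the same t and a as above:

```python
import numpy as np

t = (1, 5, 2, 3, 4, 5, 6, 7, 3, 2, 2, 4, 3)
a = [1, 2, 3]

# Boolean mask: True wherever an element of t is also present in a
mask = np.isin(t, a)

# Counting the True entries gives the number of matching elements
count = int(np.count_nonzero(mask))
print(count)  # 7
```

The mask has one entry per tuple element, so this counts matches in the tuple rather than distinct matching values.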

Another NumPy approach uses np.bincount, and applies only when the inputs contain non-negative integers. It uses the numbers themselves as bins, counts the occurrences per bin, indexes into those counts with the list elements, and sums them for the final output:

In [856]: np.bincount(t)[a].sum()
Out[856]: 7
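To make the intermediate steps of the np.bincount one-liner visible, here is a step-by-step sketch on the same inputs. The minlength guard at the end is my own addition, not part of the original answer. It assumes you want list values larger than anything in the tuple to count as zero instead of raising an IndexError.

```python
import numpy as np

t = (1, 5, 2, 3, 4, 5, 6, 7, 3, 2, 2, 4, 3)
a = [1, 2, 3]

# counts[k] is how many times the integer k occurs in t
counts = np.bincount(t)
print(counts)  # [0 1 3 3 2 2 1 1]

# Index the counts with the list elements, then add them up
per_item = counts[a]
total = int(per_item.sum())
print(per_item, total)  # [1 3 3] 7

# Added guard (assumption): pad the counts so that every value in a
# is a valid index, even if it never appears in t
safe_counts = np.bincount(t, minlength=max(a) + 1)
safe_total = int(safe_counts[a].sum())
```

Note that np.bincount allocates one bin per integer up to the maximum value, so it suits small, dense integer ranges better than sparse or very large values.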

Other approaches:

from collections import Counter
# @Brad Solomon's soln
def collections_counter(tgt, tup):
    counts = Counter(tup)
    return sum(counts[t] for t in tgt)

# @timgeb's soln
def set_sum(l, t):
    l = set(l)
    return sum(1 for x in t if x in l)

# @Amit Tripathi's soln
def dict_sum(l, t):
    dct = {}
    for i in t:
        if not dct.get(i):
            dct[i] = 0
        dct[i] += 1
    return sum(dct.get(i, 0) for i in l)

Runtime tests

Case #1: Timings on a tuple with 10,000 elements and a list of 100 unique random elements:

In [905]: a = np.random.choice(1000, 100, replace=False).tolist()

In [906]: t = tuple(np.random.randint(1,1000,(10000)))

In [907]: %timeit dict_sum(a, t)
     ...: %timeit set_sum(a, t)
     ...: %timeit collections_counter(a, t)
     ...: %timeit np.in1d(t,a).sum()
     ...: %timeit np.bincount(t)[a].sum()
100 loops, best of 3: 2 ms per loop
1000 loops, best of 3: 437 µs per loop
100 loops, best of 3: 2.44 ms per loop
1000 loops, best of 3: 1.18 ms per loop
1000 loops, best of 3: 503 µs per loop

set_sum from @timgeb's solution looks quite efficient for such inputs: at 437 µs it is the fastest of the five, with np.bincount close behind at 503 µs.

Case #2: Timings on a tuple with 100,000 elements that has 10,000 unique values, and a list of 1,000 unique random elements:

In [916]: t = tuple(np.random.randint(0,10000,(100000)))

In [917]: a = np.random.choice(10000, 1000, replace=False).tolist()

In [918]: %timeit dict_sum(a, t)
     ...: %timeit set_sum(a, t)
     ...: %timeit collections_counter(a, t)
     ...: %timeit np.in1d(t,a).sum()
     ...: %timeit np.bincount(t)[a].sum()
10 loops, best of 3: 21.1 ms per loop
100 loops, best of 3: 5.33 ms per loop
10 loops, best of 3: 24.2 ms per loop
100 loops, best of 3: 13.4 ms per loop
100 loops, best of 3: 5.05 ms per loop
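The timings above come from IPython's %timeit. Outside IPython, a comparable measurement can be sketched with the standard timeit module. The setup below mirrors Case #2 but uses the random module instead of NumPy. The seed and repeat counts are arbitrary choices of mine, and the absolute numbers will differ from those quoted above depending on machine and Python version.

```python
import random
import timeit
from collections import Counter


def set_sum(l, t):
    l = set(l)
    return sum(1 for x in t if x in l)


def collections_counter(tgt, tup):
    counts = Counter(tup)
    return sum(counts[x] for x in tgt)


random.seed(0)  # arbitrary seed, only for repeatability
t = tuple(random.randrange(10000) for _ in range(100000))
a = random.sample(range(10000), 1000)

# Both approaches must agree before their timings are worth comparing
assert set_sum(a, t) == collections_counter(a, t)

for func in (set_sum, collections_counter):
    # Best of 3 repeats of 10 calls each, reported per call
    best = min(timeit.repeat(lambda: func(a, t), number=10, repeat=3)) / 10
    print(f"{func.__name__}: {best * 1e3:.2f} ms per call")
```

Taking the minimum over repeats follows the same "best of 3" convention that the %timeit output above reports.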
