Python: How to get items that appear in only one set of a list of sets?

Problem description

I want to create a function that takes a list of one or more sets and finds the symmetric difference of all of the sets in the list, i.e. the result should be a set of values, each of which is contained in only one of the individual sets. (Please correct me if I'm wrong about this being the symmetric difference.)

For example:

>>> s1 = set([1, 2, 3])
>>> s2 = set([2, 3, 4])
>>> s3 = set([2, 3, 7])
>>> s4 = set([2, 5, 9])
>>> myfunc([s1, s2, s3, s4])
{1, 4, 5, 7, 9}

Is there something built in that could be used above in place of myfunc? Or do I use something like this:

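from typing import List

def myfunc(sets: List[set]) -> set:
    sd = set()       # items seen in exactly one set so far
    goners = set()   # items seen in more than one set
    for s in sets:
        still_ok = s - goners                      # ignore items already ruled out
        sd = sd.symmetric_difference(still_ok)     # add first-time items, drop repeats
        goners = goners.union(s.difference(sd))    # record items of s that are no longer candidates
    return sd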

Is there a better/more efficient/"Pythonic" way to do this?
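On the terminology point raised in the question: chaining the symmetric difference over several sets keeps every item that occurs in an odd number of them, which is not the same as "occurs in exactly one". A minimal sketch using the question's four sets (the `reduce` fold is added here for illustration and is not part of the original post):

```python
from functools import reduce
from operator import xor

s1 = {1, 2, 3}
s2 = {2, 3, 4}
s3 = {2, 3, 7}
s4 = {2, 5, 9}

# Fold ^ across all sets: s1 ^ s2 ^ s3 ^ s4
folded = reduce(xor, [s1, s2, s3, s4], set())

# 3 occurs in three sets (an odd count), so it survives the fold;
# 2 occurs in four sets (an even count), so it cancels out.
assert folded == {1, 3, 4, 5, 7, 9}

# The result the question asks for excludes 3 as well.
wanted = {1, 4, 5, 7, 9}
assert folded != wanted
```

So a plain fold with `^` is not a drop-in replacement for `myfunc`; the extra bookkeeping with `goners` is what removes items seen three or more times.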

Solution

For operations on built-in Python objects that are available both as operators and as methods, the operator form is generally faster than the method form, because a method call adds the overhead of an attribute lookup and an explicit function call. In addition, updating a set in place avoids creating extra copies of the data, which makes the program more efficient.
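As a small illustration of the two points above, the sketch below checks that the operator and method forms give the same result, and that an augmented assignment such as `^=` mutates the existing set rather than building a new one. The variable names are made up for this example.

```python
a = {1, 2, 3}
b = {2, 3, 4}

# Operator and method forms compute the same new set.
assert a ^ b == a.symmetric_difference(b) == {1, 4}
assert a | b == a.union(b) == {1, 2, 3, 4}
assert a - b == a.difference(b) == {1}

# Augmented assignment updates the set object in place:
# `alias` still refers to the same object and sees the change.
alias = a
a ^= b
same_object = alias is a
assert same_object
assert alias == {1, 4}
```

One difference to keep in mind: the methods accept any iterable as an argument, while the operators require both operands to be sets.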

An improved version of your approach using set operators looks like this:

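def myfunc_improved(sets: List[set]) -> set:
    sd = set()       # items seen in exactly one set so far
    goners = set()   # items seen in more than one set
    for s in sets:
        sd ^= s - goners   # in place: add first-time items, drop repeats
        goners |= s - sd   # in place: record items of s that are no longer candidates
    return sd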
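One way to gain confidence in the loop above is to compare it with an independent formulation. The sketch below uses a different technique, counting occurrences with `collections.Counter`; the helper name `unique_by_count` is hypothetical and does not come from the answer. It relies on each input being a set, so an item's count equals the number of sets containing it.

```python
import random
from collections import Counter
from itertools import chain
from typing import Iterable, List, Set


def myfunc_improved(sets: List[set]) -> set:
    # Copied from the answer above so this block runs on its own.
    sd = set()
    goners = set()
    for s in sets:
        sd ^= s - goners
        goners |= s - sd
    return sd


def unique_by_count(sets: Iterable[Set]) -> set:
    # Hypothetical cross-check helper: keep items counted exactly once.
    counts = Counter(chain.from_iterable(sets))
    return {item for item, n in counts.items() if n == 1}


example = [{1, 2, 3}, {2, 3, 4}, {2, 3, 7}, {2, 5, 9}]
assert myfunc_improved(example) == unique_by_count(example) == {1, 4, 5, 7, 9}

# Randomized agreement check on small inputs (seeded for repeatability).
rng = random.Random(0)
for _ in range(200):
    trial = [
        set(rng.sample(range(10), rng.randint(0, 6)))
        for _ in range(rng.randint(1, 5))
    ]
    assert myfunc_improved(trial) == unique_by_count(trial)
```

No claim is made here about which version is faster; the counting version is included only as a readable reference to test against.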

Performance measurements:

%timeit myfunc(sets)
%timeit myfunc_improved(sets)

3.19 µs ± 34.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
1.75 µs ± 11.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
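`%timeit` is an IPython magic, so the measurement above cannot be run as a plain script. The sketch below is one way to repeat the comparison with the standard `timeit` module. It assumes `sets` is the four-set example from the question, which the original post does not state, and it reports the best average per call rather than the mean and standard deviation that `%timeit` prints. Absolute numbers will vary by machine and Python version.

```python
import timeit
from typing import List


def myfunc(sets: List[set]) -> set:
    sd = set()
    goners = set()
    for s in sets:
        still_ok = s - goners
        sd = sd.symmetric_difference(still_ok)
        goners = goners.union(s.difference(sd))
    return sd


def myfunc_improved(sets: List[set]) -> set:
    sd = set()
    goners = set()
    for s in sets:
        sd ^= s - goners
        goners |= s - sd
    return sd


# Assumption: the benchmark input is the example from the question.
sets = [{1, 2, 3}, {2, 3, 4}, {2, 3, 7}, {2, 5, 9}]


def best_time_per_call(func, number: int = 10_000, repeat: int = 5) -> float:
    # Run `number` calls per trial, `repeat` trials, and return the
    # smallest average time per call in seconds. The lambda wrapper adds
    # a small constant overhead to both measurements.
    totals = timeit.repeat(lambda: func(sets), number=number, repeat=repeat)
    return min(totals) / number


t_original = best_time_per_call(myfunc)
t_improved = best_time_per_call(myfunc_improved)
print(f"myfunc:          {t_original * 1e6:.2f} µs per call")
print(f"myfunc_improved: {t_improved * 1e6:.2f} µs per call")
```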
