Better to add item to a set, or convert final list to a set?

Question

I have some data that looks something like this:

ID1 ID2 ID3  
ID1 ID4 ID5  
ID3 ID5 ID7 ID6  
...  
...  

where each line is a group.

My goal is to build a dictionary that maps each ID to the set of other IDs that share at least one group with it.

For example, this data would return {ID1: [ID2, ID3, ID4, ID5], ID2: [ID1, ID3], ...}

I can think of 3 options for this, and I'm wondering which is (generally) best:

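  1. Check whether an ID is already in the list before adding it
  2. Create sets instead of lists, and add each ID to the set
  3. Add all IDs to a list, then convert all of the lists to sets at the end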

Recommended answer

TL;DR: Go with option 2. Just use sets from the start.
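
As a minimal sketch of option 2, assuming the rows have already been parsed into lists of ID strings: the names `groups` and `build_neighbors` are illustrative only and do not come from the original post.

```python
from collections import defaultdict

# Sample rows from the question; each row is one group of IDs.
groups = [
    ["ID1", "ID2", "ID3"],
    ["ID1", "ID4", "ID5"],
    ["ID3", "ID5", "ID7", "ID6"],
]

def build_neighbors(groups):
    """Map each ID to the set of other IDs that share a group with it."""
    neighbors = defaultdict(set)
    for group in groups:
        for member in group:
            # update() adds every ID in the group; repeats are ignored by the set.
            neighbors[member].update(group)
    # An ID should not be listed as sharing a group with itself.
    for member, others in neighbors.items():
        others.discard(member)
    return dict(neighbors)

neighbors = build_neighbors(groups)
print(sorted(neighbors["ID1"]))  # ['ID2', 'ID3', 'ID4', 'ID5']
print(sorted(neighbors["ID2"]))  # ['ID1', 'ID3']
```

Using `defaultdict(set)` avoids a separate "is this key present yet?" branch, and the set handles duplicates without any explicit check.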

In Python, sets are hash sets and lists are dynamic arrays. Appending to a list is amortized O(1) and adding to a set is O(1) on average, but checking whether an element exists is O(n) for a list and O(1) on average for a set.

So option 1 is immediately out. If you are inserting n items and need to check the list every time, then the overall complexity becomes O(n^2).
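
To make the cost concrete, here is a rough sketch of what option 1 looks like for a single ID's list. The helper name `add_unique` is hypothetical. Each `not in` test is a linear scan, so n insertions perform up to 0 + 1 + ... + (n - 1) comparisons, which grows as n^2.

```python
def add_unique(ids, new_id):
    """Option 1: append only if the ID is not already in the list."""
    # `in` on a list walks the elements one by one: O(len(ids)).
    if new_id not in ids:
        ids.append(new_id)

shared = []
for new_id in ["ID2", "ID3", "ID4", "ID5", "ID3", "ID2"]:
    add_unique(shared, new_id)

print(shared)  # ['ID2', 'ID3', 'ID4', 'ID5']
```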

Options 2 and 3 are both optimal at O(n) overall. Option 2 might be faster in micro-benchmarks because you don't need to move objects between collections. In practice, choose the option that is easier to read and maintain in your specific circumstance.
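
For comparison, the following sketch builds the mapping both ways and checks that the results match. It assumes the same parsed sample rows as in the question, and the variable names (`as_lists`, `option2`, `option3`) are illustrative only.

```python
from collections import defaultdict

groups = [
    ["ID1", "ID2", "ID3"],
    ["ID1", "ID4", "ID5"],
    ["ID3", "ID5", "ID7", "ID6"],
]

# Option 3: collect into lists (duplicates allowed), convert to sets at the end.
as_lists = defaultdict(list)
for group in groups:
    for member in group:
        as_lists[member].extend(other for other in group if other != member)
option3 = {member: set(others) for member, others in as_lists.items()}

# Option 2: collect straight into sets.
option2 = defaultdict(set)
for group in groups:
    for member in group:
        option2[member].update(other for other in group if other != member)

# Both approaches produce the same mapping.
print(option3 == option2)  # True
```

Option 3 temporarily holds every duplicate in memory before the final conversion, which is one reason option 2 may come out ahead when IDs repeat across many groups.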
