Find index of item with duplicates
Question
I have a list that contains many duplicates. How can I find the indexes of all the duplicates in the list? Basically, I search for a data item and, if it has duplicates, print the indexes where the item is found, including where the duplicates are.
Recommended answer
If the items in the list are hashable, you could use them as keys in a dict:
import collections

somelist = list('ABRACADABRA')
# Map each item to the list of indexes where it occurs
dups = collections.defaultdict(list)
for index, item in enumerate(somelist):
    dups[item].append(index)
print(dups)
yields
defaultdict(<class 'list'>, {'A': [0, 3, 5, 7, 10], 'B': [1, 8], 'R': [2, 9], 'C': [4], 'D': [6]})
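To answer the question as asked (look up one item and report its indexes only if it occurs more than once), the mapping built above could be wrapped in a small helper. This is a minimal sketch; the function names `index_map` and `duplicate_indexes` are hypothetical and not part of the original answer.

```python
import collections


def index_map(items):
    """Map each hashable item to the list of indexes where it occurs."""
    positions = collections.defaultdict(list)
    for index, item in enumerate(items):
        positions[item].append(index)
    return positions


def duplicate_indexes(items, target):
    """Return every index of target if it occurs more than once, else []."""
    # Use .get so that looking up a missing item does not add an empty entry.
    found = index_map(items).get(target, [])
    return found if len(found) > 1 else []


somelist = list('ABRACADABRA')
print(duplicate_indexes(somelist, 'A'))  # [0, 3, 5, 7, 10]
print(duplicate_indexes(somelist, 'C'))  # []
```

If many lookups are needed against the same list, it is probably better to call `index_map` once and reuse the result rather than rebuilding it per query.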
If the items are not hashable (such as a list), then the next best solution is to define a key function (if possible) that maps each item to a hashable object (such as a tuple), so that equal items get the same key and distinct items get different keys:
def key(item):
    # Placeholder: return a hashable object that represents item
    return something_hashable

for index, item in enumerate(somelist):
    dups[key(item)].append(index)
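As one illustration, if the items are flat lists of hashable values, `tuple` can serve as the key function. This is a sketch under that assumption; the sample data and the name `rows` are made up for the example.

```python
import collections

# Hypothetical sample data: each item is a list, so it cannot be a dict key.
rows = [[1, 2], [3, 4], [1, 2], [5], [3, 4], [1, 2]]


def key(item):
    # Assumes item is a flat list of hashable values; nested lists would
    # need a recursive conversion instead.
    return tuple(item)


dups = collections.defaultdict(list)
for index, item in enumerate(rows):
    dups[key(item)].append(index)

print(dict(dups))  # {(1, 2): [0, 2, 5], (3, 4): [1, 4], (5,): [3]}
```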
If no such key function can be found, you'd have to store the seen items in a list and test for duplicates by comparing each new item for equality with every item already seen. This is O(n**2).
# Don't use this unless somelist contains unhashable items
import collections

somelist = list('ABRACADABRA')
seen = []  # distinct items, in order of first appearance
dups = collections.defaultdict(list)  # position in seen -> indexes of later repeats
for i, item in enumerate(somelist):
    for j, orig in enumerate(seen):
        if item == orig:
            dups[j].append(i)
            break
    else:
        # The inner loop found no equal item, so this is a first occurrence
        seen.append(item)
print([(seen[key], val) for key, val in dups.items()])
yields
[('A', [3, 5, 7, 10]), ('B', [8]), ('R', [9])]
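Note that the result above lists only the later occurrences of each item: the first occurrence (index 0 for 'A') is not included. If the first index is wanted too, one possible variant keeps a list of (item, indexes) pairs. This is a sketch that is still O(n**2); the name `index_groups` and the sample data are hypothetical.

```python
def index_groups(items):
    """Group indexes of equal items using only ==, so items need not be hashable."""
    groups = []  # list of (item, [indexes]) pairs, in order of first appearance
    for i, item in enumerate(items):
        for orig, indexes in groups:
            if item == orig:
                indexes.append(i)
                break
        else:
            # No equal item seen so far: start a new group.
            groups.append((item, [i]))
    return groups


somelist = [['a'], ['b'], ['a'], ['c'], ['b'], ['a']]
duplicated = [(item, idx) for item, idx in index_groups(somelist) if len(idx) > 1]
print(duplicated)  # [(['a'], [0, 2, 5]), (['b'], [1, 4])]
```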