Find index of item with duplicates


Question

I have a list that contains many duplicates. How can I find the indexes of all the duplicates in the list? Basically, I search for a data item and, if it has duplicates, print the indexes where the item is found, including the positions of the duplicates.

Recommended answer

If the items in the list are hashable, you can use them as keys in a dict that maps each item to the list of indexes where it occurs:

import collections

somelist = list('ABRACADABRA')
dups = collections.defaultdict(list)
for index, item in enumerate(somelist):
    dups[item].append(index)
print(dups)

yields

defaultdict(<class 'list'>, {'A': [0, 3, 5, 7, 10], 'B': [1, 8], 'R': [2, 9], 'C': [4], 'D': [6]})
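The question asks only for items that actually repeat, so the dict above can be filtered down to entries with more than one index. The following is a minimal sketch of that idea; the helper names `index_map` and `duplicate_indexes` are hypothetical and not part of the original answer.

```python
import collections

def index_map(items):
    """Map each hashable item to the list of positions where it occurs."""
    positions = collections.defaultdict(list)
    for index, item in enumerate(items):
        positions[item].append(index)
    return positions

def duplicate_indexes(items):
    """Keep only the items that occur more than once (hypothetical helper)."""
    return {item: idxs for item, idxs in index_map(items).items() if len(idxs) > 1}

somelist = list('ABRACADABRA')
dup_only = duplicate_indexes(somelist)
print(dup_only)           # 'C' and 'D' are dropped because they occur once
print(dup_only.get('A'))  # every position of 'A', first occurrence included
```

Looking up a single item is then just `dup_only.get(item)`, which returns `None` when the item has no duplicates.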


If the items are not hashable (such as lists), the next best solution is to define a key function (if possible) that maps each item to a hashable object (such as a tuple), so that equal items get the same key and unequal items get different keys:

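def key(item):
    # Placeholder: return any hashable value derived from item,
    # chosen so that equal items produce equal keys.
    return something_hashable

for index, item in enumerate(somelist):
    dups[key(item)].append(index)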
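As one concrete illustration of such a key function, here is a sketch that assumes every item is a flat list of hashable values, so `tuple(item)` can stand in for it as a dict key. The sample data is made up for the example.

```python
import collections

def key(item):
    # Assumption: each item is a flat list of hashable values,
    # so converting it to a tuple gives an equivalent hashable key.
    return tuple(item)

somelist = [[1, 2], [3], [1, 2], [4, 5], [3], [1, 2]]
dups = collections.defaultdict(list)
for index, item in enumerate(somelist):
    dups[key(item)].append(index)

result = dict(dups)
print(result)
```

For nested or otherwise more complex items, the key function would need to convert the inner values as well; whether that is possible depends on the data.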


If no such key function can be found, you have to store the seen items in a list and detect duplicates by testing each new item for equality against every item in that list. This takes O(n**2) comparisons in the worst case.

# Don't use this unless somelist contains unhashable items
import collections
somelist = list('ABRACADABRA')
seen = []
dups = collections.defaultdict(list)
for i, item in enumerate(somelist):
    for j, orig in enumerate(seen):
        if item == orig:
            dups[j].append(i)
            break
    else:
        seen.append(item)
# dups maps a position in seen to the indexes of the later repeats
print([(seen[j], idxs) for j, idxs in dups.items()])

yields (each list holds only the repeat positions; the first occurrence of each item is not included)

[('A', [3, 5, 7, 10]), ('B', [8]), ('R', [9])]
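Since the question asks for every position of a duplicated item, first occurrence included, the quadratic approach can be adapted to record that first index as well. This is a sketch under that assumption, not the original answer's code; the function name `index_groups` and the sample data are hypothetical.

```python
def index_groups(items):
    """Group positions of equal items using only ==; O(n**2) comparisons."""
    seen = []    # one representative per distinct item
    groups = []  # groups[j] holds every position of seen[j]
    for i, item in enumerate(items):
        for j, orig in enumerate(seen):
            if item == orig:
                groups[j].append(i)
                break
        else:
            # No equal item seen so far: start a new group at position i
            seen.append(item)
            groups.append([i])
    # Report only the items that occur more than once
    return [(rep, idxs) for rep, idxs in zip(seen, groups) if len(idxs) > 1]

# Sets and lists are unhashable, so the dict-based approach does not apply
somelist = [{'a'}, [1], {'a'}, [2], [1], {'a'}]
pairs = index_groups(somelist)
print(pairs)
```

As with the version above, this only makes sense when the items are unhashable and no key function is available.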
