Unable to remove duplicate dicts in list using list comprehension or frozenset


Problem description

I would like to remove duplicate dicts from a list.

Specifically, if two dicts have the same content under the key paper_title, keep one and remove the other as a duplicate.

For example, given the list below:

test_list = [{"paper_title": 'This is duplicate', 'Paper_year': 2},
             {"paper_title": 'This is duplicate', 'Paper_year': 3},
             {"paper_title": 'Unique One', 'Paper_year': 3},
             {"paper_title": 'Unique two', 'Paper_year': 3}]

it should return:

return_value = [{"paper_title": 'This is duplicate', 'Paper_year': 2},
                {"paper_title": 'Unique One', 'Paper_year': 3},
                {"paper_title": 'Unique two', 'Paper_year': 3}]

According to the tutorial, this can be achieved using a list comprehension or frozenset, like this:

test_list = [{"paper_title": 'This is duplicate', 'Paper_year': 2},
             {"paper_title": 'This is duplicate', 'Paper_year': 3},
             {"paper_title": 'Unique One', 'Paper_year': 3},
             {"paper_title": 'Unique two', 'Paper_year': 3}]

return_value = [i for n, i in enumerate(test_list) if i not in test_list[n + 1:]]

However, it returns the list with the duplicate still in it:

return_value = [{"paper_title": 'This is duplicate', 'Paper_year': 2},
                {"paper_title": 'This is duplicate', 'Paper_year': 3},
                {"paper_title": 'Unique One', 'Paper_year': 3},
                {"paper_title": 'Unique two', 'Paper_year': 3}]

May I know which part of the code I should change?

Also, is there a faster way to achieve a similar result?

Recommended answer

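This is because your sample dicts are all strictly different. The test i not in test_list[n + 1:] compares whole dicts, and the first two dicts differ in Paper_year, so neither counts as a duplicate of the other. If you change Paper_year to the same value, it works as expected: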

test_list = [{"paper_title": 'This is duplicate', 'Paper_year': 3},  # Change 2 to 3
             {"paper_title": 'This is duplicate', 'Paper_year': 3},
             {"paper_title": 'Unique One', 'Paper_year': 3},
             {"paper_title": 'Unique two', 'Paper_year': 3}]

[i for n, i in enumerate(test_list) if i not in test_list[n + 1:]]
#[{'Paper_year': 3, 'paper_title': 'This is duplicate'},
# {'Paper_year': 3, 'paper_title': 'Unique One'},
# {'Paper_year': 3, 'paper_title': 'Unique two'}]
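
As for which part of the comprehension to change: one possible adjustment, sketched here rather than taken from the original answer, is to compare only the paper_title values instead of whole dicts. Looking at the earlier items (test_list[:n]) rather than the later ones keeps the first occurrence, which matches the expected output in the question. The two assertions at the top only illustrate why whole-dict comparison and frozenset both treat the first two dicts as distinct.

```python
test_list = [
    {"paper_title": "This is duplicate", "Paper_year": 2},
    {"paper_title": "This is duplicate", "Paper_year": 3},
    {"paper_title": "Unique One", "Paper_year": 3},
    {"paper_title": "Unique two", "Paper_year": 3},
]

# Whole-dict comparison sees the first two dicts as different (Paper_year differs),
# and so does a frozenset built from each dict's items.
assert test_list[0] != test_list[1]
assert len({frozenset(d.items()) for d in test_list}) == 4

# Compare titles only, and check the items *before* position n
# so that the first dict with a given title is the one that is kept.
return_value = [
    d for n, d in enumerate(test_list)
    if d["paper_title"] not in [e["paper_title"] for e in test_list[:n]]
]
print(return_value)
```

This still rescans the earlier part of the list for every item, so it is quadratic in the list length; it is fine for short lists but not a fast option.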

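One way to get the expected output is itertools.groupby. The list is sorted by paper_title first because groupby only groups adjacent items; sorted is stable, so the first dict of each title in the original list is the one that is kept. Note that the result comes back ordered by title: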

from itertools import groupby

f = lambda x: x["paper_title"]
[next(g) for k, g in groupby(sorted(test_list, key=f),key=f)]

Output:

[{'Paper_year': 2, 'paper_title': 'This is duplicate'},
 {'Paper_year': 3, 'paper_title': 'Unique One'},
 {'Paper_year': 3, 'paper_title': 'Unique two'}]
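
On the question of a faster way: a common alternative, shown here as a minimal sketch and not part of the original answer, is a single pass that remembers the titles already seen in a set. It avoids the sort, keeps the original order of the list, and keeps the first dict for each title. The helper name dedupe_by_key is made up for this illustration, and the sketch assumes every dict has the key and that the values under it are hashable (strings are).

```python
def dedupe_by_key(dicts, key):
    """Return the dicts in order, skipping any whose value under `key` was already seen."""
    seen = set()    # values of `key` encountered so far
    unique = []     # first dict for each distinct value, in original order
    for d in dicts:
        value = d[key]
        if value not in seen:
            seen.add(value)
            unique.append(d)
    return unique


test_list = [
    {"paper_title": "This is duplicate", "Paper_year": 2},
    {"paper_title": "This is duplicate", "Paper_year": 3},
    {"paper_title": "Unique One", "Paper_year": 3},
    {"paper_title": "Unique two", "Paper_year": 3},
]

return_value = dedupe_by_key(test_list, "paper_title")
print(return_value)
```

Set membership tests take constant time on average, so this runs in roughly linear time, compared with the sort needed before groupby.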
