Unable to remove duplicate dicts in list using list comprehension or frozenset
Problem description
I would like to remove duplicate dicts from a list.
Specifically, if two dicts have the same value under the key paper_title, keep one and remove the other.
For example, given the list below:
test_list = [{"paper_title": 'This is duplicate', 'Paper_year': 2}, \
{"paper_title": 'This is duplicate', 'Paper_year': 3}, \
{"paper_title": 'Unique One', 'Paper_year': 3}, \
{"paper_title": 'Unique two', 'Paper_year': 3}]
it should return:
return_value = [{"paper_title": 'This is duplicate', 'Paper_year': 2}, \
{"paper_title": 'Unique One', 'Paper_year': 3}, \
{"paper_title": 'Unique two', 'Paper_year': 3}]
According to the tutorial, this can be achieved using a list comprehension or frozenset, like this:
test_list = [{"paper_title": 'This is duplicate', 'Paper_year': 2}, \
{"paper_title": 'This is duplicate', 'Paper_year': 3}, \
{"paper_title": 'Unique One', 'Paper_year': 3}, \
{"paper_title": 'Unique two', 'Paper_year': 3}]
return_value = [i for n, i in enumerate(test_list) if i not in test_list[n + 1:]]
However, it does not remove the duplicate; it returns:
return_value = [{"paper_title": 'This is duplicate', 'Paper_year': 2}, \
{"paper_title": 'This is duplicate', 'Paper_year': 3}, \
{"paper_title": 'Unique One', 'Paper_year': 3}, \
{"paper_title": 'Unique two', 'Paper_year': 3}]
May I know which part of the code I should change?
Also, is there a faster way to achieve a similar result?
Recommended answer
This is because your sample dicts are all strictly different: the two entries titled 'This is duplicate' differ in Paper_year, so they do not compare equal. If you change Paper_year to the same value, it works as expected:
test_list = [{"paper_title": 'This is duplicate', 'Paper_year': 3},  # changed 2 to 3
{"paper_title": 'This is duplicate', 'Paper_year': 3}, \
{"paper_title": 'Unique One', 'Paper_year': 3}, \
{"paper_title": 'Unique two', 'Paper_year': 3}]
[i for n, i in enumerate(test_list) if i not in test_list[n + 1:]]
#[{'Paper_year': 3, 'paper_title': 'This is duplicate'},
# {'Paper_year': 3, 'paper_title': 'Unique One'},
# {'Paper_year': 3, 'paper_title': 'Unique two'}]
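The `in` test on a list compares whole dicts with `==`, which is why a differing Paper_year is enough to make two entries distinct. As a minimal sketch, assuming only paper_title should decide whether an entry is a duplicate, the same comprehension can compare titles instead of whole dicts (the helper list `titles` is a name introduced here for illustration):

```python
test_list = [{"paper_title": 'This is duplicate', 'Paper_year': 2},
             {"paper_title": 'This is duplicate', 'Paper_year': 3},
             {"paper_title": 'Unique One', 'Paper_year': 3},
             {"paper_title": 'Unique two', 'Paper_year': 3}]

# Whole-dict comparison: these two are not equal because Paper_year differs
same_dict = test_list[0] == test_list[1]

# Compare titles only; keep a dict if its title has not appeared earlier
titles = [d["paper_title"] for d in test_list]
return_value = [d for n, d in enumerate(test_list)
                if d["paper_title"] not in titles[:n]]
```

This variant keeps the first occurrence of each title, but it still rescans a slice of the list for every element, so it is quadratic in the length of the list.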
One way to achieve the expected output is to use itertools.groupby:
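from itertools import groupby
f = lambda x: x["paper_title"]
# groupby only groups adjacent items, so sort by title first;
# sorted() is stable, so next(g) is the first dict seen for each title
[next(g) for k, g in groupby(sorted(test_list, key=f), key=f)]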
Output:
[{'Paper_year': 2, 'paper_title': 'This is duplicate'},
{'Paper_year': 3, 'paper_title': 'Unique One'},
{'Paper_year': 3, 'paper_title': 'Unique two'}]
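On the question of a faster approach: the groupby version has to sort first, which costs O(n log n) and reorders the list by title. A possible alternative, sketched here under the assumption that every paper_title value is hashable (strings are), is a single pass that remembers the titles already seen in a set. The function name `dedupe_by_key` is hypothetical, not something from the original answer:

```python
def dedupe_by_key(dicts, key):
    """Return the dicts whose value under `key` has not been seen before."""
    seen = set()      # values of `key` encountered so far
    result = []
    for d in dicts:
        value = d[key]
        if value not in seen:   # average O(1) membership test
            seen.add(value)
            result.append(d)    # first occurrence wins
    return result


test_list = [{"paper_title": 'This is duplicate', 'Paper_year': 2},
             {"paper_title": 'This is duplicate', 'Paper_year': 3},
             {"paper_title": 'Unique One', 'Paper_year': 3},
             {"paper_title": 'Unique two', 'Paper_year': 3}]

return_value = dedupe_by_key(test_list, "paper_title")
```

This runs in roughly linear time and preserves the original order of the list, keeping the first dict found for each title.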