Remove duplicates from a JSON file by matching against multiple keys
Question
Original post: Remove duplicates from json data
This is only my second post. I didn't have enough points to comment with my question on the original post, so here I am.
Andy Hayden makes a great point in a comment there: "Also, those aren't really duplicates..."
My question is about exactly that situation: how can you remove duplicates from a JSON file by matching against more than one key?
Here is the original example (it was pointed out that it is not valid JSON):
{
{obj_id: 123,
location: {
x: 123,
y: 323,
},
{obj_id: 13,
location: {
x: 23,
y: 333,
},
{obj_id: 123,
location: {
x: 122,
y: 133,
},
}
My case is very similar to this example, except that in my case all of these entries would be kept, because the x and y values for each obj_id are unique. If x and y were the same, however, one of the entries would be removed from the JSON file.
All the examples I have found remove entries based on a match against only one key.
I don't know if it matters, but the keys I need to match against are "Company Name", "First Name", and "Last Name". The file is a JSON of companies and contacts with more than 100k lines, and sometimes the same person is a contact for multiple companies, which is why I need to match against multiple keys.
Thanks.
Recommended answer
I hope this does what you are looking for. (It only checks whether First Name and Last Name are different.)
raw_data = [
    {
        "Company": 123,
        "Person": {
            "First Name": 123,
            "Last Name": 323
        }
    },
    {
        "Company": 13,
        "Person": {
            "First Name": 123,
            "Last Name": 323
        }
    },
    {
        "Company": 123,
        "Person": {
            "First Name": 122,
            "Last Name": 133
        }
    }
]

unique = []
for company in raw_data:
    # Keep the entry only if no entry kept so far has the same "Person" dict
    # (that is, the same First Name and Last Name).
    if all(unique_comp["Person"] != company["Person"] for unique_comp in unique):
        unique.append(company)

print(unique)
#>>> [{'Company': 123, 'Person': {'First Name': 123, 'Last Name': 323}}, {'Company': 123, 'Person': {'First Name': 122, 'Last Name': 133}}]
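The loop above compares each entry against every entry already kept, which can get slow for a file with 100k or more records. As a rough sketch of an alternative, the three keys named in the question can be combined into a tuple and tracked in a set. This assumes each record is a flat dict with string values under "Company Name", "First Name", and "Last Name"; the `dedupe` helper and the sample records are made up for illustration and are not from the original data.

```python
import json

# Key names come from the question; the flat record layout is an assumption.
MATCH_KEYS = ("Company Name", "First Name", "Last Name")


def dedupe(records, keys=MATCH_KEYS):
    """Keep the first record seen for each combination of values under `keys`."""
    seen = set()
    unique = []
    for record in records:
        # Build a hashable signature from the values of all match keys.
        # A missing key yields None, so such records still compare consistently.
        signature = tuple(record.get(key) for key in keys)
        if signature not in seen:
            seen.add(signature)
            unique.append(record)
    return unique


# Hypothetical sample data: the same person is a contact at two companies,
# and the first record appears twice.
sample = [
    {"Company Name": "Acme", "First Name": "Ann", "Last Name": "Lee"},
    {"Company Name": "Globex", "First Name": "Ann", "Last Name": "Lee"},
    {"Company Name": "Acme", "First Name": "Ann", "Last Name": "Lee"},
]

result = dedupe(sample)
print(json.dumps(result, indent=2))
```

Here only the exact repeat of the Acme record is dropped, while the same person listed under Globex is kept because the company differs. For a real file, the list would typically be read with `json.load` and the result written back with `json.dump`. If the names are nested under a sub-dict, as in the answer above, the signature would need to be built from the nested values instead.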