Remove duplicates from a JSON file by matching against multiple keys


Question

Original post: Remove duplicates from JSON data

This is only my second post. I didn't have enough points to comment with my question on the original post, so here I am.

Andy Hayden makes a great point in a comment there: "Also, those aren't really duplicates..."

My question is about exactly that situation: how can you remove duplicates from a JSON file by matching against more than one key?

Here is the original example (it was pointed out that it is not valid JSON):

{
  {obj_id: 123,
    location: {
      x: 123,
      y: 323,
  },
  {obj_id: 13,
    location: {
      x: 23,
      y: 333,
  },
 {obj_id: 123,
    location: {
      x: 122,
      y: 133,
  },
}

My case is very similar to this example, except that in my case all of these records would be kept, because the x and y values differ between the records that share an obj_id. If x and y were also the same, then one of the records would be removed from the JSON file.

All the examples I have found only drop records based on a single matching key.

I don't know if it matters, but the keys I need to match against are "Company Name", "First Name", and "Last Name". The file is a JSON of companies and contacts with more than 100k lines, and sometimes the same person is a contact for multiple companies, which is why I need to match against multiple keys.

Thanks.

Recommended answer

I hope this does what you are looking for. Note that it only checks whether First Name and Last Name differ: it compares the whole "Person" object, so "Company" is not part of the comparison.

raw_data = [
        {
            "Company":123,
            "Person":{
                "First Name":123,
                "Last Name":323
            }
        },
        {
            "Company":13,
            "Person":{
                "First Name":123,
                "Last Name":323
            }
        },
        {
            "Company":123,
            "Person":{
                "First Name":122,
                "Last Name":133
            }
        }
    ]

unique = []
for company in raw_data:
    # Keep this record only if no record kept so far has the same "Person"
    # (that is, the same First Name and Last Name).
    if all(unique_comp["Person"] != company["Person"] for unique_comp in unique):
        unique.append(company)

print(unique)

#>>> [{'Company': 123, 'Person': {'First Name': 123, 'Last Name': 323}}, {'Company': 123, 'Person': {'First Name': 122, 'Last Name': 133}}]
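
The question asks for a match on all three keys ("Company Name", "First Name", and "Last Name"), while the snippet above compares only the person. A minimal sketch of the three-key variant follows. It assumes each record is a flat dict holding those keys directly and that the values are hashable (strings or numbers); the helper name `dedupe_by_keys` and the sample contacts are made up for illustration. It also swaps the pairwise comparison for a set of already-seen key tuples, so each record is checked once, which may matter for a file with 100k+ lines.

```python
import json

def dedupe_by_keys(records, keys):
    """Return records with duplicates removed, keeping the first occurrence.

    Two records count as duplicates only if they agree on every key in `keys`.
    """
    seen = set()
    unique = []
    for record in records:
        # Build a hashable signature from the selected fields.
        # .get() yields None for a missing key instead of raising KeyError.
        signature = tuple(record.get(key) for key in keys)
        if signature not in seen:
            seen.add(signature)
            unique.append(record)
    return unique

# Hypothetical sample data using the key names from the question.
contacts = [
    {"Company Name": "Acme", "First Name": "Ann", "Last Name": "Lee"},
    {"Company Name": "Globex", "First Name": "Ann", "Last Name": "Lee"},
    {"Company Name": "Acme", "First Name": "Ann", "Last Name": "Lee"},
]

deduped = dedupe_by_keys(contacts, ["Company Name", "First Name", "Last Name"])
print(json.dumps(deduped, indent=2))
```

Here the same person at two different companies is kept twice, and only the exact repeat of the Acme record is dropped. For a real file, the list could be read with `json.load` and written back with `json.dump`, assuming the file's top level is a JSON array of such records.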
