Pandas dataframe to a dynamic nested JSON

Question

I have a dataframe that looks like this:

   employeeId  firstName  lastName  emailAddress      isDependent  employeeIdTypeCode  entityCode  sourceCode  roleCode
0  E123456     Andrew     Hoover    hoovera@xyz.com   False        001                 AE          AHR         EMPLR
0  102939485   Andrew     Hoover    hoovera@xyz.com   False        002                 AE          AHR         EMPLR
2  E123458     Celeste    Riddick   riddickc@xyz.com  True         001                 AE          AHR         EMPLR
2  354852739   Celeste    Riddick   riddickc@xyz.com  True         002                 AE          AHR         EMPLR
1  E123457     Curt       Austin    austinc1@xyz.com  True         001                 AE          AHR         EMPLR
1  675849302   Curt       Austin    austinc1@xyz.com  True         002                 AE          AHR         EMPLR
3  E123459     Hazel      Tooley    tooleyh@xyz.com   False        001                 AE          AHR         EMPLR
3  937463528   Hazel      Tooley    tooleyh@xyz.com   False        002                 AE          AHR         EMPLR

For each row, I want to convert it into a nested JSON format. I want the JSON for each individual to look something like this, since I plan to iterate over the dataframe and post each payload to an API.

{
   "individualInfo":
      {
         "individualIdentifier":[
            {
               "identityTypeCode":"001",
               "identifierValue":"E123456",
               "profileInfo":{
                  "firstName":"Andrew",
                  "lastName":"Hoover",
                  "emailAddress":"hoovera@xyz.com"
               }
            },
            {
               "identityTypeCode":"002",
               "identifierValue":"102939485",
               "profileInfo":{
                  "firstName":"Andrew",
                  "lastName":"Hoover",
                  "emailAddress":"hoovera@xyz.com"
               }
            }
         ],
         "entityCode":"AE",
         "sourceCode":"AHR",
         "roleCode":"EMPLR",
         "isDependent":false
      }
}

The important thing here is that I want the JSON to be generated agnostic of the ID columns that arrive in the dataframe. For example, if another ID shows up in the dataframe, I want that ID to get its own dictionary object with the same profile info. Each profile can therefore carry any number of IDs.

The code I have so far:

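# `result` is presumably the dataframe above with employeeId and
# employeeIdTypeCode renamed to identifierValue and identityTypeCode.
j = (
    result.groupby(['identifierValue', 'identityTypeCode'], as_index=False)
    # 'records' replaces the short alias 'r', which pandas 2.0 no longer accepts
    .apply(lambda x: x[['firstName', 'lastName', 'emailAddress']].to_dict('records'))
    .reset_index()
    .rename(columns={0: 'ProfileInfo'})
    .to_json(orient='records')
)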

Is it possible to achieve something like this dynamically in pandas? Thank you so much for the help!

A few other questions I could find about nesting:

Convert Pandas Dataframe to nested JSON

pandas groupby to nested json

None of these questions help me, because I want each index of my dataframe to be converted into an individual JSON payload: each individual is sent to an API service that posts the data to the database.

Recommended answer

It sounds like the most sensible way to pull this off is:

info_dict = df.set_index(['identifierValue', 'identityTypeCode']).to_dict('index')

Then every time you get to profileInfo in your JSON, you can look up the matching ('identifierValue', 'identityTypeCode') key pair in info_dict.
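As a minimal sketch of that lookup, assuming the ID columns have already been renamed to identifierValue and identityTypeCode (the two sample rows are taken from the question, and the variable profile_info is only illustrative):

```python
import pandas as pd

# Two rows from the question, with the ID columns renamed to the JSON key names.
df = pd.DataFrame({
    "identifierValue": ["E123456", "102939485"],
    "identityTypeCode": ["001", "002"],
    "firstName": ["Andrew", "Andrew"],
    "lastName": ["Hoover", "Hoover"],
    "emailAddress": ["hoovera@xyz.com", "hoovera@xyz.com"],
})

# With a two-level index, to_dict("index") keys each row by an
# (identifierValue, identityTypeCode) tuple and maps it to a dict of
# the remaining columns. The index must be unique for this to work.
info_dict = df.set_index(["identifierValue", "identityTypeCode"]).to_dict("index")

# Look up the profile fields for one identifier.
profile_info = info_dict[("E123456", "001")]
print(profile_info)
```

Because the key is a tuple of both ID fields, any number of identifiers per person can coexist in the same dictionary.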

I'm not sure exactly what format you want, but this is a start.
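One way to go from that starting point to the full payload in the question is to group rows by the dataframe index and build one nested dict per group. This is only a sketch and not part of the answer above: the helper name build_payloads is hypothetical, and it assumes that rows sharing an index value describe the same individual and that the non-ID fields are identical within each group.

```python
import json
import pandas as pd

def build_payloads(df):
    """Return one nested dict per individual (hypothetical helper)."""
    payloads = []
    # Assumption: rows that share an index value belong to one individual.
    for _, group in df.groupby(level=0):
        # One identifier entry per row, so any number of IDs is supported.
        identifiers = [
            {
                "identityTypeCode": row["employeeIdTypeCode"],
                "identifierValue": row["employeeId"],
                "profileInfo": {
                    "firstName": row["firstName"],
                    "lastName": row["lastName"],
                    "emailAddress": row["emailAddress"],
                },
            }
            for row in group.to_dict("records")
        ]
        first = group.iloc[0]  # shared fields are read from the first row
        payloads.append({
            "individualInfo": {
                "individualIdentifier": identifiers,
                "entityCode": first["entityCode"],
                "sourceCode": first["sourceCode"],
                "roleCode": first["roleCode"],
                # Cast to a plain bool so json.dumps can serialise it.
                "isDependent": bool(first["isDependent"]),
            }
        })
    return payloads

# Sample rows copied from the question; the index marks the individual.
df = pd.DataFrame(
    {
        "employeeId": ["E123456", "102939485", "E123457", "675849302"],
        "firstName": ["Andrew", "Andrew", "Curt", "Curt"],
        "lastName": ["Hoover", "Hoover", "Austin", "Austin"],
        "emailAddress": ["hoovera@xyz.com"] * 2 + ["austinc1@xyz.com"] * 2,
        "isDependent": [False, False, True, True],
        "employeeIdTypeCode": ["001", "002", "001", "002"],
        "entityCode": ["AE"] * 4,
        "sourceCode": ["AHR"] * 4,
        "roleCode": ["EMPLR"] * 4,
    },
    index=[0, 0, 1, 1],
)

payloads = build_payloads(df)
print(json.dumps(payloads[0], indent=2))
```

Each element of payloads can then be serialised with json.dumps and posted on its own. If a third identifier row appears for an individual, it simply becomes a third entry in individualIdentifier.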
