Pandas dataframe to a dynamic nested JSON
Problem description
I have a dataframe that looks like this:
   employeeId  firstName  lastName  emailAddress      isDependent  employeeIdTypeCode  entityCode  sourceCode  roleCode
0  E123456     Andrew     Hoover    hoovera@xyz.com   False        001                 AE          AHR         EMPLR
0  102939485   Andrew     Hoover    hoovera@xyz.com   False        002                 AE          AHR         EMPLR
2  E123458     Celeste    Riddick   riddickc@xyz.com  True         001                 AE          AHR         EMPLR
2  354852739   Celeste    Riddick   riddickc@xyz.com  True         002                 AE          AHR         EMPLR
1  E123457     Curt       Austin    austinc1@xyz.com  True         001                 AE          AHR         EMPLR
1  675849302   Curt       Austin    austinc1@xyz.com  True         002                 AE          AHR         EMPLR
3  E123459     Hazel      Tooley    tooleyh@xyz.com   False        001                 AE          AHR         EMPLR
3  937463528   Hazel      Tooley    tooleyh@xyz.com   False        002                 AE          AHR         EMPLR
I want to convert the rows for each individual into a nested JSON document, because I want to iterate over the dataframe and post each individual to an API. The JSON for one individual should look something like this:
{
  "individualInfo": {
    "individualIdentifier": [
      {
        "identityTypeCode": "001",
        "identifierValue": "E123456",
        "profileInfo": {
          "firstName": "Andrew",
          "lastName": "Hoover",
          "emailAddress": "hoovera@xyz.com"
        }
      },
      {
        "identityTypeCode": "002",
        "identifierValue": "102939485",
        "profileInfo": {
          "firstName": "Andrew",
          "lastName": "Hoover",
          "emailAddress": "hoovera@xyz.com"
        }
      }
    ],
    "entityCode": "AE",
    "sourceCode": "AHR",
    "roleCode": "EMPLR",
    "isDependent": false
  }
}
The important thing here is that the JSON should be generated independently of how many Id rows the dataframe contains. For example, if another ID shows up in the dataframe for the same person, I want that ID to get its own dictionary object with the same profile info. Each profile can therefore have any number of Ids attached to it.
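The code I have so far:
j = (
    result.groupby(['identifierValue', 'identityTypeCode'], as_index=False)
    # 'records' replaces the short form 'r', which newer pandas versions no longer accept
    .apply(lambda x: x[['firstName', 'lastName', 'emailAddress']].to_dict('records'))
    .reset_index()
    .rename(columns={0: 'ProfileInfo'})
    .to_json(orient='records')
)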
Is it possible to achieve something like this dynamically in pandas? Thank you so much for the help!
A few other questions about nesting that I could find:
None of these questions help, because I want each index of my dataframe to be converted into its own JSON payload: each individual is sent separately to an API service that posts the data to the database.
Recommended answer
It sounds like the most sensible way to pull this off is:
info_dict = df.set_index(['identifierValue', 'identityTypeCode']).to_dict('index')
Then, every time you reach profileInfo while building your JSON, you can look it up in info_dict above with the appropriate ('identifierValue', 'identityTypeCode') key pair.
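As a minimal sketch of that lookup: the two-row frame below is made up from the question's sample data, and it assumes the columns are already named `identifierValue` and `identityTypeCode` (as in the asker's own code) and that the second index key was meant to be `identityTypeCode`. `to_dict('index')` needs each key pair to be unique.

```python
import pandas as pd

# Hypothetical sample, copied from the first individual in the question.
df = pd.DataFrame({
    "identifierValue": ["E123456", "102939485"],
    "identityTypeCode": ["001", "002"],
    "firstName": ["Andrew", "Andrew"],
    "lastName": ["Hoover", "Hoover"],
    "emailAddress": ["hoovera@xyz.com", "hoovera@xyz.com"],
})

profile_cols = ["firstName", "lastName", "emailAddress"]

# Keys become (identifierValue, identityTypeCode) tuples;
# values are dicts holding only the profile columns.
info_dict = (
    df.set_index(["identifierValue", "identityTypeCode"])[profile_cols]
    .to_dict("index")
)

# Look up the profile for one identifier.
profile = info_dict[("E123456", "001")]
```

Restricting the frame to `profile_cols` before calling `to_dict` keeps unrelated columns out of each `profileInfo` entry.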
I'm not sure exactly what formatting you want, but this is a start.
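One possible way to go further is to group the rows by the dataframe index and build one payload per individual, so that any number of Id rows ends up in `individualIdentifier`. This is only a sketch under assumptions: the column names are taken from the question's table, the index is assumed to identify the individual, and `build_payload`, `PROFILE_COLS` and `SHARED_COLS` are hypothetical names introduced here.

```python
import json
import pandas as pd

# Hypothetical sample using the question's column names; the index marks the individual.
df = pd.DataFrame(
    {
        "employeeId": ["E123456", "102939485", "E123457", "675849302"],
        "firstName": ["Andrew", "Andrew", "Curt", "Curt"],
        "lastName": ["Hoover", "Hoover", "Austin", "Austin"],
        "emailAddress": ["hoovera@xyz.com"] * 2 + ["austinc1@xyz.com"] * 2,
        "isDependent": [False, False, True, True],
        "employeeIdTypeCode": ["001", "002", "001", "002"],
        "entityCode": ["AE"] * 4,
        "sourceCode": ["AHR"] * 4,
        "roleCode": ["EMPLR"] * 4,
    },
    index=[0, 0, 1, 1],
)

PROFILE_COLS = ["firstName", "lastName", "emailAddress"]
SHARED_COLS = ["entityCode", "sourceCode", "roleCode"]


def build_payload(group):
    """Turn all rows of one individual into a single nested dict."""
    rows = group.to_dict("records")
    identifiers = [
        {
            "identityTypeCode": row["employeeIdTypeCode"],
            "identifierValue": row["employeeId"],
            "profileInfo": {col: row[col] for col in PROFILE_COLS},
        }
        for row in rows
    ]
    info = {"individualIdentifier": identifiers}
    # Shared fields are read from the first row of the group.
    info.update({col: rows[0][col] for col in SHARED_COLS})
    # bool() guards against NumPy booleans, which json cannot serialize.
    info["isDependent"] = bool(rows[0]["isDependent"])
    return {"individualInfo": info}


# One payload per distinct index value, in sorted index order.
payloads = [build_payload(group) for _, group in df.groupby(level=0)]

# JSON text for the first individual, ready to be used as a request body.
body = json.dumps(payloads[0], indent=2)
```

Each element of `payloads` can then be serialized with `json.dumps` and sent to the API one at a time. Because the identifier list is built from however many rows the group contains, a third Id row for the same person would simply add a third entry.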