Python Flatten Multiply Nested Dictionary JSON with Pandas


Question

I am working with a JSON response that is formatted as a deeply nested dictionary, like the one below:

{u'addresses': [],
 u'application_ids': [20855193],
 u'applications': [{u'answers': [{u'answer': u'Indeed ',
                                  u'question': u'How did you hear?'}],
                    u'applied_at': u'2015-10-29T22:19:04.925Z',
                    u'candidate_id': 9999999,
                    u'credited_to': None,
                    u'current_stage': {u'id': 9999999,
                                       u'name': u'Application Review'},
                    u'id': 9999999,
                    u'jobs': [{u'id': 9999999,u'name': u'ENGINEER'}],
                    u'last_activity_at': u'2015-10-29T22:19:04.767Z',
                    u'prospect': False,
                    u'rejected_at': None,
                    u'rejection_details': None,
                    u'rejection_reason': None,
                    u'source': {u'id': 7, u'public_name': u'Indeed'},
                    u'status': u'active'}],
 u'attachments': [{u'filename': u'Jason_Bourne.pdf',
                   u'type': u'resume',
                   u'url': u'https://resumeURL'}],
 u'company': None,
 u'coordinator': {u'employee_id': None,
                  u'id': 9999999,
                  u'name': u'Batman_Robin'},
 u'email_addresses': [{u'type': u'personal',
                       u'value': u'jasonbourne@gmail.com'}],
 u'first_name': u'Jason',
 u'id': 9999999,
 u'last_activity': u'2015-10-29T22:19:04.767Z',
 u'last_name': u'Bourne',
 u'website_addresses': []}

I am trying to flatten the JSON into a table and found the following example in the pandas documentation:

http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.io.json.json_normalize.html

From what I understand, the "record_path" parameter specifies the path to the lowest-level records you are interested in, and it can only be a string or a list of strings. But to reach the 'answers' records in my data above, I have to mix string keys and integer indexes, as follows:

answer = data['applications'][0]['answers'][0]['answer']
question = data['applications'][0]['answers'][0]['question']

How can I pass the record paths above as a parameter to the json_normalize function?

Thanks!

Recommended answer

I think you can pass a nested list as record_path. The integer indexes are not needed, because json_normalize iterates over every element of each list it meets along the path:

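import pandas as pd

# `data` is the parsed JSON response shown in the question.
# pd.json_normalize is available from pandas 1.0; on older versions use
# `from pandas.io.json import json_normalize` instead.
df = pd.json_normalize(data, ['applications', ['answers']])
print(df)
    answer           question
0  Indeed   How did you hear?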
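
To show how the record path interacts with the rest of the response, here is a minimal sketch. It assumes pandas 1.0 or later, and it uses a trimmed-down, hand-written subset of the response from the question (the `data` literal below is not the full payload). The `meta` argument, which is not part of the original answer, copies parent-level fields onto each answer row; the second call shows that nested dictionaries are flattened into dot-separated column names, while nested lists are left in place.

```python
import pandas as pd

# Hand-written subset of the response from the question (assumption:
# the real payload has the same shape, just with more keys).
data = {
    "id": 9999999,
    "first_name": "Jason",
    "last_name": "Bourne",
    "applications": [
        {
            "id": 9999999,
            "status": "active",
            "current_stage": {"id": 9999999, "name": "Application Review"},
            "answers": [
                {"answer": "Indeed ", "question": "How did you hear?"}
            ],
        }
    ],
}

# One row per answer. record_path walks data["applications"] and then
# each application's "answers" list; no integer indexes are required.
# meta pulls fields from the enclosing levels onto every row.
answers = pd.json_normalize(
    data,
    record_path=["applications", "answers"],
    meta=["first_name", "last_name", ["applications", "status"]],
)

# One row per application. Nested dicts such as "current_stage" become
# dot-separated columns; the "answers" list stays as a list in one cell.
apps = pd.json_normalize(data["applications"])

print(answers)
print(apps.columns.tolist())
```

The nested meta path `["applications", "status"]` ends up in a column named `applications.status`, which keeps it distinct from any field of the same name inside the answer records.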
