Python Flatten Multiply Nested Dictionary JSON with Pandas
Question
I am working with a JSON response that is formatted as a deeply nested dictionary, shown below:
{u'addresses': [],
u'application_ids': [20855193],
u'applications': [{u'answers': [{u'answer': u'Indeed ',
u'question': u'How did you hear?'}],
u'applied_at': u'2015-10-29T22:19:04.925Z',
u'candidate_id': 9999999,
u'credited_to': None,
u'current_stage': {u'id': 9999999,
u'name': u'Application Review'},
u'id': 9999999,
u'jobs': [{u'id': 9999999,u'name': u'ENGINEER'}],
u'last_activity_at': u'2015-10-29T22:19:04.767Z',
u'prospect': False,
u'rejected_at': None,
u'rejection_details': None,
u'rejection_reason': None,
u'source': {u'id': 7, u'public_name': u'Indeed'},
u'status': u'active'}],
u'attachments': [{u'filename': u'Jason_Bourne.pdf',
u'type': u'resume',
u'url': u'https://resumeURL'}],
u'company': None,
u'coordinator': {u'employee_id': None,
u'id': 9999999,
u'name': u'Batman_Robin'},
u'email_addresses': [{u'type': u'personal',
u'value': u'jasonbourne@gmail.com'}],
u'first_name': u'Jason',
u'id': 9999999,
u'last_activity': u'2015-10-29T22:19:04.767Z',
u'last_name': u'Bourne',
u'website_addresses': []}
I am trying to flatten the JSON into a table and have found the following example in the pandas documentation:
http://pandas.pydata.org/pandas-docs/version/0.17.0/generation/pandas.io.json.json_normalize.html
From what I understand, the "record_path" parameter specifies the path to the lowest-level records you are interested in. It can only be a string or a list of strings. But to reach the 'answers' records in my data above, I have to specify both strings and indexes, as follows:
# 'applications' and 'answers' are both lists, so each needs an index
answer = data['applications'][0]['answers'][0]['answer']
question = data['applications'][0]['answers'][0]['question']
How can I pass the record paths above as a parameter to the json_normalize function?
Thanks!
Recommended answer
I think you can pass a nested list as record_path:
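# On pandas >= 1.0, use `import pandas as pd` and `pd.json_normalize` instead
from pandas.io.json import json_normalize
# `data` is the dictionary from the question; no list indexes are needed,
# json_normalize iterates over the 'applications' and 'answers' lists itself
df = json_normalize(data, ['applications', ['answers']])
print(df)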
    answer           question
0  Indeed   How did you hear?
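If you also want fields from the outer levels next to each answer, the `meta` argument of json_normalize may help. The following is only a minimal sketch: the `data` dictionary is a trimmed-down, hypothetical copy of the response in the question, and the choice of meta fields (`first_name`, `last_name`, and the application's `status`) is an assumption made for illustration.

```python
import pandas as pd

# pandas >= 1.0 exposes json_normalize at the top level; older releases
# only provide it under pandas.io.json.
try:
    json_normalize = pd.json_normalize
except AttributeError:
    from pandas.io.json import json_normalize

# Trimmed-down, hypothetical copy of the response shown in the question.
data = {
    "id": 9999999,
    "first_name": "Jason",
    "last_name": "Bourne",
    "applications": [
        {
            "id": 9999999,
            "status": "active",
            "answers": [
                {"question": "How did you hear?", "answer": "Indeed "},
            ],
        }
    ],
}

# record_path walks data['applications'] and then each application's
# 'answers' list, producing one row per answer.
# meta copies values from the enclosing levels onto every row; a nested
# list such as ['applications', 'status'] addresses a field one level down.
df = json_normalize(
    data,
    record_path=["applications", "answers"],
    meta=["first_name", "last_name", ["applications", "status"]],
)
print(df)
```

Each row then carries the answer and question plus the selected outer fields; the nested meta field shows up under a dotted column name (`applications.status`). Avoid listing a meta key such as `id` if the records themselves contain a key of the same name, since the names would collide.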