Extracting a DataFrame out of a JSON file with the same key values
Problem description
I have a JSON file (given below). I need to extract a DataFrame whose column names are the "jive_label" values inside profile[] and whose cell values are the corresponding "value" entries; the expected output is given below. I have also included sample code in which I try to print just one column, but that code clearly doesn't work.
import requests
import pandas as pd

# url, payload, headers and querystring are defined elsewhere (not shown)
r = requests.get(url, data=payload, headers=headers, params=querystring, verify=False)
json_data = r.json()

df = pd.DataFrame([])
# 'profile' is nested under 'jive', so this top-level lookup yields an empty list
for i in json_data.get('profile', []):
    if i.get('jive_label') == 'Title':
        dfDict = {'Title': i.get('value')}
        # DataFrame.append was removed in pandas 2.0; this line needs an older pandas
        df = df.append(pd.DataFrame(dfDict, index=[0]), ignore_index=True)
print(df.head())
{
"jive": {
"enabled": true,
"external": false,
"federated": true,
"lastProfileUpdate": "2017-08-14T17:07:35.491+0000",
"level": {
"description": "Level 1",
"imageURI": "https:",
"name": "Newbie",
"points": 0
},
"locale": "en_US",
"externalContributor": false,
"profile": [
{
"jive_label": "Title",
"value": "Analyst",
"jive_displayOrder": 0,
"jive_showSummaryLabel": false
},
{
"jive_label": "COMPANY ID",
"value": "333333",
"jive_displayOrder": 5,
"jive_showSummaryLabel": false
},
{
"jive_label": "Department",
"value": "46152",
"jive_displayOrder": 6,
"jive_showSummaryLabel": false
},
{
"jive_label": "BUFUGU",
"value": "C06",
"jive_displayOrder": 9,
"jive_showSummaryLabel": false
},
{
"jive_label": "XYZ Company Code",
"value": "DA01",
"jive_displayOrder": 10,
"jive_showSummaryLabel": false
},
{
"jive_label": "Business Purpose",
"value": "C0820",
"jive_displayOrder": 11,
"jive_showSummaryLabel": false
},
{
"jive_label": "Company",
"value": "XYZ",
"jive_displayOrder": 19,
"jive_showSummaryLabel": false
},
{
"jive_label": "Street Address",
"value": "30 NY NY",
"jive_displayOrder": 20,
"jive_showSummaryLabel": false
},
{
"jive_label": "City",
"value": "NYC",
"jive_displayOrder": 21,
"jive_showSummaryLabel": false
},
{
"jive_label": "Province/State",
"value": "NY",
"jive_displayOrder": 22,
"jive_showSummaryLabel": false
},
{
"jive_label": "Postal/Zip Code",
"value": "00000",
"jive_displayOrder": 23,
"jive_showSummaryLabel": false
},
{
"jive_label": "Country",
"value": "United States",
"jive_displayOrder": 24,
"jive_showSummaryLabel": false
},
{
"jive_label": "Preferred Language",
"value": "E",
"jive_displayOrder": 30,
"jive_showSummaryLabel": false
},
{
"jive_label": "Display Name",
"value": "P, M",
"jive_displayOrder": 31,
"jive_showSummaryLabel": false
},
{
"jive_label": "ReportsTo",
"value": "529847279",
"jive_displayOrder": 37,
"jive_showSummaryLabel": false
},
{
"jive_label": "Transit Number",
"value": "46152",
"jive_displayOrder": 38,
"jive_showSummaryLabel": false
},
{
"jive_label": "Transit Description",
"value": "A B C D",
"jive_displayOrder": 39,
"jive_showSummaryLabel": false
}
]
}
}
Expected output:
Recommended answer
Using pandas' DataFrame.from_dict is one way to solve this:
import pandas as pd

# myjson is the parsed JSON, e.g. myjson = r.json()
# normalize your dict and drop unwanted fields
data = pd.DataFrame.from_dict(myjson['jive']['profile']).drop(['jive_displayOrder', 'jive_showSummaryLabel'], axis=1)
# reshape the data: one column per jive_label
data = pd.pivot_table(data, index=None, columns='jive_label', values='value', aggfunc='max')
# remove unwanted names
data.columns.name = None
# reset_index returns a new frame, so assign the result back
data = data.reset_index(drop=True)
print(data)
BUFUGU Business Purpose COMPANY ID City Company Country Department Display Name Postal/Zip Code Preferred Language Province/State ReportsTo Street Address Title Transit Description Transit Number XYZ Company Code
0 C06 C0820 333333 NYC XYZ United States 46152 P, M 00000 E NY 529847279 30 NY NY Analyst A B C D 46152 DA01
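Since each entry in profile is just a label/value pair, the same single-row frame can probably also be built without pivot_table. The following is a minimal sketch and not part of the original answer: the helper name profile_to_frame and the trimmed sample dict are hypothetical, and it assumes labels are unique within one profile (a repeated label would keep only the last value).

```python
import pandas as pd


def profile_to_frame(payload):
    """Build a one-row DataFrame with one column per jive_label.

    Hypothetical helper for illustration; assumes the payload has the
    {"jive": {"profile": [...]}} shape shown in the question.
    """
    entries = payload.get("jive", {}).get("profile", [])
    # Map each label to its value; a repeated label keeps the last value seen.
    row = {e["jive_label"]: e.get("value") for e in entries if "jive_label" in e}
    # Wrapping the dict in a list yields exactly one row, indexed 0.
    return pd.DataFrame([row])


# Trimmed-down sample with the same structure as the JSON in the question.
sample = {
    "jive": {
        "profile": [
            {"jive_label": "Title", "value": "Analyst", "jive_displayOrder": 0},
            {"jive_label": "COMPANY ID", "value": "333333", "jive_displayOrder": 5},
            {"jive_label": "City", "value": "NYC", "jive_displayOrder": 21},
        ]
    }
}

df = profile_to_frame(sample)
print(df)
```

Unlike the pivot_table version, this keeps the columns in the order the labels appear in the JSON rather than sorting them alphabetically.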