Nested JSON to pandas DataFrame with specific format


Problem description

I need to load the contents of a JSON file into a pandas DataFrame in a specific layout, so that I can run pandasql to transform the data and then run it through a scoring model.

file = C:\scoring_model\json.js (contents of 'file' are below)

{
"response":{
  "version":"1.1",
  "token":"dsfgf",
   "body":{
     "customer":{
         "customer_id":"1234567",
         "verified":"true"
       },
     "contact":{
         "email":"mr@abc.com",
         "mobile_number":"0123456789"
      },
     "personal":{
         "gender": "m",
         "title":"Dr.",
         "last_name":"Muster",
         "first_name":"Max",
         "family_status":"single",
         "dob":"1985-12-23"
     }
   }
 }
}

I need the DataFrame to look like this (all values belong on a single row; the table is split in two here only so that it fits in the question):

version | token | customer_id | verified | email      | mobile_number | gender |
1.1     | dsfgf | 1234567     | true     | mr@abc.com | 0123456789    | m      |

title | last_name | first_name |family_status | dob
Dr.   | Muster    | Max        | single       | 23.12.1985

I have looked at all the other questions on this topic and have tried various ways to load the JSON file into pandas:

with open(r'C:\scoring_model\json.js', 'r') as f:
    c = pd.read_json(f.read())

with open(r'C:\scoring_model\json.js', 'r') as f:
    c = f.readlines()

I also tried pd.Panel() as suggested in this solution: "Python Pandas: How to split a sorted dictionary in a column of a dataframe".

Working from the result of f.readlines(), I thought about splitting the contents of each cell on the double quotes ("") and finding a way to put the pieces into separate columns, but no luck so far. Your expertise is greatly appreciated. Thank you in advance.

Solution

If you load the entire JSON document as a dict (or list), e.g. using json.load, you can flatten it with json_normalize (exposed as pd.json_normalize in pandas 1.0 and later):

In [11]: d = {"response": {"body": {"contact": {"email": "mr@abc.com", "mobile_number": "0123456789"}, "personal": {"last_name": "Muster", "gender": "m", "first_name": "Max", "dob": "1985-12-23", "family_status": "single", "title": "Dr."}, "customer": {"verified": "true", "customer_id": "1234567"}}, "token": "dsfgf", "version": "1.1"}}

In [12]: df = pd.io.json.json_normalize(d)

In [13]: df.columns = df.columns.map(lambda x: x.split(".")[-1])

In [14]: df
Out[14]:
        email mobile_number customer_id verified         dob family_status first_name gender last_name title  token version
0  mr@abc.com    0123456789     1234567     true  1985-12-23        single        Max      m    Muster   Dr.  dsfgf     1.1
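
To get all the way to the layout asked for in the question, the same approach can be extended with a column reorder and a date reformat. The following is a minimal sketch, not part of the original answer. It assumes pandas 1.0 or later (for pd.json_normalize) and inlines the file contents as a string so that it runs without the file on disk. The names raw and wanted are made up for illustration.

```python
import json

import pandas as pd

# Same content as the question's file, inlined so the sketch is self-contained.
raw = """
{
  "response": {
    "version": "1.1",
    "token": "dsfgf",
    "body": {
      "customer": {"customer_id": "1234567", "verified": "true"},
      "contact": {"email": "mr@abc.com", "mobile_number": "0123456789"},
      "personal": {
        "gender": "m",
        "title": "Dr.",
        "last_name": "Muster",
        "first_name": "Max",
        "family_status": "single",
        "dob": "1985-12-23"
      }
    }
  }
}
"""

# For a real file, use json.load(f) inside a "with open(...) as f:" block instead.
d = json.loads(raw)

# Flatten the nested dicts into one row with dotted column names,
# e.g. "response.body.contact.email".
df = pd.json_normalize(d)

# Keep only the last path component as the column name.
df.columns = [col.split(".")[-1] for col in df.columns]

# Reformat dob from YYYY-MM-DD to DD.MM.YYYY, as in the requested table.
df["dob"] = pd.to_datetime(df["dob"], format="%Y-%m-%d").dt.strftime("%d.%m.%Y")

# Put the columns in the order the question asks for.
wanted = [
    "version", "token", "customer_id", "verified", "email", "mobile_number",
    "gender", "title", "last_name", "first_name", "family_status", "dob",
]
df = df[wanted]

print(df.to_string(index=False))
```

Renaming columns to their last path component only works when the leaf keys are unique across the whole document, as they are here. If two nested objects shared a key such as "id", it would be safer to keep the dotted names or rename the columns explicitly.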
