How to convert multiple layers of nested JSON to a SQL table
Problem description
With help from Stack Overflow, I have gotten this far. I need some more help converting JSON to a SQL table. Any help is highly appreciated.
{
    "Volumes": [{
        "AvailabilityZone": "us-east-1a",
        "Attachments": [{
            "AttachTime": "2013-12-18T22:35:00.000Z",
            "InstanceId": "i-1234567890abcdef0",
            "VolumeId": "vol-049df61146c4d7901",
            "State": "attached",
            "DeleteOnTermination": true,
            "Device": "/dev/sda1",
            "Tags": [{
                "Value": "DBJanitor-Private",
                "Key": "Name"
            }, {
                "Value": "DBJanitor",
                "Key": "Owner"
            }, {
                "Value": "Database",
                "Key": "Product"
            }, {
                "Value": "DB Janitor",
                "Key": "Portfolio"
            }, {
                "Value": "DB Service",
                "Key": "Service"
            }]
        }],
        "Ebs": {
            "Status": "attached",
            "DeleteOnTermination": true,
            "VolumeId": "vol-049df61146c4d7901",
            "AttachTime": "2016-09-14T19:49:11.000Z"
        },
        "VolumeType": "standard",
        "VolumeId": "vol-049df61146c4d7901"
    }]
}
With help from Stack Overflow, I was able to get as far as Tags, but I can't figure out how to handle the Ebs piece. I'm pretty new to coding, and any help is deeply appreciated.
In [1]: import json
   ...: import pandas as pd
   ...:
   ...: fn = r'D:\temp\.data\40454898.json'

In [2]: with open(fn) as f:
   ...:     data = json.load(f)
   ...:

# On pandas >= 1.0 the same function is exposed as pd.json_normalize.
In [14]: t = pd.io.json.json_normalize(data['Volumes'],
    ...:                               ['Attachments','Tags'],
    ...:                               [['Attachments', 'VolumeId'],
    ...:                                ['Attachments', 'InstanceId']])
    ...:

In [15]: t
Out[15]:
         Key              Value  Attachments.InstanceId   Attachments.VolumeId
0       Name  DBJanitor-Private     i-1234567890abcdef0  vol-049df61146c4d7901
1      Owner          DBJanitor     i-1234567890abcdef0  vol-049df61146c4d7901
2    Product           Database     i-1234567890abcdef0  vol-049df61146c4d7901
3  Portfolio         DB Janitor     i-1234567890abcdef0  vol-049df61146c4d7901
4    Service         DB Service     i-1234567890abcdef0  vol-049df61146c4d7901
Thanks
Recommended answer
json_normalize expects a list of dictionaries, and in the case of Ebs it's just a single dictionary, so we should preprocess the JSON data:
In [88]: with open(fn) as f:
    ...:     data = json.load(f)
    ...:

In [89]: for r in data['Volumes']:
    ...:     if 'Ebs' not in r:  # add an empty 'Ebs' list if the key is missing
    ...:         r['Ebs'] = []
    ...:     if not isinstance(r['Ebs'], list):  # wrap 'Ebs' in a list if it's not a list
    ...:         r['Ebs'] = [r['Ebs']]
    ...:

In [90]: data
Out[90]:
{'Volumes': [{'Attachments': [{'AttachTime': '2013-12-18T22:35:00.000Z',
                               'DeleteOnTermination': True,
                               'Device': '/dev/sda1',
                               'InstanceId': 'i-1234567890abcdef0',
                               'State': 'attached',
                               'Tags': [{'Key': 'Name', 'Value': 'DBJanitor-Private'},
                                        {'Key': 'Owner', 'Value': 'DBJanitor'},
                                        {'Key': 'Product', 'Value': 'Database'},
                                        {'Key': 'Portfolio', 'Value': 'DB Janitor'},
                                        {'Key': 'Service', 'Value': 'DB Service'}],
                               'VolumeId': 'vol-049df61146c4d7901'}],
              'AvailabilityZone': 'us-east-1a',
              'Ebs': [{'AttachTime': '2016-09-14T19:49:11.000Z',
                       'DeleteOnTermination': True,
                       'Status': 'attached',
                       'VolumeId': 'vol-049df61146c4d7901'}],
              'VolumeId': 'vol-049df61146c4d7901',
              'VolumeType': 'standard'}]}
NOTE: 'Ebs': {..} has been replaced with 'Ebs': [{..}]
# On pandas >= 1.0 the same function is exposed as pd.json_normalize.
In [91]: e = pd.io.json.json_normalize(data['Volumes'],
    ...:                               ['Ebs'],
    ...:                               ['VolumeId'],
    ...:                               meta_prefix='parent_')
    ...:

In [92]: e
Out[92]:
                 AttachTime  DeleteOnTermination    Status               VolumeId        parent_VolumeId
0  2016-09-14T19:49:11.000Z                 True  attached  vol-049df61146c4d7901  vol-049df61146c4d7901
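The answer stops at the DataFrame, while the question asks for SQL tables. Below is a minimal sketch of that last step, not part of the original answer. It assumes pandas >= 1.0 (so the function is called as pd.json_normalize), uses an in-memory SQLite database purely for illustration, and the table names volume_tags and volume_ebs are hypothetical. The sample data is a trimmed inline copy of the JSON from the question, so the snippet does not depend on the file path used above.

```python
import sqlite3

import pandas as pd

# Trimmed inline copy of the sample document from the question.
data = {
    "Volumes": [{
        "AvailabilityZone": "us-east-1a",
        "Attachments": [{
            "InstanceId": "i-1234567890abcdef0",
            "VolumeId": "vol-049df61146c4d7901",
            "Tags": [
                {"Value": "DBJanitor-Private", "Key": "Name"},
                {"Value": "DBJanitor", "Key": "Owner"},
            ],
        }],
        "Ebs": {
            "Status": "attached",
            "DeleteOnTermination": True,
            "VolumeId": "vol-049df61146c4d7901",
            "AttachTime": "2016-09-14T19:49:11.000Z",
        },
        "VolumeType": "standard",
        "VolumeId": "vol-049df61146c4d7901",
    }]
}

# Same preprocessing as in the answer: make 'Ebs' a list of dicts.
for r in data["Volumes"]:
    ebs_value = r.get("Ebs", [])
    r["Ebs"] = ebs_value if isinstance(ebs_value, list) else [ebs_value]

# One row per tag, carrying the attachment's ids along as metadata.
tags = pd.json_normalize(
    data["Volumes"],
    record_path=["Attachments", "Tags"],
    meta=[["Attachments", "VolumeId"], ["Attachments", "InstanceId"]],
)
# Dotted column names are awkward in SQL, so rename them (names are illustrative).
tags = tags.rename(columns={
    "Attachments.VolumeId": "VolumeId",
    "Attachments.InstanceId": "InstanceId",
})

# One row per Ebs record, keeping the parent volume id for later joins.
ebs = pd.json_normalize(
    data["Volumes"],
    record_path=["Ebs"],
    meta=["VolumeId"],
    meta_prefix="parent_",
)

# Write both frames to SQLite (in-memory here) and read them back.
conn = sqlite3.connect(":memory:")
tags.to_sql("volume_tags", conn, index=False, if_exists="replace")
ebs.to_sql("volume_ebs", conn, index=False, if_exists="replace")

tags_back = pd.read_sql_query("SELECT * FROM volume_tags", conn)
ebs_back = pd.read_sql_query("SELECT * FROM volume_ebs", conn)
conn.close()

print(tags_back)
print(ebs_back)
```

Keeping tags and Ebs in separate tables that share a volume id mirrors the one-to-many shape of the JSON; for another database, a SQLAlchemy engine could be passed to to_sql in place of the sqlite3 connection.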