How to convert multiple layers of nested JSON to a SQL table

Question

With help from Stack Overflow, I have gotten this far. I need some more help converting this JSON to a SQL table. Any help is highly appreciated.

{
    "Volumes": [{
        "AvailabilityZone": "us-east-1a",
        "Attachments": [{
            "AttachTime": "2013-12-18T22:35:00.000Z",
            "InstanceId": "i-1234567890abcdef0",
            "VolumeId": "vol-049df61146c4d7901",
            "State": "attached",
            "DeleteOnTermination": true,
            "Device": "/dev/sda1",

            "Tags": [{
                "Value": "DBJanitor-Private",
                "Key": "Name"
            }, {
                "Value": "DBJanitor",
                "Key": "Owner"
            }, {
                "Value": "Database",
                "Key": "Product"
            }, {
                "Value": "DB Janitor",
                "Key": "Portfolio"
            }, {
                "Value": "DB Service",
                "Key": "Service"
            }]
        }],
        "Ebs": {
            "Status": "attached",
            "DeleteOnTermination": true,
            "VolumeId": "vol-049df61146c4d7901",
            "AttachTime": "2016-09-14T19:49:11.000Z"
        },
        "VolumeType": "standard",
        "VolumeId": "vol-049df61146c4d7901"
    }]
}

With help from Stack Overflow, I was able to get as far as the Tags. I can't figure out how to handle the Ebs piece. I'm pretty new to coding, and any help is deeply appreciated.

In [1]: import json
   ...: import pandas as pd
   ...: fn = r'D:\temp\.data\40454898.json'

In [2]: with open(fn) as f:
   ...:     data = json.load(f)
   ...:

In [14]: t = pd.io.json.json_normalize(data['Volumes'],
    ...:                               ['Attachments','Tags'],
    ...:                               [['Attachments', 'VolumeId'],
    ...:                                ['Attachments', 'InstanceId']])
    ...:

In [15]: t
Out[15]:
         Key              Value Attachments.InstanceId   Attachments.VolumeId
0       Name  DBJanitor-Private    i-1234567890abcdef0  vol-049df61146c4d7901
1      Owner          DBJanitor    i-1234567890abcdef0  vol-049df61146c4d7901
2    Product           Database    i-1234567890abcdef0  vol-049df61146c4d7901
3  Portfolio         DB Janitor    i-1234567890abcdef0  vol-049df61146c4d7901
4    Service         DB Service    i-1234567890abcdef0  vol-049df61146c4d7901

Thanks

Recommended answer

json_normalize expects each record path to hold a list of dictionaries. Ebs is just a single dictionary, so we should preprocess the JSON data:

In [88]: with open(fn) as f:
    ...:     data = json.load(f)
    ...:

In [89]: for r in data['Volumes']:
    ...:     if 'Ebs' not in r:  # add an empty 'Ebs' list if the record has none
    ...:         r['Ebs'] = []
    ...:     if not isinstance(r['Ebs'], list):  # wrap a single 'Ebs' dict in a list
    ...:         r['Ebs'] = [r['Ebs']]
    ...:

In [90]: data
Out[90]:
{'Volumes': [{'Attachments': [{'AttachTime': '2013-12-18T22:35:00.000Z',
     'DeleteOnTermination': True,
     'Device': '/dev/sda1',
     'InstanceId': 'i-1234567890abcdef0',
     'State': 'attached',
     'Tags': [{'Key': 'Name', 'Value': 'DBJanitor-Private'},
      {'Key': 'Owner', 'Value': 'DBJanitor'},
      {'Key': 'Product', 'Value': 'Database'},
      {'Key': 'Portfolio', 'Value': 'DB Janitor'},
      {'Key': 'Service', 'Value': 'DB Service'}],
     'VolumeId': 'vol-049df61146c4d7901'}],
   'AvailabilityZone': 'us-east-1a',
   'Ebs': [{'AttachTime': '2016-09-14T19:49:11.000Z',
     'DeleteOnTermination': True,
     'Status': 'attached',
     'VolumeId': 'vol-049df61146c4d7901'}],
   'VolumeId': 'vol-049df61146c4d7901',
   'VolumeType': 'standard'}]}

Note: 'Ebs': {..} has been replaced with 'Ebs': [{..}]

In [91]: e = pd.io.json.json_normalize(data['Volumes'],
    ...:                               ['Ebs'],
    ...:                               ['VolumeId'],
    ...:                               meta_prefix='parent_')
    ...:


In [92]: e
Out[92]:
                 AttachTime DeleteOnTermination    Status               VolumeId        parent_VolumeId
0  2016-09-14T19:49:11.000Z                True  attached  vol-049df61146c4d7901  vol-049df61146c4d7901
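
The thread stops at the DataFrames and does not show the step into a SQL table. The following is a minimal sketch of one way the whole flow could look, not the original poster's setup. It assumes a recent pandas release (where the function is exposed as pd.json_normalize), an in-memory SQLite database, and the hypothetical table names volume_tags and volume_ebs. The input is a trimmed copy of the sample record from the question.

```python
import sqlite3

import pandas as pd

# Trimmed copy of the sample record from the question.
data = {
    "Volumes": [{
        "AvailabilityZone": "us-east-1a",
        "Attachments": [{
            "InstanceId": "i-1234567890abcdef0",
            "VolumeId": "vol-049df61146c4d7901",
            "Tags": [
                {"Value": "DBJanitor-Private", "Key": "Name"},
                {"Value": "DBJanitor", "Key": "Owner"},
            ],
        }],
        "Ebs": {
            "Status": "attached",
            "DeleteOnTermination": True,
            "VolumeId": "vol-049df61146c4d7901",
            "AttachTime": "2016-09-14T19:49:11.000Z",
        },
        "VolumeType": "standard",
        "VolumeId": "vol-049df61146c4d7901",
    }]
}

# Same preprocessing idea as in the answer: make 'Ebs' a list of dicts.
for record in data["Volumes"]:
    ebs_value = record.get("Ebs", [])
    record["Ebs"] = ebs_value if isinstance(ebs_value, list) else [ebs_value]

# One row per tag, carrying the parent attachment's identifiers.
tags = pd.json_normalize(
    data["Volumes"],
    record_path=["Attachments", "Tags"],
    meta=[["Attachments", "VolumeId"], ["Attachments", "InstanceId"]],
)
# One row per Ebs entry, carrying the parent volume's identifier.
ebs = pd.json_normalize(
    data["Volumes"],
    record_path=["Ebs"],
    meta=["VolumeId"],
    meta_prefix="parent_",
)

# Dots in column names are awkward to query in SQL, so swap them out.
tags.columns = [c.replace(".", "_") for c in tags.columns]

# Assumption: SQLite in memory; the table names are hypothetical.
conn = sqlite3.connect(":memory:")
try:
    tags.to_sql("volume_tags", conn, index=False)
    ebs.to_sql("volume_ebs", conn, index=False)
    tag_rows = conn.execute(
        'SELECT "Key", "Value" FROM volume_tags ORDER BY "Key"'
    ).fetchall()
    ebs_rows = conn.execute(
        'SELECT "Status", "parent_VolumeId" FROM volume_ebs'
    ).fetchall()
finally:
    conn.close()

print(tag_rows)
print(ebs_rows)
```

Keeping tags and Ebs entries in separate tables that share the volume identifier avoids repeating the Ebs columns on every tag row. For a database other than SQLite, to_sql would normally be given a SQLAlchemy connection instead.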

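As an alternative to the answer's approach, when one row per volume is enough, the wrapping step may not be needed. Called without a record path, pd.json_normalize flattens nested dictionaries into prefixed columns. The sketch below assumes a recent pandas release and uses a trimmed copy of the sample record without the Attachments list, which would otherwise stay in a single column as a raw list.

```python
import pandas as pd

# Trimmed, illustrative copy of the volume record from the question
# (the Attachments list is left out on purpose).
volumes = [{
    "AvailabilityZone": "us-east-1a",
    "Ebs": {
        "Status": "attached",
        "DeleteOnTermination": True,
        "VolumeId": "vol-049df61146c4d7901",
        "AttachTime": "2016-09-14T19:49:11.000Z",
    },
    "VolumeType": "standard",
    "VolumeId": "vol-049df61146c4d7901",
}]

# With no record_path, nested dicts become columns such as 'Ebs_Status'.
flat = pd.json_normalize(volumes, sep="_")
print(sorted(flat.columns))
```

This gives a single wide table, which is simpler to load but only works while each volume has exactly one Ebs dictionary.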