How to map a Python Dict to a Big Query Schema

Question

I have a dict with some nested values, like this:

my_dict = {
    "id": 1,
    "name": "test",
    "system": "x",
    "date": "2015-07-27",
    "profile": {
        "location": "My City",
        "preferences": [
            {
                "code": "5",
                "description": "MyPreference",
            }
        ]
    },
    "logins": [
        "2015-07-27 07:01:03",
        "2015-07-27 08:27:41"
    ]
}

I also have a Big Query table schema, as follows:

schema = {
    "fields": [
        {'name':'id', 'type':'INTEGER', 'mode':'REQUIRED'},
        {'name':'name', 'type':'STRING', 'mode':'REQUIRED'},
        {'name':'date', 'type':'TIMESTAMP', 'mode':'REQUIRED'},
        {'name':'profile', 'type':'RECORD', 'fields':[
            {'name':'location', 'type':'STRING', 'mode':'NULLABLE'},
            {'name':'preferences', 'type':'RECORD', 'mode':'REPEATED', 'fields':[
                {'name':'code', 'type':'STRING', 'mode':'NULLABLE'},
                {'name':'description', 'type':'STRING', 'mode':'NULLABLE'}
            ]},
        ]},
        {'name':'logins', 'type':'TIMESTAMP', 'mode':'REPEATED'}
    ]
}

I'd like to traverse the original my_dict and build a new dict based on the structure of the schema. In other words, iterate over the schema and pick up only the matching values from the original my_dict.

The goal is a new dict like this (note that the field "system", which is not present in the schema, is not copied):

new_dict = {
    "id": 1,
    "name": "test",
    "date": "2015-07-27",
    "profile": {
        "location": "My City",
        "preferences": [
            {
                "code": "5",
                "description": "MyPreference",
            }
        ]
    },
    "logins": [
        "2015-07-27 07:01:03",
        "2015-07-27 08:27:41"
    ]
}

Without the nested fields this would be easy: iterate over dict.items() and copy the values. But how can I build the new dict while accessing the original dict recursively?
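For reference, the flat case mentioned above might look like the sketch below. The helper name `copy_flat_fields` and the small `flat_schema` / `flat_row` values are made up for illustration. It filters top-level keys only, so a nested value would be copied wholesale without filtering its inner keys, which is exactly why the nested case needs recursion.

```python
# Hypothetical helper: handles top-level fields only, no recursion.
def copy_flat_fields(source_dict, schema):
    """Keep only the top-level keys that the schema names."""
    wanted = {field['name'] for field in schema['fields']}
    return {key: value for key, value in source_dict.items() if key in wanted}


# Illustrative data, reduced from the question to flat fields.
flat_schema = {
    "fields": [
        {'name': 'id', 'type': 'INTEGER', 'mode': 'REQUIRED'},
        {'name': 'name', 'type': 'STRING', 'mode': 'REQUIRED'},
    ]
}
flat_row = {"id": 1, "name": "test", "system": "x"}

flat_result = copy_flat_fields(flat_row, flat_schema)
print(flat_result)  # "system" is not in the schema, so it is left out
```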

Recommended answer

I've built a recursive function to do this. I'm not sure it's the best way, but it works:

def map_dict_to_bq_schema(source_dict, schema, dest_dict):
    # Iterate over every field at the current schema level
    for field in schema['fields']:
        # Only copy values that exist in the source dict
        if field['name'] in source_dict:
            # Nested field (RECORD)
            if field['type'].lower() == 'record' and 'fields' in field:
                # Repeated record: map each item in the list
                if 'mode' in field and field['mode'].lower() == 'repeated':
                    dest_dict[field['name']] = []
                    for item in source_dict[field['name']]:
                        new_item = {}
                        map_dict_to_bq_schema(item, field, new_item)
                        dest_dict[field['name']].append(new_item)
                # Single record: recurse into the nested dict
                else:
                    dest_dict[field['name']] = {}
                    map_dict_to_bq_schema(source_dict[field['name']], field, dest_dict[field['name']])
            # Repeated plain values: copy the list
            elif 'mode' in field and field['mode'].lower() == 'repeated':
                dest_dict[field['name']] = list(source_dict[field['name']])
            # Plain field: copy the value as-is
            else:
                dest_dict[field['name']] = source_dict[field['name']]

new_dict = {}
map_dict_to_bq_schema(my_dict, schema, new_dict)
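If you would rather not pass in a destination dict, the same traversal can be written to return its result. This is only a sketch of that variant, not part of the answer above: the name `project_onto_schema` and the reduced `example_schema` / `example_row` values are invented for illustration. Like the answer, it assumes the source values have the shape the schema describes (dicts for records, lists for repeated fields) and silently skips fields missing from the source.

```python
def project_onto_schema(source_dict, schema):
    """Return a new dict holding only the values the schema describes."""
    result = {}
    for field in schema['fields']:
        name = field['name']
        if name not in source_dict:
            continue  # skip values missing from the source
        value = source_dict[name]
        is_record = field['type'].lower() == 'record' and 'fields' in field
        is_repeated = field.get('mode', '').lower() == 'repeated'
        if is_record and is_repeated:
            # List of nested records: project each item separately
            result[name] = [project_onto_schema(item, field) for item in value]
        elif is_record:
            # Single nested record: the field dict itself carries 'fields'
            result[name] = project_onto_schema(value, field)
        elif is_repeated:
            # List of plain values: shallow copy
            result[name] = list(value)
        else:
            result[name] = value
    return result


# Illustrative data, cut down from the question, with extra keys to drop.
example_schema = {
    "fields": [
        {'name': 'id', 'type': 'INTEGER', 'mode': 'REQUIRED'},
        {'name': 'profile', 'type': 'RECORD', 'fields': [
            {'name': 'location', 'type': 'STRING', 'mode': 'NULLABLE'},
            {'name': 'preferences', 'type': 'RECORD', 'mode': 'REPEATED', 'fields': [
                {'name': 'code', 'type': 'STRING', 'mode': 'NULLABLE'},
            ]},
        ]},
        {'name': 'logins', 'type': 'TIMESTAMP', 'mode': 'REPEATED'},
    ]
}
example_row = {
    "id": 1,
    "system": "x",
    "profile": {
        "location": "My City",
        "nickname": "not in schema",
        "preferences": [{"code": "5", "extra": "not in schema"}],
    },
    "logins": ["2015-07-27 07:01:03"],
}

projected = project_onto_schema(example_row, example_schema)
print(projected)
```

Returning the dict keeps the recursion free of side effects, so each nested call can be used directly inside a list comprehension.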
