Glue AWS creating a data catalog table on boto3 python


Question

I have been trying to create a table in our data catalog using the Python API, following the documentation posted here and here for the API. I understand how that works. However, I need to understand how to declare a struct field when I create the table: when I look at the Storage Definition for the table here, there is no explanation of how I should define this type of column. In addition, I don't see where the classification property for the table is covered. Maybe under properties? I used the boto3 documentation for this sample.

Code:

import boto3

client = boto3.client(service_name='glue', region_name='us-east-1')

response = client.create_table(
    DatabaseName='dbname',
    TableInput={
        'Name': 'tbname',
        'Description': 'tb description',
        'Owner': "I'm",
        'StorageDescriptor': {
            'Columns': [
                # A bare 'struct' is the type in question: how should its fields be declared?
                {'Name': 'agents', 'Type': 'struct', 'Comment': 'from deserializer'},
                {'Name': 'conference_sid', 'Type': 'string', 'Comment': 'from deserializer'},
                {'Name': 'call_sid', 'Type': 'string', 'Comment': 'from deserializer'},
            ],
            'Location': 's3://bucket/location/',
            'InputFormat': 'org.apache.hadoop.mapred.TextInputFormat',
            'OutputFormat': 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat',
            'Compressed': False,
            'SerdeInfo': {'SerializationLibrary': 'org.openx.data.jsonserde.JsonSerDe'},
        },
        'TableType': 'EXTERNAL_TABLE',
    },
)

Recommended answer

I found this post because I ran into the same issue, and eventually found the solution: write the full nested definition as the column's type string, for example:

array<struct<id:string,timestamp:bigint,message:string>>
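Applied to the question's table, the type string goes directly into the column's 'Type' field. The following is a minimal sketch, not a confirmed working configuration: it assumes the agents column holds an array of structs with the fields from the example above (the asker's real field names are not given), and it assumes the classification property is set through the table's Parameters map, which is where it appears on crawler-created tables. The function names build_table_input and create_table are hypothetical.

```python
def build_table_input():
    """Build the TableInput dict for Glue's create_table (no AWS call here)."""
    # Assumption: field names are illustrative, copied from the example type string.
    agents_type = "array<struct<id:string,timestamp:bigint,message:string>>"
    return {
        "Name": "tbname",
        "TableType": "EXTERNAL_TABLE",
        # Assumption: classification lives in the table-level Parameters map.
        "Parameters": {"classification": "json"},
        "StorageDescriptor": {
            "Columns": [
                {"Name": "agents", "Type": agents_type},
                {"Name": "conference_sid", "Type": "string"},
                {"Name": "call_sid", "Type": "string"},
            ],
            "Location": "s3://bucket/location/",
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.openx.data.jsonserde.JsonSerDe"
            },
        },
    }


def create_table(database_name="dbname", region_name="us-east-1"):
    """Send the definition to Glue; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so build_table_input works without boto3

    client = boto3.client("glue", region_name=region_name)
    return client.create_table(
        DatabaseName=database_name, TableInput=build_table_input()
    )
```

Keeping the dictionary in its own function makes it possible to inspect the definition before anything is sent to AWS.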

I found this hint in the AWS Console by clicking on the data type of a column in an existing table created via a Crawler. It hints:

An ARRAY of scalar type as a top-level column.
ARRAY <STRING>

An ARRAY with elements of complex type (STRUCT).
ARRAY < STRUCT <
  place: STRING,
  start_year: INT
>>

An ARRAY as a field (CHILDREN) within a STRUCT. (The STRUCT is inside another ARRAY, because it is rare for a STRUCT to be a top-level column.)
ARRAY < STRUCT <
  spouse: STRING,
  children: ARRAY <STRING>
>>

A STRUCT as the element type of an ARRAY.
ARRAY < STRUCT <
  street: STRING,
  city: STRING,
  country: STRING
>>
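Writing these nested type strings by hand gets error-prone as the nesting grows. As a rough sketch, a small helper can render them from plain Python values. The helper hive_type is hypothetical and not part of boto3; it assumes a string means a scalar type, a one-element list means an array, and a dict means a struct.

```python
def hive_type(spec):
    """Render a nested Python description as a type string.

    str  -> scalar type name, e.g. "string"
    list -> array of its single element type, e.g. ["string"]
    dict -> struct with the given field names and types
    """
    if isinstance(spec, str):
        return spec
    if isinstance(spec, list):
        if len(spec) != 1:
            raise ValueError("an array spec must hold exactly one element type")
        return f"array<{hive_type(spec[0])}>"
    if isinstance(spec, dict):
        # Dicts keep insertion order, so fields appear as written.
        fields = ",".join(f"{name}:{hive_type(t)}" for name, t in spec.items())
        return f"struct<{fields}>"
    raise TypeError(f"unsupported spec: {spec!r}")


# The third console hint: an array as a field within a struct.
family = [{"spouse": "string", "children": ["string"]}]
print(hive_type(family))
# prints: array<struct<spouse:string,children:array<string>>>
```

The resulting string can be used as the 'Type' value of a column in the StorageDescriptor.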
