AWS Glue - how to change column names in a Glue Catalog table using boto3?


Question

I am using AWS Glue Crawlers to read zipped files from S3 (without a header row) and populate the Glue Catalog.

By default the columns are named col_0, col_1, ...

How can I change those column names using, for example, the Python boto3 module, interacting with the AWS Glue Catalog directly?

Is there an example snippet for doing this?

Thanks.

Recommended answer

You can try pulling the table definition and updating the column names. Here is an example of what I would do.

First we'll retrieve the table:

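    # self.glue_client is assumed to be a boto3 Glue client, e.g. boto3.client('glue')
    database_name = 'ENTER DATABASE NAME'
    table_name = 'ENTER TABLE NAME'
    # get_table expects the table name in the Name parameter
    response = self.glue_client.get_table(DatabaseName=database_name, Name=table_name)
    old_table = response['Table']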

Next we'll build the table definition to send back. The dictionary returned by get_table contains read-only keys that update_table rejects, so the new table may only carry the fields that update_table accepts. We copy across just those keys:

    field_names = [
      "Name",
      "Description",
      "Owner",
      "LastAccessTime",
      "LastAnalyzedTime",
      "Retention",
      "StorageDescriptor",
      "PartitionKeys",
      "ViewOriginalText",
      "ViewExpandedText",
      "TableType",
      "Parameters"
    ]
    new_table = dict()
    for key in field_names:
        if key in old_table:
            new_table[key] = old_table[key]
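The filtering step above can be wrapped in a small helper. This is only a sketch: the name `to_table_input` and the sample dictionary are made up for illustration, and the key list is copied from the one above rather than checked against the current Glue API. The sample includes a few keys (`DatabaseName`, `CreateTime`) of the kind get_table returns but update_table does not accept.

```python
# Keys copied from the field_names list above; treat it as the answer's list,
# not as an authoritative statement of what the Glue API accepts today.
TABLE_INPUT_KEYS = (
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters",
)


def to_table_input(table):
    """Return a shallow copy of a get_table 'Table' dict limited to TABLE_INPUT_KEYS."""
    return {key: table[key] for key in TABLE_INPUT_KEYS if key in table}


# Hypothetical sample resembling response['Table'] (values are placeholders).
sample_table = {
    "Name": "my_table",
    "DatabaseName": "my_db",
    "CreateTime": "2020-01-01",
    "StorageDescriptor": {"Columns": [{"Name": "col_0", "Type": "string"}]},
}

table_input = to_table_input(sample_table)
print(sorted(table_input))  # ['Name', 'StorageDescriptor']
```

Note that the copy is shallow, so the nested StorageDescriptor is still shared with the original dictionary.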

Now that we have the new table definition, we can change the column names and send it back with update_table. Here is an example that changes just 'col_0' to 'new_col':

    for col in new_table['StorageDescriptor']['Columns']:
        if col['Name'] == 'col_0':
            col['Name'] = 'new_col'
    response = self.glue_client.update_table(DatabaseName=database_name, TableInput=new_table)
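If several columns need renaming at once, the same loop can be driven by a mapping from old names to new names. The following is a rough sketch under a few assumptions: `rename_columns`, `apply_renames` and the sample data are hypothetical names for illustration, and `glue_client` is expected to be a boto3 Glue client (for example `boto3.client('glue')`) passed in by the caller. Only `update_table` with `DatabaseName` and `TableInput`, as used above, is called on it.

```python
import copy


def rename_columns(table_input, renames):
    """Return a deep copy of table_input with columns renamed per the renames mapping."""
    updated = copy.deepcopy(table_input)
    for col in updated["StorageDescriptor"]["Columns"]:
        # Columns not present in the mapping keep their current name.
        col["Name"] = renames.get(col["Name"], col["Name"])
    return updated


def apply_renames(glue_client, database_name, table_input, renames):
    """Rename the columns, then send the definition to Glue via update_table."""
    updated = rename_columns(table_input, renames)
    return glue_client.update_table(DatabaseName=database_name, TableInput=updated)


# Hypothetical table definition, already filtered to the accepted keys.
example_input = {
    "Name": "my_table",
    "StorageDescriptor": {
        "Columns": [
            {"Name": "col_0", "Type": "string"},
            {"Name": "col_1", "Type": "bigint"},
        ]
    },
}

renamed = rename_columns(example_input, {"col_0": "customer_id", "col_1": "amount"})
print([c["Name"] for c in renamed["StorageDescriptor"]["Columns"]])
# ['customer_id', 'amount']
```

Working on a deep copy leaves the retrieved definition untouched, which makes it easier to compare the old and new names before anything is sent to Glue.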

Hope this helps!
