AWS Glue - Truncate destination postgres table prior to insert


Question

I am trying to truncate a Postgres destination table prior to insert and, more generally, to fire external functions using the connections already created in Glue.

Has anyone been able to do this?

Recommended answer

I've tried the DROP/TRUNCATE scenario. I could not get it to work with the connections already created in Glue, but it does work with a pure-Python PostgreSQL driver, pg8000.


  1. Download the pg8000 tarball from PyPI.
  2. Create an empty __init__.py in the root folder.
  3. Zip up the contents and upload the zip to S3.
  4. Reference the zip file in the Python lib path of the job.
  5. Set the DB connection details as job parameters (make sure to prepend all key names with --). Tick the "Server-side encryption" box.
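Step 3 can be scripted if you prefer not to zip by hand. The following is a minimal sketch using only the standard library; the helper name zip_directory is hypothetical, and it assumes the archive should keep the folder name as its top-level entry (for example pg8000/__init__.py), which is the layout Python needs to import a package from a zip file. Adjust the folder you pass in if your extracted tarball is laid out differently.

```python
import os
import zipfile


def zip_directory(src_dir, zip_path):
    """Zip src_dir so that its folder name is the top-level entry in the archive.

    Hypothetical helper for illustration; zip_path should live outside src_dir.
    """
    src_dir = os.path.abspath(src_dir)
    parent = os.path.dirname(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to the parent, e.g. "pg8000/__init__.py"
                zf.write(full_path, os.path.relpath(full_path, parent))
    return zip_path
```

The resulting file is what you would upload to S3 and reference in the job's Python lib path.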

Then you can simply create a connection and execute SQL.

import sys
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job

import pg8000

args = getResolvedOptions(sys.argv, [
    'JOB_NAME',
    'PW',
    'HOST',
    'USER',
    'DB'
])
# ...
# Create Spark & Glue context
sc = SparkContext()
glueContext = GlueContext(sc)

job = Job(glueContext)
job.init(args['JOB_NAME'], args)

# ...
config_port = 5432
conn = pg8000.connect(
    database=args['DB'], 
    user=args['USER'], 
    password=args['PW'],
    host=args['HOST'],
    port=config_port
)
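# schema and table are placeholders; set them to the target table's schema and name.
# They are interpolated into the SQL as-is, so they must come from a trusted source.
schema = "public"
table = "my_table"
query = "TRUNCATE TABLE {0};".format(".".join([schema, table]))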
cur = conn.cursor()
cur.execute(query)
conn.commit()
cur.close()
conn.close()
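
If you truncate more than one table, the cursor handling above can be wrapped in a small helper. This is only a sketch, not part of the original answer: the names quote_ident and truncate_table are hypothetical, and it assumes conn is an open DB-API connection (such as the one returned by pg8000.connect) that exposes cursor(), commit() and rollback(). It also double-quotes the identifiers, which is a change from the plain string formatting used above.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def truncate_table(conn, schema, table):
    """Run TRUNCATE on schema.table using an open DB-API connection."""
    query = "TRUNCATE TABLE {0}.{1};".format(quote_ident(schema), quote_ident(table))
    cur = conn.cursor()
    try:
        cur.execute(query)
        conn.commit()
    except Exception:
        # Undo the failed statement so the connection stays usable, then re-raise
        conn.rollback()
        raise
    finally:
        # Always release the cursor, whether or not the statement succeeded
        cur.close()
    return query
```

Called as truncate_table(conn, schema, table) before the insert step, this leaves closing the connection to the caller. Note that quoted identifiers are case-sensitive in PostgreSQL, so pass the names exactly as they are stored.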
