Spark 2.0.0 truncate from Redshift table using jdbc
Problem description
Hello, I am using Spark SQL (2.0.0) with Redshift, where I want to truncate my tables. I am using the spark-redshift package, and I want to know how I can truncate my table. Can anyone please share an example of this?
Recommended answer
I was unable to accomplish this using Spark and the code in the spark-redshift repo that you have listed above.
I was, however, able to use AWS Lambda with psycopg2 to truncate a Redshift table. Then I used boto3 to kick off my Spark job via AWS Glue.
The important line in the code below is cur.execute("truncate table yourschema.yourtable").
from __future__ import print_function
import sys
import psycopg2
import boto3

def lambda_handler(event, context):
    db_database = "your_redshift_db_name"
    db_user = "your_user_name"
    db_password = "your_password"
    db_port = "5439"
    db_host = "your_redshift.hostname.us-west-2.redshift.amazonaws.com"

    try:
        print("attempting to connect...")
        conn = psycopg2.connect(dbname=db_database, user=db_user,
                                password=db_password, host=db_host, port=db_port)
        print("connected...")
        conn.autocommit = True
        cur = conn.cursor()

        # Count the rows before truncating
        count_sql = "select count(pivotid) from yourschema.yourtable"
        cur.execute(count_sql)
        results = cur.fetchone()
        print("countBefore: ", results[0])
        countOfPivots = results[0]

        if countOfPivots > 0:
            # The important line: truncate the Redshift table
            cur.execute("truncate table yourschema.yourtable")
            print("truncated yourschema.yourtable")

            # Re-count to confirm the table is empty
            cur.execute(count_sql)
            results = cur.fetchone()
            print("countAfter: ", results[0])

        cur.close()
        conn.close()

        # Kick off the Spark job via an on-demand AWS Glue trigger.
        # start_trigger returns a dict, so the trigger name is accessed
        # with a key lookup rather than an attribute.
        glueClient = boto3.client("glue")
        startTriggerResponse = glueClient.start_trigger(Name="your-awsglue-ondemand-trigger")
        print("startedTrigger:", startTriggerResponse["Name"])

        return results
    except Exception as e:
        print(e)
        raise e
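Since the schema and table names above are interpolated directly into the SQL string, a small helper that validates the identifiers before building the statement can guard against typos or injection if those names ever come from the Lambda event payload. This is a hypothetical addition, not part of the original answer; the `build_truncate_sql` helper and its validation rule are assumptions, and the pattern below accepts only a simplified subset of the identifiers Redshift actually allows:

```python
import re

# Plain identifiers only: letters, digits, and underscores, starting with a
# letter or underscore (a deliberately strict subset of Redshift's rules).
_IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def build_truncate_sql(schema, table):
    """Return a TRUNCATE statement for schema.table, rejecting unsafe names."""
    for name in (schema, table):
        if not _IDENTIFIER_RE.match(name):
            raise ValueError("unsafe identifier: %r" % name)
    return "truncate table %s.%s" % (schema, table)
```

The Lambda above would then call cur.execute(build_truncate_sql("yourschema", "yourtable")) instead of hard-coding the statement.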