如何编写ETL作业以将mysql数据库表传输到另一个mysql rds数据库 [英] How To write the ETL job to transfer the mysql database table to another mysql rds database

查看:90
本文介绍了如何编写ETL作业以将mysql数据库表传输到另一个mysql rds数据库的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是AWS的新手.我想使用AWS Glue编写ETL脚本,以将数据从一个mysql数据库传输到另一个RDS mysql数据库.

I am new to AWS. I want to write the ETL script using AWS Glue to transfer the data from one mysql database to another RDS mysql database .

请向我建议如何使用AWS胶水完成这项工作

Please suggest me to how to do this job using AWS glue

谢谢

推荐答案

您可以将pymysql或mysql.connector用作添加到胶粘作业中的单独zip文件.我们已经将pymysql用于在AWS Glue/Aurora RDS中运行的所有生产作业

You can use pymysql or mysql.connector as a seperate zip file added to the glue job. We have used pymysql for all our production jobs running in AWS Glue/Aurora RDS

使用此连接器连接到两个RDS Mysql实例.从RDS源db1读取数据到数据帧,执行转换,最后将转换后的数据写入RDS目标DB表.

Use this connectors to connect to both the RDS Mysql instances. Read data from RDS Source db1 into a dataframe, perform the transformations, and finally write the transformed data to the RDS Target DB tables.

这是用于连接到mysql连接器的示例脚本,该脚本将数据从S3加载到登台表中,然后再加载到目标数据库中.

Here is the sample script for connecting to mysql connector, loading data from S3 into a staging table before loading to target database.

conn1 = mysql.connector.connect(host=url1, user=uname1, password=pwd1, database=sourcedbase)
cur1 = conn1.cursor()
cur1, conn1 = connect()

conn2 = mysql.connector.connect(host=url2, user=uname2, password=pwd2, database=targetdbase)
cur2 = conn2.cursor()
cur2, conn2 = connect()

createStgTable1 = "DROP TABLE IF EXISTS mydb.STG_TABLE;"
createStgTable2 = "CREATE TABLE mydb.STG_TABLE(COL1 VARCHAR(50) NOT NULL, COL2 VARCHAR(50), COL3 VARCHAR(50), COL4 CHAR(1) NOT NULL);"
loadQry = "LOAD DATA FROM S3 PREFIX 's3://<bucketname>/folder' REPLACE INTO TABLE mydb.STG_TABLE FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' IGNORE 1 LINES (@var1, @var2, @var3, @var4) SET col1= @var1, col2= @var2, col3= @var3, col4=@var4;"
cur.execute(createStgTable1)
cur.execute(createStgTable2)
cur.execute(loadQry)
conn.commit()

加载数据....." 来自Aurora,用于将数据从S3直接加载到mysql表中.

"Load data....." is from Aurora, to load data from S3 directly into a mysql table.

将查询插入RDS实例:

conn = mysql.connector.connect(host=url, user=uname, password=pwd, database=dbase)
cur = conn.cursor()
insertQry = "INSERT INTO emp (id, emp_name, dept, designation, address1, city, state, active_start_date, is_active) SELECT (SELECT coalesce(MAX(ID),0) + 1 FROM atlas.emp) id, tmp.emp_name, tmp.dept, tmp.designation, tmp.address1, tmp.city, tmp.state, tmp.active_start_date, tmp.is_active from EMP_STG tmp ON DUPLICATE KEY UPDATE dept=tmp.dept, designation=tmp.designation, address1=tmp.address1, city=tmp.city, state=tmp.state, active_start_date=tmp.active_start_date, is_active =tmp.is_active ;"

n = cur.execute(insertQry)
print (" CURSOR status :", n)
conn.close()

这篇关于如何编写ETL作业以将mysql数据库表传输到另一个mysql rds数据库的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆