Extract Embedded AWS Glue Connection Credentials Using Scala

Question

I have an AWS Glue job that reads directly from Redshift, and to do that, one has to provide connection credentials. I have created an embedded Glue connection and can extract the credentials with the following PySpark code. Is there a way to do this in Scala?

import boto3

glue = boto3.client('glue', region_name='us-east-1')

# HidePassword=False returns the password in the connection properties.
response = glue.get_connection(
    Name='name-of-embedded-connection',
    HidePassword=False
)

table = spark.read.format(
    'com.databricks.spark.redshift'
).option(
    'url',
    'jdbc:redshift://prod.us-east-1.redshift.amazonaws.com:5439/db'
).option(
    'user',
    response['Connection']['ConnectionProperties']['USERNAME']
).option(
    'password',
    response['Connection']['ConnectionProperties']['PASSWORD']
).option(
    'dbtable',
    'db.table'
).option(
    'tempdir',
    's3://config/glue/temp/redshift/'
).option(
    'forward_spark_s3_credentials', 'true'
).load()

Recommended answer

AWS does not provide a Scala equivalent for issuing this API call. However, you can use Java SDK code inside Scala, as mentioned in this answer.
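
As a rough illustration of the Java SDK route, the sketch below calls getConnection from Scala. It assumes the AWS SDK for Java v1 Glue module (aws-java-sdk-glue) is on the job's classpath and that the job's IAM role is allowed to call glue:GetConnection. The object name GlueConnectionCredentials and its two methods are hypothetical names chosen for this example, not part of any AWS library.

```scala
import com.amazonaws.services.glue.AWSGlueClientBuilder
import com.amazonaws.services.glue.model.GetConnectionRequest

// Hypothetical helper object, named for this example only.
object GlueConnectionCredentials {

  // Pull USERNAME and PASSWORD out of a connection-properties map.
  // Returns None if either key is missing.
  def credentialsFrom(props: java.util.Map[String, String]): Option[(String, String)] =
    for {
      user <- Option(props.get("USERNAME"))
      pwd  <- Option(props.get("PASSWORD"))
    } yield (user, pwd)

  // Fetch the connection through the Java SDK, mirroring boto3's get_connection.
  def fetch(connectionName: String, region: String): Option[(String, String)] = {
    val glue = AWSGlueClientBuilder.standard().withRegion(region).build()
    val request = new GetConnectionRequest()
      .withName(connectionName)
      .withHidePassword(false) // same role as HidePassword=False in boto3
    val props = glue.getConnection(request).getConnection.getConnectionProperties
    credentialsFrom(props)
  }
}
```

With the names from the question, GlueConnectionCredentials.fetch("name-of-embedded-connection", "us-east-1") would yield the user name and password as a pair when both properties are present on the connection.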

The Java SDK call for this is getConnection. If you don't want to call the SDK from Scala, you can follow the approach below:

  1. Create an AWS Glue Python shell job and retrieve the connection information.

  2. Once you have the values, start the Scala Glue job from inside the Python shell job, passing the values as job arguments, as shown below:

import boto3

glue = boto3.client('glue', region_name='us-east-1')

# Retrieve the connection with the password included.
connection = glue.get_connection(
    Name='name-of-embedded-connection',
    HidePassword=False
)
props = connection['Connection']['ConnectionProperties']

# Start the Scala job, passing the credentials as job arguments.
glue.start_job_run(
    JobName='my_scala_Job',
    Arguments={
        '--username': props['USERNAME'],
        '--password': props['PASSWORD'],
    }
)

  3. Then access these arguments inside the Scala job using getResolvedOptions, as shown below:

import com.amazonaws.services.glue.util.GlueArgParser

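// sysArgs is the Array[String] passed to the job's main method.
// Argument names are given without the leading "--".
val args = GlueArgParser.getResolvedOptions(
  sysArgs, Array(
    "username",
    "password")
)
val user = args("username")
val pwd  = args("password")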
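
To close the loop with the question, here is a minimal sketch of how the resolved user and pwd values might feed the same Redshift read in Scala. The option names and values are copied from the PySpark snippet in the question. The object RedshiftRead and its methods are hypothetical names for this example, and the code assumes the com.databricks.spark.redshift connector is available to the job.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object, named for this example only.
object RedshiftRead {

  // Connector options copied from the PySpark snippet in the question.
  def redshiftOptions(user: String, pwd: String): Map[String, String] = Map(
    "url" -> "jdbc:redshift://prod.us-east-1.redshift.amazonaws.com:5439/db",
    "user" -> user,
    "password" -> pwd,
    "dbtable" -> "db.table",
    "tempdir" -> "s3://config/glue/temp/redshift/",
    "forward_spark_s3_credentials" -> "true"
  )

  // Build the reader with the options above and load the table.
  def readTable(spark: SparkSession, user: String, pwd: String): DataFrame =
    spark.read
      .format("com.databricks.spark.redshift")
      .options(redshiftOptions(user, pwd))
      .load()
}
```

Inside the job this would be called as RedshiftRead.readTable(spark, user, pwd), where spark is the job's existing SparkSession.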
