Sending credentials to Google Dataflow jobs
Question
What is the right way to pass credentials to Dataflow jobs?
Some of my Dataflow jobs need credentials to make REST calls and fetch/post processed data.
I am currently using environment variables to pass the credentials to the JVM, reading them into a Serializable object and passing that to the DoFn implementation's constructor. I am not sure this is the right approach, since a Serializable class should not contain sensitive information.
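The concern about Serializable credentials is concrete: Java's default serialization writes String fields in cleartext, so anywhere the serialized DoFn is persisted or shipped (as Dataflow does when distributing work), the secret is recoverable from the bytes. A minimal stdlib-only sketch of the problem, where `ApiCredentials` is a hypothetical stand-in for the credentials holder described above:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

public class SecretLeakDemo {
    // Hypothetical credentials holder, mirroring the approach in the question.
    static class ApiCredentials implements Serializable {
        final String apiKey;
        ApiCredentials(String apiKey) { this.apiKey = apiKey; }
    }

    // Serialize any object into a byte array using default Java serialization.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = serialize(new ApiCredentials("s3cr3t-key"));
        // String fields are written as (modified) UTF-8, so an ASCII secret
        // appears verbatim in the serialized stream.
        String dump = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println(dump.contains("s3cr3t-key")); // prints "true"
    }
}
```

The same applies to any serialized closure a DoFn captures, which is why reading the secret on the worker at runtime is preferable.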
Another way I thought of is to store the credentials in GCS and retrieve them using a service account key file, but I was wondering why my job should have to perform this extra task of reading credentials from GCS.
Answer
Google Cloud Dataflow does not have native support for passing or storing secured secrets. However, you can use Cloud KMS and/or GCS, as you propose, to read a secret at runtime using your Dataflow service account credentials.
If you read the credential at runtime from a DoFn, you can use the DoFn.Setup lifecycle API to read the value once and cache it for the lifetime of the DoFn.
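As a sketch of that lifecycle pattern in plain Java (not the actual Beam SDK, so the class names here are illustrative; in Beam the methods would carry the @Setup and @ProcessElement annotations, and the hypothetical `secretFetcher` would call GCS or Cloud KMS):

```java
import java.io.Serializable;
import java.util.function.Supplier;

// Illustrative analogue of a Beam DoFn that caches a secret per worker
// instance instead of serializing it with the function object.
public class CachedSecretDoFn implements Serializable {
    // transient: excluded from serialization, so the secret never travels
    // inside the serialized DoFn.
    private transient String cachedSecret;

    // Hypothetical fetcher standing in for a GCS/KMS read; in real Beam
    // code this captured object would itself need to be serializable.
    private final Supplier<String> secretFetcher;

    public CachedSecretDoFn(Supplier<String> secretFetcher) {
        this.secretFetcher = secretFetcher;
    }

    // Analogue of a @DoFn.Setup method: invoked once per DoFn instance,
    // before any elements are processed, so the remote read happens once.
    public void setup() {
        cachedSecret = secretFetcher.get();
    }

    // Analogue of @ProcessElement: every element reuses the cached value.
    public String processElement(String element) {
        return element + " (authorized with " + cachedSecret + ")";
    }
}
```

The point of the pattern is that the expensive and sensitive read happens once per worker instance, not once per element and not at pipeline-construction time on the submitting machine.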
You can learn about various options for secret management in Google Cloud here: Secret management with Cloud KMS.