AWS EMR - how to copy files to all the nodes?
Is there a way to copy a file to all the nodes in an EMR cluster through the EMR command line? I am working with Presto and have created a custom plugin. The problem is that I have to install this plugin on all the nodes, and I don't want to log in to every node and copy it manually.
If you are able to launch a new EMR cluster, you should consider using an EMR bootstrap script (bootstrap action), which runs on every node as the cluster starts.
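As a rough illustration of the bootstrap route, the sketch below builds the `BootstrapActions` entry that boto3's `run_job_flow` accepts when a cluster is created. The bucket, script name, and plugin file are hypothetical placeholders, and the script itself (which would do the actual copy, e.g. with `aws s3 cp`) is assumed to exist in S3 already.

```python
def bootstrap_action(name, script_s3_path, args=()):
    """Build one entry for the BootstrapActions list of run_job_flow."""
    return {
        "Name": name,
        "ScriptBootstrapAction": {
            "Path": script_s3_path,   # S3 location of the shell script
            "Args": list(args),       # arguments passed to that script
        },
    }


# Hypothetical names: replace the bucket, script, and plugin with your own.
actions = [
    bootstrap_action(
        "install-presto-plugin",
        "s3://my-bucket/install_plugin.sh",
        ["s3://my-bucket/my-plugin.jar"],
    )
]

# The list would then be passed at launch time, for example:
#   boto3.client("emr").run_job_flow(..., BootstrapActions=actions)
```

Because bootstrap actions run on every node, including nodes added later by resizing, this is usually the simpler option when relaunching the cluster is acceptable.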
But in case you want to do it on an existing EMR cluster (bootstrap scripts are only available at launch time), you can do this with the help of AWS Systems Manager (SSM) and the built-in EMR client.
Something like this (Python):
import boto3
emr_client = boto3.client('emr')
ssm_client = boto3.client('ssm')
- You can get the list of core instances using
emr_client.list_instances
- Finally, send a command to each of these instances using
ssm_client.send_command
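The two steps above could be combined roughly as follows. This is only a sketch under several assumptions: the cluster uses instance groups (not instance fleets), the file has already been uploaded to S3, the nodes run the SSM agent and have the AWS CLI available, and the function names, S3 URI, and destination directory are illustrative rather than taken from the original answer. The clients are passed in as arguments so the functions can be exercised without AWS access.

```python
import shlex


def list_instance_ids(emr_client, cluster_id,
                      group_types=("MASTER", "CORE", "TASK")):
    """Collect EC2 instance IDs of running nodes, following pagination."""
    kwargs = {
        "ClusterId": cluster_id,
        "InstanceGroupTypes": list(group_types),
        "InstanceStates": ["RUNNING"],
    }
    instance_ids = []
    while True:
        page = emr_client.list_instances(**kwargs)
        instance_ids += [i["Ec2InstanceId"] for i in page["Instances"]]
        marker = page.get("Marker")
        if not marker:
            return instance_ids
        kwargs["Marker"] = marker


def copy_file_to_nodes(emr_client, ssm_client, cluster_id, s3_uri, dest_dir):
    """Ask every running node to download s3_uri into dest_dir via SSM."""
    instance_ids = list_instance_ids(emr_client, cluster_id)
    command = f"sudo aws s3 cp {shlex.quote(s3_uri)} {shlex.quote(dest_dir)}"
    command_ids = []
    # send_command accepts at most 50 instance IDs per call, so batch them.
    for start in range(0, len(instance_ids), 50):
        response = ssm_client.send_command(
            InstanceIds=instance_ids[start:start + 50],
            DocumentName="AWS-RunShellScript",
            Parameters={"commands": [command]},
        )
        command_ids.append(response["Command"]["CommandId"])
    return command_ids


# Example call (cluster ID, bucket, and target directory are placeholders):
#   import boto3
#   copy_file_to_nodes(boto3.client("emr"), boto3.client("ssm"),
#                      "j-XXXXXXXXXXXXX", "s3://my-bucket/my-plugin.jar",
#                      "/path/to/presto/plugin/")
```

The returned command IDs can be used to check the outcome on each node afterwards. Presto will most likely need a restart on each node before it picks up a new plugin.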
Ref: See the last detailed example, "Example Installing Libraries on Core Nodes of a Running Cluster", at https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-jupyterhub-install-kernels-libs.html#emr-jupyterhub-install-libs
Note: If you go with SSM, you need to have the proper SSM IAM policy attached to the IAM role of your master node.