How to terminate an AWS EMR cluster automatically after some time


Question

I currently have a task at hand to terminate a long-running EMR cluster after a set period of time (based on some metric). Google Dataproc has this capability in a feature called "Cluster Scheduled Deletion", documented here: https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion

Is this possible on EMR natively? Maybe using CloudWatch metrics? Or could I write a long-running jar that sits on the EMR master node, polls YARN for some idle-time metric, and shuts down the cluster after a set period of time?

To clarify: I would like the cluster to be terminated once it has been idle for some amount of time x. For example, if the cluster has been up for a while but no jobs have run for, say, 1 hour and the cluster is just sitting there doing nothing, then I'd like the ability to terminate it.

Recommended answer

The easiest method would be to use Amazon EMR Metrics and Dimensions for Amazon CloudWatch. There is an IsIdle metric that "indicates that a cluster is no longer performing work".

You could create a CloudWatch alarm that triggers when the metric has been true for more than x minutes. The alarm would send a message to Amazon SNS, which can trigger a Lambda function to shut down the cluster.

Components:

  • Amazon CloudWatch alarm
  • Amazon SNS topic
  • AWS Lambda function
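To make the alarm-based wiring concrete, here is a minimal Python sketch using boto3. It assumes the `IsIdle` metric in the `AWS/ElasticMapReduce` namespace with a `JobFlowId` dimension, published at five-minute intervals. The alarm name, the helper names `idle_alarm_params` and `cluster_id_from_alarm`, and the SNS topic ARN are hypothetical. It also assumes the alarm notification lists dimensions under lowercase `name`/`value` keys. Treat it as an illustration subject to the caveat in the update that follows, not a definitive implementation.

```python
import json
import math


def idle_alarm_params(cluster_id: str, idle_minutes: int, sns_topic_arn: str) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm (names are illustrative)."""
    period = 300  # assumed: EMR metrics arrive every five minutes
    return {
        "AlarmName": f"emr-idle-{cluster_id}",  # hypothetical naming scheme
        "Namespace": "AWS/ElasticMapReduce",
        "MetricName": "IsIdle",
        "Dimensions": [{"Name": "JobFlowId", "Value": cluster_id}],
        "Statistic": "Minimum",  # every sample in the period must report idle
        "Period": period,
        "EvaluationPeriods": max(1, math.ceil(idle_minutes * 60 / period)),
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def cluster_id_from_alarm(message: dict) -> str:
    """Extract the cluster ID from a parsed alarm notification (assumed message layout)."""
    for dim in message["Trigger"]["Dimensions"]:
        if dim.get("name") == "JobFlowId":
            return dim["value"]
    raise ValueError("alarm message has no JobFlowId dimension")


def create_alarm(cluster_id: str, idle_minutes: int, sns_topic_arn: str) -> None:
    import boto3  # imported lazily so the helpers above work without AWS access
    params = idle_alarm_params(cluster_id, idle_minutes, sns_topic_arn)
    boto3.client("cloudwatch").put_metric_alarm(**params)


def lambda_handler(event, context):
    """Lambda entry point subscribed to the SNS topic: terminates the idle cluster."""
    import boto3
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    cluster_id = cluster_id_from_alarm(message)
    boto3.client("emr").terminate_job_flows(JobFlowIds=[cluster_id])
```

Calling `create_alarm("j-EXAMPLE", 60, topic_arn)` once per cluster would register an alarm that fires after twelve consecutive idle five-minute periods.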

Update: This apparently isn't suitable (see comments below).

Another approach would be:

  • Use Amazon CloudWatch Events to schedule a Lambda function every x seconds
  • The Lambda function looks for any clusters with a particular tag that indicates how long to wait until shutdown (e.g. 40 minutes). If the tag is not present, the cluster remains untouched.
  • The Lambda function queries the cluster state (somehow, probably via a Hadoop API call), then:
    • If the cluster is idle and there is no Idle Since tag, add an Idle Since tag with the current timestamp
    • If the cluster is idle and it has been more than x minutes since the timestamp in the Idle Since tag, terminate the cluster
    • If the cluster is not idle, remove the Idle Since tag (if present)
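The tag-based steps above can be sketched in Python with boto3 as follows. Several details are assumptions made for illustration. The tag key `IdleTimeoutMinutes` is hypothetical, and the `Idle Since` value is stored as an ISO 8601 timestamp. Idleness is approximated by the EMR cluster state `WAITING`, which only reflects EMR steps; jobs submitted directly to YARN would need the Hadoop API check the answer mentions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

TIMEOUT_TAG = "IdleTimeoutMinutes"  # hypothetical tag: minutes to wait before shutdown
IDLE_SINCE_TAG = "Idle Since"       # tag named in the answer; ISO 8601 value is assumed


def decide_action(is_idle: bool, idle_since: Optional[datetime],
                  now: datetime, max_idle: timedelta) -> str:
    """Return 'tag', 'terminate', 'untag', or 'none' following the three rules above."""
    if not is_idle:
        return "untag" if idle_since is not None else "none"
    if idle_since is None:
        return "tag"
    return "terminate" if now - idle_since > max_idle else "none"


def process_cluster(emr, cluster_id: str, is_idle: bool, now: datetime) -> str:
    """Apply the decision to one cluster; `emr` is a boto3 EMR client."""
    cluster = emr.describe_cluster(ClusterId=cluster_id)["Cluster"]
    tags = {t["Key"]: t["Value"] for t in cluster.get("Tags", [])}
    if TIMEOUT_TAG not in tags:
        return "none"  # clusters without the timeout tag remain untouched
    max_idle = timedelta(minutes=float(tags[TIMEOUT_TAG]))
    idle_since = (datetime.fromisoformat(tags[IDLE_SINCE_TAG])
                  if IDLE_SINCE_TAG in tags else None)
    action = decide_action(is_idle, idle_since, now, max_idle)
    if action == "tag":
        emr.add_tags(ResourceId=cluster_id,
                     Tags=[{"Key": IDLE_SINCE_TAG, "Value": now.isoformat()}])
    elif action == "untag":
        emr.remove_tags(ResourceId=cluster_id, TagKeys=[IDLE_SINCE_TAG])
    elif action == "terminate":
        emr.terminate_job_flows(JobFlowIds=[cluster_id])
    return action


def lambda_handler(event, context):
    """Entry point for the scheduled Lambda invocation."""
    import boto3  # imported lazily so the logic above can be tested without AWS
    emr = boto3.client("emr")
    now = datetime.now(timezone.utc)
    paginator = emr.get_paginator("list_clusters")
    for page in paginator.paginate(ClusterStates=["WAITING", "RUNNING"]):
        for summary in page["Clusters"]:
            # Assumption: a WAITING cluster counts as idle (see the note above).
            idle = summary["Status"]["State"] == "WAITING"
            process_cluster(emr, summary["Id"], idle, now)
```

Keeping `decide_action` free of AWS calls means the timing rules can be checked on their own, and storing the timestamp in a tag lets the Lambda function stay stateless between invocations.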
