How to terminate an AWS EMR cluster automatically after some time


Question

I currently have a task to terminate a long-running EMR cluster after a set period of time (based on some metric). Google Dataproc has this capability in a feature called "Cluster Scheduled Deletion", described here: https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion

Is this possible on EMR natively? Maybe using CloudWatch metrics? Or could I write a long-running jar that sits on the EMR master node, polls YARN for some idle-time metric, and shuts down the cluster after a set period of time?

To clarify: I would like the cluster to be terminated once it has been idle for some x amount of time. For example, if the cluster has been up for a while but no jobs have run for, say, 1 hour, and it is just sitting there doing nothing, then I'd like to be able to terminate it.

Recommended answer

The easiest method would be to use the Amazon EMR metrics and dimensions for Amazon CloudWatch. There is an IsIdle metric that "indicates that a cluster is no longer performing work".

You could create a CloudWatch alarm that triggers when IsIdle has been true for more than x minutes. The alarm would send a message to Amazon SNS, which can trigger a Lambda function to shut down the cluster.

Components:

  • Amazon CloudWatch alarm
  • Amazon SNS topic
  • AWS Lambda function
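The alarm-to-SNS-to-Lambda wiring could be sketched roughly as follows. This is a minimal sketch, not a tested deployment: the alarm name, the function names, and the `emr_client` parameter are made up for illustration, and the layout assumed for the SNS message (`Trigger.Dimensions` with lowercase `name`/`value` keys) should be checked against a real alarm notification. The dictionary returned by `idle_alarm_params` is meant to be passed to boto3's CloudWatch `put_metric_alarm`.

```python
import json
import math
from typing import Any, Dict, List


def idle_alarm_params(cluster_id: str, idle_minutes: int, sns_topic_arn: str) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch put_metric_alarm on the EMR IsIdle metric."""
    period_seconds = 300  # EMR publishes its CloudWatch metrics every five minutes
    periods = max(1, math.ceil(idle_minutes * 60 / period_seconds))
    return {
        "AlarmName": f"emr-idle-{cluster_id}",  # hypothetical naming scheme
        "Namespace": "AWS/ElasticMapReduce",
        "MetricName": "IsIdle",
        "Dimensions": [{"Name": "JobFlowId", "Value": cluster_id}],
        "Statistic": "Minimum",  # every datapoint in the window must report idle
        "Period": period_seconds,
        "EvaluationPeriods": periods,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def cluster_ids_from_sns_event(event: Dict[str, Any]) -> List[str]:
    """Extract cluster ids from the alarm notification(s) carried in an SNS event."""
    ids = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        # Assumed message layout: the alarm's dimensions appear under Trigger.
        for dim in message.get("Trigger", {}).get("Dimensions", []):
            if dim.get("name") == "JobFlowId":
                ids.append(dim["value"])
    return ids


def handler(event, context, emr_client=None):
    """Lambda entry point: terminate every cluster named in the alarm notification."""
    if emr_client is None:
        import boto3  # imported lazily so the helpers above work without AWS access
        emr_client = boto3.client("emr")
    ids = cluster_ids_from_sns_event(event)
    if ids:
        emr_client.terminate_job_flows(JobFlowIds=ids)
    return ids
```

To create the alarm, something like `boto3.client("cloudwatch").put_metric_alarm(**idle_alarm_params(cluster_id, 40, topic_arn))` would be used, with the Lambda function subscribed to that SNS topic.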

Update: This apparently isn't suitable (see comments below).

An alternate approach would be:

  • Use Amazon CloudWatch Events to run a Lambda function on a schedule, every x minutes (scheduled rules have a minimum granularity of one minute).
  • The Lambda function looks for any clusters with a particular tag that indicates how long to wait until shutdown (e.g. 40 minutes). If the tag is not present, the cluster remains untouched.
  • The Lambda function queries the cluster state (somehow, probably via a Hadoop API call), then:
    • If the cluster is idle and there is no Idle Since tag, add an Idle Since tag with the current timestamp.
    • If the cluster is idle and it has been more than x minutes since the timestamp in the Idle Since tag, terminate the cluster.
    • If the cluster is not idle, remove the Idle Since tag (if present).
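The tag-based steps above might look something like the following sketch. The tag names (`IdleTimeoutMinutes`, `IdleSince`), the function names, and the `is_idle` callable are hypothetical. The answer leaves open how idleness is detected, so that check is injected instead of implemented here. The timestamp is stored as epoch seconds to keep the tag value simple, and only the first page of `list_clusters` results is examined.

```python
import time
from typing import Callable, Dict

TIMEOUT_TAG = "IdleTimeoutMinutes"  # hypothetical tag: minutes to wait before shutdown
IDLE_SINCE_TAG = "IdleSince"        # hypothetical tag: epoch seconds when idleness began


def decide(tags: Dict[str, str], idle: bool, now: float) -> str:
    """Return 'ignore', 'none', 'mark', 'clear' or 'terminate' for one cluster."""
    if TIMEOUT_TAG not in tags:
        return "ignore"  # cluster has not opted in, leave it untouched
    if not idle:
        return "clear" if IDLE_SINCE_TAG in tags else "none"
    if IDLE_SINCE_TAG not in tags:
        return "mark"
    idle_seconds = now - float(tags[IDLE_SINCE_TAG])
    return "terminate" if idle_seconds > int(tags[TIMEOUT_TAG]) * 60 else "none"


def sweep(emr, is_idle: Callable[[str], bool], now: float) -> Dict[str, str]:
    """Apply decide() to active clusters; `emr` is a boto3 EMR client (or a stand-in)."""
    actions = {}
    # Pagination is omitted for brevity: only the first page of clusters is handled.
    clusters = emr.list_clusters(ClusterStates=["RUNNING", "WAITING"])["Clusters"]
    for cluster in clusters:
        cid = cluster["Id"]
        described = emr.describe_cluster(ClusterId=cid)["Cluster"]
        tags = {t["Key"]: t["Value"] for t in described.get("Tags", [])}
        action = decide(tags, is_idle(cid), now)
        if action == "mark":
            stamp = {"Key": IDLE_SINCE_TAG, "Value": str(int(now))}
            emr.add_tags(ResourceId=cid, Tags=[stamp])
        elif action == "clear":
            emr.remove_tags(ResourceId=cid, TagKeys=[IDLE_SINCE_TAG])
        elif action == "terminate":
            emr.terminate_job_flows(JobFlowIds=[cid])
        actions[cid] = action
    return actions


def handler(event, context):
    """Entry point for the scheduled Lambda invocation."""
    import boto3  # imported lazily so decide() and sweep() can be tested offline

    def is_idle(cluster_id: str) -> bool:
        # Placeholder: the idleness check (e.g. a Hadoop/YARN query) is not specified.
        raise NotImplementedError("supply an idleness check for your cluster")

    return sweep(boto3.client("emr"), is_idle, time.time())
```

Keeping `decide` free of AWS calls means the three rules can be unit-tested on their own, while `sweep` only translates each decision into the corresponding EMR API call.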
