AWS Autoscaling Group EC2 instances go down during cron jobs

Question

I tried an autoscaling group and, alternatively, just a bunch of EC2 instances tied together by a load balancer. Both configurations work fine at first glance.

But when an EC2 instance is part of an autoscaling group, it sometimes goes down. Actually it happens very often, almost once a day, and the instances go down in a "hard reset" way. The EC2 monitoring graphs show CPU usage going up to 100%, then the instance becomes unresponsive, and then it is terminated by the autoscaling group.

And it has nothing to do with my processes on these instances.

When the instance is not part of an autoscaling group, it can run for years without the CPU usage spikes.

The "hard resets" on autoscaling group instances are breaking my cron jobs. As much as I like autoscaling groups, I cannot use them.

Is there a standard way to deal with the "hard resets"?

P.S.

In my case, the cron jobs run PHP scripts on Ubuntu. I managed to make only one instance run the job.

Recommended answer

It sounds like you have a health check that is failing while your cron is running, and as a result the instance is being taken out of service.

If you look at the ASG's activity history, there should be a reason listed for why the instance was taken out. This will usually be a health check failure, but there could be other reasons as well.
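If you would rather read those reasons from a script than from the console, here is a minimal sketch. It assumes boto3 is installed and AWS credentials are configured; the helper names `termination_causes` and `fetch_activities` are made up for this example. It relies on the `Description` and `Cause` fields returned by `describe_scaling_activities`, and on termination descriptions starting with "Terminating", which is how they usually appear but is worth checking against your own output.

```python
def termination_causes(response):
    """Return (description, cause) pairs for activities that terminated an instance.

    `response` is the dict returned by describe_scaling_activities.
    """
    return [
        (activity.get("Description", ""), activity.get("Cause", ""))
        for activity in response.get("Activities", [])
        # Assumption: termination activities are described as "Terminating EC2 instance: ..."
        if activity.get("Description", "").startswith("Terminating")
    ]


def fetch_activities(group_name):
    """Fetch recent scaling activities for one group (needs boto3 and credentials)."""
    import boto3  # imported lazily so termination_causes works without boto3

    client = boto3.client("autoscaling")
    return client.describe_scaling_activities(
        AutoScalingGroupName=group_name,
        MaxRecords=20,
    )
```

Something like `termination_causes(fetch_activities("my-asg"))` (where `my-asg` is a placeholder group name) should then list why each instance was removed.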

There are a couple of things you can do to fix this.

First, determine why your cron is taking 100% of CPU, and how long it generally takes.
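One rough way to get those numbers is to run the job once by hand under a small wrapper. The sketch below is only an illustration: `measure` is a hypothetical helper, it works on Linux (it uses the `resource` module), and the PHP script path in the comment is a placeholder for whatever your crontab actually runs.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd (a list of arguments) and report how long it took and how much CPU it used."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    proc = subprocess.run(cmd)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU time consumed by child processes during this call (user + system).
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return {
        "wall_s": wall,
        "cpu_s": cpu,
        # Peak resident set size of the largest child waited for so far (kilobytes on Linux).
        "max_rss_kb": after.ru_maxrss,
        "returncode": proc.returncode,
    }


# Example (placeholder path): measure(["php", "/path/to/cron_job.php"])
```

If `cpu_s` is close to `wall_s` for the whole run, the job is CPU-bound for that long; a large `max_rss_kb` relative to the instance's RAM points at memory instead.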

Review your health check settings. Are you using HTTP or TCP? What is the interval, and how many checks have to fail before the instance is taken out of service?
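The interval and the failure count together set how long an instance can stay unresponsive before it is marked unhealthy. As a back-of-the-envelope sketch (the function names are invented here, and the estimate ignores the per-check timeout and the exact timing of the first failed check), you can compare that window with how long the cron job makes the instance unresponsive:

```python
def seconds_until_unhealthy(interval_s, unhealthy_threshold):
    """Approximate time an instance can fail checks before being marked unhealthy."""
    return interval_s * unhealthy_threshold


def survives_cron(interval_s, unhealthy_threshold, unresponsive_s):
    """True if the unresponsive period is shorter than the approximate failure window."""
    return unresponsive_s < seconds_until_unhealthy(interval_s, unhealthy_threshold)


# Illustrative numbers only: a 30 s interval with 2 allowed failures gives ~60 s,
# so a job that blocks the instance for 90 s would get it removed.
print(survives_cron(30, 2, 90))  # False
print(survives_cron(30, 5, 90))  # True
```

Raising the threshold or the interval widens the window, at the cost of slower detection of instances that are really dead.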

Between those two items, you should be able to adjust the health checks so that the instance is not taken out of service while the cron is running. It is also possible that the instance really is failing; typically this would be because it runs out of memory. If that is the case, you may want to consider moving to a larger instance type and/or enabling swap.
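To check the memory theory on Ubuntu, you can look at `/proc/meminfo` while the job runs. The following is a small sketch with made-up helper names; it assumes the usual `Key:   value kB` layout of that file and a kernel recent enough to report `MemAvailable`.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that do not look like "Key: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def memory_summary(text):
    """Report whether swap is configured and how much memory is still available."""
    info = parse_meminfo(text)
    return {
        "swap_enabled": info.get("SwapTotal", 0) > 0,
        "available_kb": info.get("MemAvailable"),  # None on kernels without this field
    }


# On the instance itself:
# with open("/proc/meminfo") as f:
#     print(memory_summary(f.read()))
```

If `swap_enabled` is false and `available_kb` drops toward zero during the cron run, running out of memory is a likely explanation for the instance becoming unresponsive.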
