How to handle a sudden spike in web traffic during Auto Scaling


Question

I have two EC2 instances behind an ELB, in an Auto Scaling group. The scale-up policy is as follows:

CPUUtilization >= 70 for 300 seconds (adds one server)

While the Auto Scaling activity is taking place, load on the existing instances is already at 99% and connections are being dropped.

Is there any way to handle this more efficiently?

Recommended answer

The trick to Auto Scaling is defining an alarm that accurately identifies the load on your system.

CPU utilization is not always the right measure to use: your application might only be able to handle a limited number of connections, it might be short on RAM, and the types of requests might vary too.

A good idea is to monitor your system closely during peak loads to find an accurate signal that identifies busy periods (or, even better, helps you predict impending busy periods). Use standard monitoring tools on your individual instances to track values such as free memory, number of application users, and number of transactions.

You can use normal monitoring tools, or you can write something that pushes metrics to Amazon CloudWatch, so that you go beyond the basic CPU and network metrics that CloudWatch normally provides. You could even use the load balancer's Latency metric to trigger scaling when the application slows down (custom code required).
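As a rough illustration of pushing application-level metrics to CloudWatch, the sketch below builds a payload for the PutMetricData API and sends it through a boto3-style client passed in by the caller. The namespace `MyApp`, the metric names `ActiveConnections` and `FreeMemory`, and the choice of an `AutoScalingGroupName` dimension are all assumptions made for this example, not part of the original answer.

```python
from datetime import datetime, timezone


def build_metric_datum(name, value, unit, asg_name):
    """Build one entry for the MetricData list of CloudWatch PutMetricData."""
    return {
        "MetricName": name,
        # Tagging by group (an assumption) lets one alarm average over all instances.
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
    }


def publish_app_metrics(cloudwatch, asg_name, active_connections, free_memory_mb):
    """Send two application metrics through a boto3-style CloudWatch client."""
    data = [
        build_metric_datum("ActiveConnections", active_connections, "Count", asg_name),
        build_metric_datum("FreeMemory", free_memory_mb, "Megabytes", asg_name),
    ]
    # "MyApp" is a hypothetical custom namespace chosen for this sketch.
    cloudwatch.put_metric_data(Namespace="MyApp", MetricData=data)
    return data
```

With boto3 installed and credentials configured, a call might look like `publish_app_metrics(boto3.client("cloudwatch"), "web-asg", 180, 512)`, run every minute from each instance. An alarm on the resulting metric can then drive the scaling policy in place of CPUUtilization.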

Once you have a reliable signal that the system is approaching capacity and needs to scale out, concentrate on shortening the time it takes to add new capacity. Measure how long a new instance takes to launch and start accepting traffic. Try to reduce launch times by using a fully configured AMI rather than installing software via User Data. You may be able to remove or turn off services on the instance so that it starts faster. Try different EBS volume types (e.g. General Purpose SSD can burst up to 3,000 IOPS) and different instance types.
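One way to measure the time until a new instance accepts traffic is to start a timer when the launch is requested and poll a health endpoint until it answers. The following is a minimal sketch under that assumption; `http_probe`, `seconds_until_ready`, and the health-check URL are hypothetical names for illustration, and the clock and sleep functions are injectable so the loop can be exercised without waiting.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def seconds_until_ready(probe, timeout=600.0, interval=5.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True.

    Returns the elapsed seconds, or None if the timeout passes first.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)
```

For example, `seconds_until_ready(lambda: http_probe("http://10.0.0.12/health"))` called right after the launch request gives a launch-to-ready figure that can be compared across AMIs, volume types, and instance types (the address here is a placeholder).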

Perhaps even scale out earlier (e.g. at 50%): the extra expense could be minor compared to the improved service to your users.
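To make the earlier trigger concrete, the sketch below builds the keyword arguments for CloudWatch's PutMetricAlarm API with a 50% threshold. The shorter evaluation window (two 60-second periods instead of one 300-second period) is an assumption added here, not something the answer prescribes, and 60-second EC2 metrics require detailed monitoring to be enabled. The function names and the alarm name suffix are hypothetical.

```python
def scale_out_alarm_params(asg_name, policy_arn, threshold=50.0,
                           period=60, evaluation_periods=2):
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": f"{asg_name}-cpu-high",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
        "Statistic": "Average",
        "Period": period,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # ARN of the scaling policy that adds an instance.
        "AlarmActions": [policy_arn],
    }


def detection_delay_seconds(params):
    """Minimum time the metric must stay over the threshold before the alarm fires."""
    return params["Period"] * params["EvaluationPeriods"]
```

The resulting dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. With these assumed values the alarm needs about 120 seconds of sustained load rather than 300, which leaves more of the spike to be absorbed by the new instance.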

Your goal should be to ensure that users never experience slow service or dropped connections.
