How to set up the AWS CloudWatch agent on Ubuntu to get (correct) custom metrics like CPU, memory and disk usage %
Question
I'm running an AWS EC2 m5.large (a non-burstable instance). I have set up one of AWS CloudWatch's default metrics (CPU %) plus some custom metrics (memory and disk usage) in my dashboard.
But when I compare the numbers CloudWatch reports, they are pretty far from the actual usage of the Ubuntu 20.04 server when I log in to it...
Actual usage:
CPU: ~ 35 %
Memory: ~ 33 %
CloudWatch report:
CPU ~ 10 %
Memory: ~ 50-55 %
https://www.screencast.com/t/o1nAnOFjVZW
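For reference, here is a minimal sketch of one way to read the "actual" memory figure on the server itself. It assumes that "used" means (MemTotal - MemAvailable) / MemTotal as reported in /proc/meminfo; the function name mem_used_percent is made up for this example. Other tools may define "used" differently (for instance by counting buffers and cache), which could account for part of a gap like the one above.

```shell
#!/bin/sh
# Print memory usage in percent, computed from a meminfo-style file.
# Assumption: used = MemTotal - MemAvailable (other tools may differ).
mem_used_percent() {
    # $1: path to the file to read (defaults to /proc/meminfo)
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END { if (total > 0) printf "%.1f\n", (total - avail) * 100 / total }
    ' "${1:-/proc/meminfo}"
}

# Only query the live system when /proc/meminfo is actually readable.
if [ -r /proc/meminfo ]; then
    echo "Memory used: $(mem_used_percent) %"
fi
```

Comparing this number with the dashboard over the same time window gives a like-for-like check of the memory metric.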
I have followed AWS's own instructions to add the metrics for memory and disk usage (because CloudWatch doesn't have access to OS-level data out of the box): https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/mon-scripts.html
When the numbers are this far apart, it is impossible to set up useful alarms and notifications. I can't believe that is what AWS wants to provide to the people who chose to follow their original instructions. The only thing that matches exactly is the disk usage %.
Recommended answer
How to install the AWS agent on Ubuntu 20.04 (the new way, which replaces the older "CloudWatchMonitoringScripts")
1. sudo wget https://s3.amazonaws.com/amazoncloudwatch-agent/debian/amd64/latest/amazon-cloudwatch-agent.deb
2. sudo dpkg -i -E ./amazon-cloudwatch-agent.deb
3. sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
4. Go through all the steps in the wizard (The result is saved here: /opt/aws/amazon-cloudwatch-agent/bin/config.json)
Hint: I answered:
- The default for most questions, and otherwise:
- NO --> Do you want to store the config in the SSM parameter store? (When I answered YES it failed later on because of a permission issue that I didn't know how to resolve, and I don't think I need SSM for this.)
- YES --> Do you want to turn on StatsD daemon?
- YES --> Do you want to monitor metrics from CollectD?
- NO --> Do you have any existing CloudWatch Log Agent?
Now, to prevent this error: "Error parsing /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml, open /usr/share/collectd/types.db: no such file or directory" (see https://github.com/awsdocs/amazon-cloudwatch-user-guide/issues/1):
5. sudo mkdir -p /usr/share/collectd/
6. sudo touch /usr/share/collectd/types.db
7. sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
8. /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status
{
"status": "running",
"starttime": "2020-06-07T10:04:41+00:00",
"version": "1.245315.0"
}
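If the status check should be scripted rather than read by eye, a minimal sketch could look like the following. It relies only on the JSON shape shown above; the helper name agent_is_running is hypothetical, and the plain-text match on the "status" field is an assumption that holds only as long as the output keeps that format.

```shell
#!/bin/sh
# Succeed (exit 0) when the status JSON on stdin reports "running".
# Assumption: the output contains a line like  "status": "running"
agent_is_running() {
    grep -Eq '"status"[[:space:]]*:[[:space:]]*"running"'
}

CTL=/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl

# Only call the control script when it is installed on this machine.
if [ -x "$CTL" ]; then
    if "$CTL" -m ec2 -a status | agent_is_running; then
        echo "CloudWatch agent is running"
    else
        echo "CloudWatch agent is NOT running"
    fi
fi
```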
- https://www.screencast.com/t/42VWgoS88Z (Create IAM role, add policies and make it the server default role).
- https://www.screencast.com/t/fAUUHCPe (CloudWatch new custom metrics)
- https://www.screencast.com/t/8J0Saw0co (data match OK now)
- https://www.screencast.com/t/x0PxOa799 (data match OK now)
I realized that the second I log in to the machine, the CPU % usage goes up from 10 % to 30 % and stays there (of course some increase was to be expected, but not that much in my opinion), which in my case explains the big difference earlier. I honestly don't know if this way is more precise than the older script, but this should be the right way to do it in 2020 :-) And you get access to 179 custom metrics when selecting "Advanced" during the wizard (even though only a few would be valuable to most people).
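To compare the server's own CPU figure with CloudWatch over a known window, one option is to sample /proc/stat twice and compute the busy share in between. This is only a sketch under stated assumptions: the function name cpu_busy_percent and the SAMPLE_SECONDS variable are made up for this example, and "busy" is taken to mean everything except the idle and iowait counters.

```shell
#!/bin/sh
# Print the CPU busy percentage between two aggregate "cpu" lines of /proc/stat.
# Assumption: busy = total - (idle + iowait), using fields user..steal only
# (guest time is already included in user/nice, so it is not added again).
cpu_busy_percent() {
    # $1, $2: two "cpu ..." lines sampled some seconds apart
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6
        }
        NR == 1 { t0 = total; idle0 = idle }
        NR == 2 {
            dt = total - t0
            di = idle - idle0
            if (dt > 0) printf "%.1f\n", (dt - di) * 100 / dt
        }'
}

# Only sample the live system when /proc/stat is readable.
if [ -r /proc/stat ]; then
    first=$(grep '^cpu ' /proc/stat)
    sleep "${SAMPLE_SECONDS:-5}"
    second=$(grep '^cpu ' /proc/stat)
    echo "CPU busy: $(cpu_busy_percent "$first" "$second") %"
fi
```

Because the measurement itself needs a session on the machine, the jump described above may still be part of the number; a longer sampling window makes the comparison with the dashboard less sensitive to that.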