Background
EC2's default CloudWatch metrics do not include disk usage and memory usage, which are the two most common causes of failures.
Implementation Process
- Write .ebextensions configuration โ Create installation and configuration scripts for CloudWatch Agent
- Configure collection metrics โ Disk usage (by mount point), memory usage, available memory
- Configure CloudWatch Alarm โ Disk > 85% alert, memory > 90% alert
- Aggregate by ASG dimension โ New instances automatically included in monitoring
- Alert notification โ SNS -> Lambda -> DingTalk Webhook
Results
- Full coverage across all environments, monthly cost $5-10
- Captured a disk-nearly-full alert in the first week after launch
Technical Points
- .ebextensions is the standard way to customize Beanstalk environments
- Configure alarms by ASG dimension to avoid alarm failure after instance replacement โ ClawNOC Operations Agent Practice Notes