EC2 process issues detection and alerting
Process issues detection on EC2 instances
This stack deploys the following resources in the target AWS account:
AWS SSM documents, which are used to configure the CloudWatch agent on Linux and Windows hosts
A Lambda function, along with its IAM role and CloudWatch log group
CloudWatch alarms for each process that needs to be monitored
SNS topics and subscriptions
Usage
Prerequisites: make sure the target instances are managed through AWS Systems Manager (SSM) and have an IAM role with policies that allow writing metric data to CloudWatch and reading files from the configuration S3 bucket (CloudWatchAgentAdminPolicy, CloudWatchLogsFullAccess, AmazonS3ReadOnlyAccess).
Note that this solution does not require SSH/RDP keys or credentials to deploy the configurations.
Navigate to the configs directory and modify the config.json file in the linux or windows directory, depending on the platform of the target instance.
This CloudWatch agent configuration file is a JSON file with two sections: agent and metrics.
The agent section includes fields for the overall configuration of the agent.
The metrics section specifies the custom metrics for collection and publishing to CloudWatch.
To monitor a new process, add a new stanza to the “procstat” list. The region is appended to the configuration file automatically when the deployment script runs.
Configuration file for Linux instances
Provide the path to the PID file of the process you want to monitor in the configuration file.
{
  "agent": {
    "run_as_user": "cwagent",
    "region": "us-east-1"
  },
  "metrics": {
    "namespace": "procstatpoc",
    "metrics_collected": {
      "procstat": [
        {
          "pid_file": "/var/run/nginx.pid",
          "measurement": [
            "cpu_usage",
            "memory_rss"
          ]
        }
      ]
    }
  }
}
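Adding a process to the configuration above amounts to appending one more entry to the procstat list. The Python sketch below does that on an in-memory copy of the configuration and also sets the region, mimicking what the deployment script is described as doing. The helper name add_process, the php-fpm PID file path and the us-west-2 region are made up for illustration; the actual deploy.sh may handle this differently.

```python
import copy
import json


def add_process(config, selector_key, selector_value, region=None,
                measurements=("cpu_usage", "memory_rss")):
    """Return a copy of config with one more procstat entry (hypothetical helper)."""
    updated = copy.deepcopy(config)  # leave the caller's dictionary untouched
    procstat = (updated.setdefault("metrics", {})
                       .setdefault("metrics_collected", {})
                       .setdefault("procstat", []))
    # Do not add a second entry for a process that is already monitored.
    already_present = any(
        entry.get(selector_key) == selector_value for entry in procstat
    )
    if not already_present:
        procstat.append({
            selector_key: selector_value,
            "measurement": list(measurements),
        })
    # Set the region only when one is supplied.
    if region is not None:
        updated.setdefault("agent", {})["region"] = region
    return updated


# Same shape as the Linux sample, without the region.
base = {
    "agent": {"run_as_user": "cwagent"},
    "metrics": {
        "namespace": "procstatpoc",
        "metrics_collected": {
            "procstat": [
                {"pid_file": "/var/run/nginx.pid",
                 "measurement": ["cpu_usage", "memory_rss"]}
            ]
        },
    },
}

# The PID file path and region below are example values only.
new_config = add_process(base, "pid_file", "/var/run/php-fpm.pid",
                         region="us-west-2")
print(json.dumps(new_config, indent=2))
```

The same helper would work for a Windows configuration by passing "exe" as the selector key instead of "pid_file".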
Configuration file for Windows instances
Provide the executable name of the process you want to monitor in the configuration file.
{
  "agent": {
    "run_as_user": "cwagent",
    "region": "us-east-1"
  },
  "metrics": {
    "namespace": "procstatpoc",
    "metrics_collected": {
      "procstat": [
        {
          "exe": "java",
          "measurement": [
            "cpu_usage",
            "memory_rss"
          ]
        }
      ]
    }
  }
}
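Before deploying, it can help to sanity-check an edited config.json. The sketch below is one possible check, written in Python: it confirms that the agent and metrics sections exist and that every procstat entry names a process (through pid_file, exe or pattern) and lists at least one measurement. These rules are this sketch's own, modeled on the two samples above; they are not the CloudWatch agent's official schema validation, and validate_config is a hypothetical helper name.

```python
# Keys that identify which process a procstat entry refers to.
SELECTORS = ("pid_file", "exe", "pattern")


def validate_config(config):
    """Return a list of problems found in an agent config dict (empty if none)."""
    problems = []
    for section in ("agent", "metrics"):
        if not isinstance(config.get(section), dict):
            problems.append(f"missing '{section}' section")

    metrics = config.get("metrics")
    collected = metrics.get("metrics_collected") if isinstance(metrics, dict) else None
    procstat = collected.get("procstat") if isinstance(collected, dict) else None
    if not isinstance(procstat, list) or not procstat:
        problems.append("'procstat' must be a non-empty list")
        return problems

    for index, entry in enumerate(procstat):
        if not isinstance(entry, dict) or not any(key in entry for key in SELECTORS):
            problems.append(f"procstat[{index}] needs one of {', '.join(SELECTORS)}")
            continue
        if not entry.get("measurement"):
            problems.append(f"procstat[{index}] has no 'measurement' list")
    return problems


# Same content as the Windows sample above.
windows_config = {
    "agent": {"run_as_user": "cwagent", "region": "us-east-1"},
    "metrics": {
        "namespace": "procstatpoc",
        "metrics_collected": {
            "procstat": [
                {"exe": "java", "measurement": ["cpu_usage", "memory_rss"]}
            ]
        },
    },
}

for problem in validate_config(windows_config):
    print(problem)
```

To check a file on disk, load it with json.load and pass the resulting dictionary to the function.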
To deploy the changes, run the deploy.sh script. It requires the target account ID, region, instance ID, and platform type of the instance.
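The four inputs have well-known shapes, so a typo can be caught before anything is deployed. The Python sketch below shows one way to pre-check them. It is not part of deploy.sh: the function name is hypothetical, the region pattern is deliberately loose, and the accepted platform values (linux, windows) are assumed from the names of the config directories.

```python
import re

ACCOUNT_ID = re.compile(r"\d{12}")                         # AWS account IDs are 12 digits
REGION = re.compile(r"[a-z]{2}(-[a-z]+)+-\d+")             # loose shape, e.g. us-east-1
INSTANCE_ID = re.compile(r"i-(?:[0-9a-f]{8}|[0-9a-f]{17})")  # short or long instance ID
PLATFORMS = ("linux", "windows")                           # assumed from the config directories


def check_deploy_inputs(account_id, region, instance_id, platform):
    """Return a list of human-readable errors (empty when all inputs look valid)."""
    errors = []
    if not ACCOUNT_ID.fullmatch(account_id):
        errors.append("account ID must be exactly 12 digits")
    if not REGION.fullmatch(region):
        errors.append("region does not look like an AWS region name")
    if not INSTANCE_ID.fullmatch(instance_id):
        errors.append("instance ID must look like i-<8 or 17 hex characters>")
    if platform.lower() not in PLATFORMS:
        errors.append("platform must be one of: " + ", ".join(PLATFORMS))
    return errors


# Placeholder values for illustration only.
for error in check_deploy_inputs("123456789012", "us-east-1",
                                 "i-0123456789abcdef0", "linux"):
    print(error)
```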
This will deploy/update the resources mentioned above as needed and configure the CloudWatch agent according to the provided configuration files.
The metric data can be found under the “procstatpoc” custom namespace in CloudWatch metrics, which is created once the agent configuration completes.
If a monitored process fails, an email notification is sent to the subscribed recipients.
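The exact alarm settings used by the stack are not described here. One plausible approach is to alarm when a process stops reporting metrics at all, by treating missing data as breaching. The Python sketch below builds such an alarm definition as a plain dictionary, using the parameter names of the CloudWatch PutMetricAlarm API, so it could be passed to boto3 as cloudwatch.put_metric_alarm(**alarm). The metric name, dimension names, alarm name prefix and topic ARN are assumptions for illustration; check the actual metric in the CloudWatch console, because an alarm only matches when the full dimension set is identical.

```python
def build_process_alarm(process_label, metric_name, dimensions, topic_arn,
                        namespace="procstatpoc", period_seconds=60):
    """Build PutMetricAlarm parameters for a 'process stopped reporting' alarm."""
    return {
        "AlarmName": f"process-down-{process_label}",  # hypothetical naming scheme
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": name, "Value": value}
            for name, value in sorted(dimensions.items())
        ],
        # Count datapoints per period; a stopped process publishes none.
        "Statistic": "SampleCount",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": 1.0,
        "ComparisonOperator": "LessThanThreshold",
        # No datapoints at all is treated as a breach, which triggers the alarm.
        "TreatMissingData": "breaching",
        "AlarmActions": [topic_arn],
    }


# All values below are placeholders, not taken from the stack.
alarm = build_process_alarm(
    process_label="nginx",
    metric_name="procstat_memory_rss",  # assumed metric name on Linux
    dimensions={"pidfile": "/var/run/nginx.pid", "process_name": "nginx"},
    topic_arn="arn:aws:sns:us-east-1:123456789012:process-alerts",
)
print(alarm["AlarmName"], alarm["TreatMissingData"])
```

Email delivery then depends on the SNS topic having a confirmed email subscription.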