At Shoppimon we’ve been relying a lot on Amazon infrastructure – it may not be the most cost-effective option for larger, more stable companies, but for small start-ups that need to be very dynamic, can’t afford high up-front costs and don’t have a large IT department, it’s a great choice. We do try to keep our code clean of any vendor-specific APIs, but when it comes to infrastructure & operations management, AWS (with help from tools like Chef) has been great for us.
One of the AWS tools we use is CloudWatch – it allows us to monitor our infrastructure and get alerted when things go wrong. While it’s not the most flexible monitoring tool out there, it takes care of most of what we need right now and has the advantage of not requiring us to run an additional server and configure tools such as Nagios or Cacti. With its custom metrics feature, we can even send some app-level data and monitor it using the common set of Amazon tools.
However, there’s one big missing feature in CloudWatch: it doesn’t monitor your instance’s memory utilization. I suppose Amazon has all sorts of technical reasons not to provide this very important metric out of the box (probably related to the fact that their monitoring is done from outside the instance VM), but really, if you need to monitor servers, memory utilization is one of the most important metrics to be aware of, alongside CPU load and I/O.
So, with a little bit of research I found some scripts that use the CloudWatch API to send memory utilization info to AWS as a custom metric. However, most of these scripts require that you provide some kind of credentials (API keys), and I feel really uncomfortable storing and managing API keys on all sorts of different machines, even with automation tools like Chef. The less I have to do it, the better. Amazon has a pretty nice answer for that – IAM Roles, which allow you to authorize access to specific AWS services (including S3 and CloudWatch) on a per-instance basis. Since we want all instances to be able to do certain things (like send their own metrics to CloudWatch or access our EC2-hosted private DEB repo), all our EC2 servers get some permissions via IAM roles. But I couldn’t find any solution that supports IAM roles and does the job right.
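To give an idea of how little such a role needs, a policy document along these lines (a sketch, not our actual policy) would be enough to let an instance push metrics and nothing else:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["cloudwatch:PutMetricData"],
      "Resource": "*"
    }
  ]
}
```

Attach a role carrying that policy to the instance, and any tool on it that understands the instance metadata service can push metrics without a single API key on disk.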
So, after a little bit of Python hacking using the wonderful boto library, I came up with this tiny utility that grabs memory and swap utilization percentages and sends them to CloudWatch as a custom metric. It relies on the machine having an IAM role set up, but I’m pretty sure that if you don’t want to use IAM Roles, you can simply create a boto config file with your AWS credentials instead.
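The core idea can be sketched roughly like this (a minimal sketch, not the actual gist – the `System/Linux` namespace, the metric names and the “used = total − free − buffers − cached” calculation are my assumptions):

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kB values."""
    info = {}
    for line in text.splitlines():
        key, _, value = line.partition(':')
        if value:
            info[key.strip()] = int(value.split()[0])  # values are in kB
    return info


def utilization(info):
    """Return (memory %, swap %) utilization figures."""
    # Buffers and page cache are reclaimable by the kernel,
    # so don't count them as "used" memory.
    used = info['MemTotal'] - info['MemFree'] - info['Buffers'] - info['Cached']
    mem_pct = 100.0 * used / info['MemTotal']
    if info.get('SwapTotal'):
        swap_pct = 100.0 * (info['SwapTotal'] - info['SwapFree']) / info['SwapTotal']
    else:
        swap_pct = 0.0  # no swap configured
    return mem_pct, swap_pct


def send_metrics(mem_pct, swap_pct, region='us-east-1'):
    """Push both figures to CloudWatch as custom metrics via boto."""
    import boto.ec2.cloudwatch
    import boto.utils
    # With an IAM role attached, boto picks up temporary credentials
    # from the instance metadata service - no API keys needed on disk.
    instance_id = boto.utils.get_instance_metadata()['instance-id']
    conn = boto.ec2.cloudwatch.connect_to_region(region)
    conn.put_metric_data('System/Linux', 'MemoryUtilization', value=mem_pct,
                         unit='Percent', dimensions={'InstanceId': instance_id})
    conn.put_metric_data('System/Linux', 'SwapUtilization', value=swap_pct,
                         unit='Percent', dimensions={'InstanceId': instance_id})
```

In a real run you’d feed `parse_meminfo()` the contents of `/proc/meminfo` and then call `send_metrics()` with the result.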
You can install it like so if you want it running via cron every minute (note that we use ‘nobody’ as the user; if you rely on a ~/.boto config file you may want to adjust this):
$ curl https://gist.githubusercontent.com/shevron/6204349/raw/cw-monitor-memusage.py | sudo tee /usr/local/bin/cw-monitor-memusage.py
$ sudo chmod +x /usr/local/bin/cw-monitor-memusage.py
$ echo "* * * * * nobody /usr/local/bin/cw-monitor-memusage.py" | sudo tee /etc/cron.d/cw-monitor-memusage
And that’s it – memory usage stats should now appear in your CloudWatch console, and you can create alarms based on them. Note that you may need to enable detailed monitoring on your instances for this to work – this comes at a small additional cost. I’m not sure whether this is actually required; you can try and see.
Feel free to use this little script for any purpose. If you improve it, please let us know!
EDIT: fixed the gist URL