For our MySQL databases on EC2, we back up the data by taking hourly snapshots of the volume that stores the MySQL data folder. This is a common practice, and in fact we do it using the popular ec2-consistent-snapshot script by Alestic. For backups this works great; in addition, having volume snapshots of MySQL data is very useful when we want to quickly launch a new MySQL server with a fairly recent state of our data – whether as a way to speed up synchronization of a new DB node or for testing purposes.
When launching a new EC2 instance we simply mount a new volume created from the latest DB snapshot on /var/lib/mysql and then start MySQL, and everything “just works” from there.
However, there are some quirks to this method: recently, I’ve encountered some errors from daily logrotate tasks which fail to flush the logs on a post-rotate script (this is from the logs of cron that runs logrotate):
At Shoppimon we’ve been relying a lot on Amazon infrastructure – it may not be the most cost effective option for larger, more stable companies but for small start-ups that need to be very dynamic, can’t have high up-front costs and don’t have a large IT department its a great choice. We do try to keep our code clean from any vendor-specific APIs, but when it comes to infrastructure & operations management, AWS (with help from tools like Chef) has been great for us.
One of the AWS tools we use is CloudWatch – it allows us to monitor our infrastructure and get alerted when things go wrong. While its not the most flexible monitoring tool out there, it takes care of most of what we need right now and has the advantage of not needing to run an additional server and configure tools such as Nagios or Cacti. With its custom metrics feature, we can even send some app-level data and monitor it using the common Amazon set of tools.
However, there’s one big missing feature in CloudWatch: it doesn’t monitor your instance memory utilization. I suppose Amazon has all sorts of technical reasons not to provide this very important metric out of the box (probably related to the fact that their monitoring is done from outside the instance VM), but really if you need to monitor servers, in addition to CPU load and IO, memory utilization is one of the most important metrics to be aware of.
Working quite a lot on the Ubuntu Server EC2 images, I am often faced with a need to create one or more additional admin users, which have the same permissions as the first user (“ubuntu” in these images). I did a little bit of searching but didn’t find any way to easily add a new user with the same groups as the Ubuntu user, so I crafted a little command. I’m pasting it here mostly for future self reference, and also in hope this helps someone:
Of course ‘ubuntu’ is the user you want to copy, and ‘newuser’ is the name of the new user.
Note that the new user will be in the admin group but will still require a password when using sudo (that’s because in the EC2 images ‘ubuntu’ is the only user with NOPASSWD privileges. I personally believe this is a good thing, but if you want you can always add NOPASSWD on the admin group in /etc/sudoers.
Due to an unfortunate shmelting accident (read: poor backup practices), I lost the SSH private key granting me the only way to access one of my EC2 hosted servers. Being unable to access the server, and unable to easily set a new public key through Amazon’s interfaces, I panicked for a few seconds. Then I started trying to hack my way in, and eventually found a way to set a new public key to my user. Here is what I did.
First, know that I was lucky: for this method to properly work, you need a few things:
The machine must be EBS based
You need to be able to afford a couple of minutes of downtime
You need to be able to withstand the effects of restarting the machine – for example, if you do not have an Elastic IP address associated with the machine, its public address will change. In some situations this is not acceptable.
After trying some different approaches, what worked for me was to do the following:
Generate a new keypair for yourself, and import the public key to your EC2 account
Start a new, clean, cheap machine (this will only be needed to do very simple things, so I recommend using a tiny machine) in the same availability zone as the affected machine
Stop the affected machine (do not terminate, STOP it – this is only possible with EBS machines)
Detach the root device from the affected machine (by default attached as /dev/sda1)
Attach the detached device to the new clean machine
SSH into the clean machine and mount the affected machine’s root filesystem somewhere (e.g. in /mnt/fs)
Now you can edit /mnt/fs/root/.ssh/authorized_keys (or on official Ubuntu machines /home/ubuntu/.ssh/authorized_keys) and add your new public key to it
Unmount the volume and terminate the clean machine – you no longer need it
Re-attach the root device to the affected machine (which should be stopped) – ensure to attach it as the same device it was before (e.g. /dev/sda1)
Re-start your old machine – you should now be able to use your new key!
Another approach which could work but I gave up on after a couple of attempts (I think it really depends on the init scripts in the machine you are using), is to stop the machine and change the User Data of it to a shell script that sets a new public key in the right place, then start it again.
Wow I haven’t posted in a while… I’m still at ZendCon in Santa Clara, and have just finished my last talk, which was about the different Zend Framework components that can be used to work with the Amazon Cloud Services of S3 and EC2.
Presentation was pretty good, although I had to hurry up in the end and skip some of the last slides.