Due to an unfortunate shmelting accident (read: poor backup practices), I lost the SSH private key that was my only way to access one of my EC2-hosted servers. Unable to access the server, and unable to easily set a new public key through Amazon’s interfaces, I panicked for a few seconds. Then I started trying to hack my way in, and eventually found a way to set a new public key for my user. Here is what I did.
First, know that I was lucky: for this method to work properly, you need a few things:
- The machine must be EBS-backed (its root device is an EBS volume)
- You need to be able to afford a couple of minutes of downtime
- You need to be able to withstand the effects of restarting the machine – for example, if you do not have an Elastic IP address associated with the machine, its public address will change. In some situations this is not acceptable.
After trying some different approaches, what worked for me was to do the following:
- Generate a new keypair for yourself, and import the public key to your EC2 account
- Start a new, clean, cheap machine (this will only be needed to do very simple things, so I recommend using a tiny machine) in the same availability zone as the affected machine
- Stop the affected machine (do not terminate, STOP it – this is only possible with EBS-backed machines)
- Detach the root device from the affected machine (by default attached as /dev/sda1)
- Attach the detached device to the new clean machine
- SSH into the clean machine and mount the affected machine’s root filesystem somewhere (e.g. in /mnt/fs)
- Now you can edit /mnt/fs/root/.ssh/authorized_keys (or on official Ubuntu machines /home/ubuntu/.ssh/authorized_keys) and add your new public key to it
- Unmount the volume, detach it from the clean machine, and terminate the clean machine – you no longer need it
- Re-attach the root device to the affected machine (which should still be stopped) – make sure to attach it as the same device it was before (e.g. /dev/sda1)
- Start your old machine again – you should now be able to use your new key!
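The steps above can be sketched in Python. This is only an illustration under some assumptions, not the exact procedure I ran: it assumes the AWS CLI is installed and configured, the function names (`move_volume`, `rescue_plan`, `append_public_key`) are made up for this example, and `/dev/sdf` is just a placeholder device name for the clean machine. The first two functions only build the command lists, they do not run anything; the third is meant to run on the clean machine after mounting the volume.

```python
from pathlib import Path

EC2 = ["aws", "ec2"]


def move_volume(volume_id, instance_id, device):
    """AWS CLI calls that detach a volume and attach it to instance_id."""
    return [
        EC2 + ["detach-volume", "--volume-id", volume_id],
        # Block until the volume is fully detached before attaching it elsewhere.
        EC2 + ["wait", "volume-available", "--volume-ids", volume_id],
        EC2 + ["attach-volume", "--volume-id", volume_id,
               "--instance-id", instance_id, "--device", device],
    ]


def rescue_plan(affected_id, helper_id, volume_id,
                root_device="/dev/sda1", helper_device="/dev/sdf"):
    """Return (before, after): commands to run before and after editing the key."""
    before = [
        # Stop, never terminate: terminating would destroy the instance.
        EC2 + ["stop-instances", "--instance-ids", affected_id],
        EC2 + ["wait", "instance-stopped", "--instance-ids", affected_id],
    ] + move_volume(volume_id, helper_id, helper_device)
    # Put the volume back under its original device name, then boot.
    after = move_volume(volume_id, affected_id, root_device) + [
        EC2 + ["start-instances", "--instance-ids", affected_id],
    ]
    return before, after


def append_public_key(authorized_keys, public_key):
    """Append public_key to an existing authorized_keys file.

    Returns False if the key is already present. Raises FileNotFoundError
    if the file does not exist, so a wrong mount path fails loudly instead
    of creating a new file owned by the wrong user.
    """
    path = Path(authorized_keys)
    key = public_key.strip()
    text = path.read_text()
    if key in (line.strip() for line in text.splitlines()):
        return False
    # Make sure the new key starts on its own line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a") as fh:
        fh.write(separator + key + "\n")
    return True
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. Between the `before` and `after` halves you still have to SSH into the clean machine, mount the volume, call something like `append_public_key("/mnt/fs/home/ubuntu/.ssh/authorized_keys", new_key)`, and unmount it again. Terminating the clean machine is left out on purpose.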
Another approach, which could work but which I gave up on after a couple of attempts, is to stop the machine, change its User Data to a shell script that sets a new public key in the right place, and then start it again. I think whether this works really depends on the init scripts in the machine you are using.
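For reference, here is a rough sketch of what such a User Data script could look like, generated from Python so the key is quoted safely. It is an untested assumption on my part, not something I got working: the function name `user_data_script` is made up, and the home directory layout (`/root` or `/home/<user>`) is assumed. On images that use cloud-init, user-data scripts normally run only on an instance's first boot, so a script like this may simply be ignored on a restart.

```python
import shlex


def user_data_script(public_key, user="ubuntu"):
    """Build a shell script that appends public_key to the user's authorized_keys."""
    # Assumed layout: root lives in /root, everyone else in /home/<user>.
    home = "/root" if user == "root" else f"/home/{user}"
    key = shlex.quote(public_key.strip())  # protect spaces and shell metacharacters
    lines = [
        "#!/bin/bash",
        f"mkdir -p {home}/.ssh",
        f"echo {key} >> {home}/.ssh/authorized_keys",
        # 'user:' sets the group to the user's login group (GNU chown).
        f"chown -R {user}: {home}/.ssh",
        # sshd tends to reject keys when these permissions are too open.
        f"chmod 700 {home}/.ssh",
        f"chmod 600 {home}/.ssh/authorized_keys",
    ]
    return "\n".join(lines) + "\n"
```

Calling `print(user_data_script(new_key))` gives text you can paste into the instance's User Data field while the machine is stopped.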
And really, you should back up your keys!