Say Hi to Shoppimon – Magento Monitoring for “Normal” People

For a while now I have been telling people I am “working on a small project” – and now it’s time to unveil the mystery and introduce Shoppimon, a new start-up I founded together with a small group of friends, and where I currently spend most of my time.

The idea of Shoppimon is simple – we want to provide Web monitoring and availability analysis that is usable by, and useful to, “normal” people – not only the tech guy, the programmer or the IT specialist, but the site owner, the business owner or even the marketing guy – in other words, the real stakeholder.

Continue reading

Storing Passwords the Right Way

I consider this post a bit of an experiment in writing about what I consider “beginner” material – not because it is necessarily simple or something everyone should already know, but simply because it is not a “new discovery” as far as I am concerned. I also usually try not to write about security-related material, as I do not consider myself a security expert. However, since I’m starting to teach a “PHP 101” course soon (maybe I’ll post more about it in the next few weeks), and since I was asked about this topic a few times recently, I’ve decided to write up my experience and test the reactions.

So, the topic in question is “what is the right way to store user passwords in my DB”. To be clear, I am talking specifically about the passwords users will use to log in to your application, not some third-party password you need to store for whatever reason. This is something almost any application out there requires – unless you interface with some external authentication mechanism (OAuth, OpenID, your office LDAP or Kerberos server), there’s a very high chance you’ll need to authenticate users against a self-stored user name and password.
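The full discussion is below the fold, but for context, the widely recommended approach (which PHP later standardized in its core API) is a slow, salted, one-way hash such as bcrypt. Here is a minimal sketch – not necessarily the exact code from the full post – using the password_hash() / password_verify() pair built into PHP 5.5 and later:

  <?php
  // Registration: hash the password before storing it. password_hash()
  // generates a random salt and embeds it in the resulting string.
  $hash = password_hash($password, PASSWORD_BCRYPT);
  // ... store $hash in the users table, never the plain password ...

  // Login: fetch the stored hash for this user, then verify.
  if (password_verify($password, $storedHash)) {
      // Correct password - log the user in
  } else {
      // Wrong password
  }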

Continue reading

My PHP Streams API article was published by php|architect

php|architect, one of the most prominent professional PHP magazines in the world, has published an article I wrote about PHP’s user-space Streams API in its December 2011 issue:

Go with the Flow: PHP’s Userspace Streams API

Almost every PHP application out there needs to read data from files or write data to files – or things that look like files but are not quite files – these unstructured blobs of data are commonly referred to as “streams”. Stream functions allow a scalable, portable and memory-efficient way to handle data, and pretty much any PHP developer out there knows how to read data from or write data to a stream. The best part is that you don’t have to be an extension author in order to provide access to any data source as if it was just a regular file. PHP’s userspace streams API allows you to do exactly that, and this article will show you how.
If you’re a subscriber, feel free to read the article and send me your feedback. If not, go ahead and buy the issue :)
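I can’t reproduce the article here, but to give a taste of the mechanism it covers: a userspace wrapper is a class whose methods PHP invokes for each stream operation, registered with stream_wrapper_register(). Here’s a toy read-only sketch (the protocol and class names are made up for illustration):

  <?php
  // A toy stream wrapper: upper://hello exposes an upper-cased copy
  // of the "host" part of the URL as if it were a file.
  class UpperCaseWrapper
  {
      public  $context;        // populated by PHP automatically
      private $data;
      private $position = 0;

      public function stream_open($path, $mode, $options, &$opened_path)
      {
          $this->data = strtoupper(parse_url($path, PHP_URL_HOST));
          return true;
      }

      public function stream_read($count)
      {
          $chunk = substr($this->data, $this->position, $count);
          $this->position += strlen($chunk);
          return $chunk;
      }

      public function stream_eof()
      {
          return $this->position >= strlen($this->data);
      }

      public function stream_stat()
      {
          return array('size' => strlen($this->data));
      }
  }

  stream_wrapper_register('upper', 'UpperCaseWrapper');
  echo file_get_contents('upper://hello'); // prints "HELLO"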

Replacing a lost SSH key on an Amazon EC2 machine

Due to an unfortunate shmelting accident (read: poor backup practices), I lost the SSH private key granting me the only way to access one of my EC2-hosted servers. Being unable to access the server, and unable to easily set a new public key through Amazon’s interfaces, I panicked for a few seconds. Then I started trying to hack my way in, and eventually found a way to set a new public key for my user. Here is what I did.

First, know that I was lucky: for this method to work properly, you need a few things:

  • The machine must be EBS based
  • You need to be able to afford a couple of minutes of downtime
  • You need to be able to withstand the effects of restarting the machine – for example, if you do not have an Elastic IP address associated with the machine, its public address will change. In some situations this is not acceptable.

After trying some different approaches, what worked for me was to do the following:

  1. Generate a new keypair for yourself, and import the public key to your EC2 account
  2. Start a new, clean, cheap machine (this will only be needed to do very simple things, so I recommend using a tiny machine) in the same availability zone as the affected machine
  3. Stop the affected machine (do not terminate, STOP it – this is only possible with EBS machines)
  4. Detach the root device from the affected machine (by default attached as /dev/sda1)
  5. Attach the detached device to the new clean machine
  6. SSH into the clean machine and mount the affected machine’s root filesystem somewhere (e.g. in /mnt/fs)
  7. Now you can edit /mnt/fs/root/.ssh/authorized_keys (or, on official Ubuntu machines, /mnt/fs/home/ubuntu/.ssh/authorized_keys) and add your new public key to it
  8. Unmount the volume and terminate the clean machine – you no longer need it
  9. Re-attach the root device to the affected machine (which should still be stopped) – be sure to attach it as the same device it was before (e.g. /dev/sda1)
  10. Re-start your old machine – you should now be able to use your new key!
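For reference, the instance and volume shuffling in steps 3–5, 9 and 10 can also be scripted. This post predates it, but today’s AWS SDK for PHP makes it look roughly like this – a sketch only, with placeholder instance and volume IDs (steps 6–8 still happen over SSH):

  <?php
  // Rough sketch of steps 3-5, 9 and 10 using the AWS SDK for PHP (v3).
  // 'i-affected', 'i-rescue' and 'vol-root' are placeholders.
  require 'vendor/autoload.php';

  use Aws\Ec2\Ec2Client;

  $ec2 = new Ec2Client(array('region' => 'us-east-1', 'version' => 'latest'));

  // Step 3: stop (do not terminate!) the affected machine
  $ec2->stopInstances(array('InstanceIds' => array('i-affected')));
  $ec2->waitUntil('InstanceStopped', array('InstanceIds' => array('i-affected')));

  // Step 4: detach its root volume
  $ec2->detachVolume(array('VolumeId' => 'vol-root'));
  $ec2->waitUntil('VolumeAvailable', array('VolumeIds' => array('vol-root')));

  // Step 5: attach the volume to the clean rescue machine
  $ec2->attachVolume(array(
      'VolumeId'   => 'vol-root',
      'InstanceId' => 'i-rescue',
      'Device'     => '/dev/sdf',
  ));

  // ... steps 6-8: SSH in, mount, edit authorized_keys, unmount, detach ...

  // Step 9: re-attach the volume as the original root device
  $ec2->attachVolume(array(
      'VolumeId'   => 'vol-root',
      'InstanceId' => 'i-affected',
      'Device'     => '/dev/sda1',
  ));

  // Step 10: start the affected machine again
  $ec2->startInstances(array('InstanceIds' => array('i-affected')));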

Another approach, which could work but which I gave up on after a couple of attempts (I think it really depends on the init scripts of the machine you are using), is to stop the machine, change its User Data to a shell script that sets a new public key in the right place, and then start it again.

And really, you should back up your keys!

Why I don’t like the term “NoSQL”

This is a rant post, but just to clarify things: it’s not a rant against the use of non-relational databases. I think that the shift in recent years – from a world in which relational databases were used almost exclusively, regardless of the need, to today’s situation, where it is possible and even considered a good idea to choose the best-fitting solution from any number of data storage paradigms – is a truly blessed change. I am a big fan of some non-relational database solutions, and to be honest, as a programmer I enjoy using some of them more than I enjoy MySQL or any other relational database.

This is a rant against the too-common term “NoSQL”. In my opinion, “NoSQL” is an example of layman terminology which does not properly describe the concepts it usually aims to describe, and it should not be used by professionals who are technical enough to understand the true meaning of these concepts.

“NoSQL” databases are all about the data model – in most cases, the term is used to describe any kind of storage engine (or database) in which data is stored in a non-relational manner: object storage, document storage, key-value storage etc. Indeed, the term is more about what the database is not than about what it is.

Relational data is data that can be described as a table – contrary to what some think, the term “relational database” has nothing to do with the ability to define and enforce relationships between data in different tables. If that were the case, MySQL using the MyISAM storage engine would not be a relational database. The term “relation” is a mathematical term which existed before the creation of relational databases, and is used to describe a relationship between finite data sets which can be described in a tabular manner (and I am not a mathematician, not even close – so I apologize in advance for this likely inaccurate description).
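For the curious, the textbook version of that hand-wavy description is short enough to quote: an n-ary relation R over sets A1, …, An is simply a subset of their Cartesian product,

  R \subseteq A_1 \times A_2 \times \cdots \times A_n

and each row of a table is one tuple in that subset – the tabular picture above is just this definition made visual.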

But SQL has nothing to do with this – SQL is the language used to send commands to the database, and nothing more. It is true that there is an almost 1-to-1 correlation between database engines that store data in a relational manner and database engines that use SQL as a query language, but saying that relational databases are SQL databases is like saying (assuming it’s 1984 again) that the Russian language should be abolished, when what we actually want to say is that communism is an unfitting economic system. It’s a poor way to describe your intentions, and it makes you sound like an ignorant moron.

There are many client libraries and wrappers that allow you to query a relational database such as MySQL or Oracle without writing any SQL code yourself. This doesn’t make them NoSQL databases. Some popular non-relational databases, such as Amazon SimpleDB and the Google App Engine Data Store, provide query languages that are quite similar to SQL. This doesn’t make them SQL databases. SQL is just a language, and a good one for what it’s supposed to do (putting aside all sorts of discrepancies between vendor-specific SQL implementations) – but it is simply not the thing that “NoSQL” databases are, or are not, about.

So, next time you want to use a term that describes all databases that do not store data in a tabular manner, use the term “non-relational” or, if you really like acronyms, “NonRDBMS” – not “NoSQL”. Or even better, use a term that describes what your preferred solution is, not what it is not. After all, when you say “non-relational storage engine”, you are probably not referring to your file system, right?

Bitbucket: Converting Hg repositories to Git

Recently I started using Bitbucket for private repository hosting for a project I’m working on. While I had no experience with Mercurial, I figured it couldn’t be that tricky – and Bitbucket offers free private hosting, which is what this project needed (it couldn’t go public, I couldn’t pay, and I didn’t have the time to set up self-hosted SCM hosting).

All in all I like Bitbucket (although I have to admit that in most aspects it seems to fall behind GitHub), but not so much Mercurial – for all sorts of reasons it felt quirky and less polished than Git, which, honestly, I have much more experience with.

So following Bitbucket’s big announcement of Git support, I decided to migrate my repositories from Hg to Git, while keeping them on Bitbucket and maintaining repository history. I’m happy to say it turned out to be a piece of cake. Here is what I did:

Step 1: Set up your repositories

First, I renamed my old Hg repository through Bitbucket’s web interface – say, from “MyProject” to “MyProject Hg”. This changes the repository URL, but since I wasn’t planning on using it anymore, that doesn’t really matter – plus you can always rename it back if things go bad.

Then, I created a new Git repository with the name of the previous repository, e.g. “MyProject”. Again, that can be easily done from Bitbucket’s web interface.

Step 2: Install the Mercurial hggit plugin

The hggit plugin (also known as hg-git) allows your Mercurial command-line tool, hg, to talk to Git repositories – that is, to push to and pull from Git. Installing it is easy, as it is probably available from your package manager. On a Mac, if you use MacPorts, you can run:

  $ sudo port install py26-hggit

While on Ubuntu, run:

  $ sudo aptitude install mercurial-git

Then, make sure to load the plugin by adding the following lines to your ~/.hgrc file:

  [extensions]
  hggit=

Congratulations: Your hg command now speaks Git!

Step 3: Push your code into your Git repository

To push your code into your new Git repository, you basically need to run two commands:

First, create a Mercurial bookmark named master that points at your default branch. This will help Git create the right refs later on:

  $ cd ~/myproject-hg-repo/
  $ hg bookmark -r default master

Next, simply push your code into the newly created Git repository:

  $ hg push git+ssh://git@bitbucket.org/shaharevron/myproject.git

Of course, make sure to change the repository URL to the URL of your new Git repository. To make sure hg understands you’re referring to a Git repository when using SSH, add the git+ssh:// prefix to the URL (note that with this prefix, the path is separated from the host by a slash, not a colon). This should push your entire repository to the new Git repository, and within a few seconds to a few minutes (depending on how big your repository is), you should be able to see all your old commits in the new Git repo.

Step 4: Switch your local repository to use Git

Now that your new Git repo is up on Bitbucket, you’ll need to switch to using Git locally. There are two paths you can take here: the safe one is to simply git clone your code into a new working directory and work from there. It’s safe, and it will work well. However, if you’re a cowboy like me, and are too lazy to create a new IDE project on a different directory, you can in fact simply switch to working with Git in the same directory (but I still seriously recommend you make sure Bitbucket really does have your code as backup…).

Here is how to do it. From the local repository directory, run:

  $ git init
  $ git remote add origin git@bitbucket.org:shaharevron/myproject.git
  $ git pull origin master
  $ git reset --hard HEAD

Again, replace the repository URL with your own. This will “merge” everything in your Git repo into the local working directory. Create a new .gitignore file if needed – and you can now simply delete the .hg directory, as it is no longer needed. You can now happily use Git with your Bitbucket code.

While there shouldn’t be any problems, I also recommend keeping your old Hg repository around on Bitbucket for a few days, just to make sure nothing blows up – you can delete it through Bitbucket’s web interface once you’re sure everything works well.

Goodbye, Zend

Ok, the title kind of says it all – this has been known to some for a few months now, but for the sake of clearing things up: I’m leaving Zend – or, to be technically accurate, have already left.

This was not an easy decision for me, as for more than six years Zend has been not only my employer but also my school, my workshop and a little bit of a home as well. This sounds like bullshit – but since I started there as a first-level support engineer and am leaving as a co-Product Manager for the company’s flagship product, I think it’s fair to say I gained as much as I contributed.

However, it’s time for me to move on. I’m looking into doing my own things, on my own time, and being my own boss. I want more free time to experiment, play and pursue my hobbies and silly ideas.

In recent years the company took some directions I was not 100% happy with, and being responsible for realizing some of these ideas, it was hard for me to stay in my role. I started thinking about what I wanted to do next – and was offered a couple of very tempting roles within Zend – but then I realized that, really, I want to make a bigger change.

And here I am.

For the last couple of months I have reduced my role at Zend to a two-day consulting position, and will continue to consult for Zend on a part-time basis for at least a few weeks. In the rest of my time, I plan to think, read, paddle, rest, blog more and work on some ideas I have. I plan to keep contributing to the PHP community, and specifically to Zend Framework 2.0.

Will I chicken out in 3 months and decide to get a real job again? Maybe… but I plan to make the most of my time until then.

Blog Moved

I’ve just moved my blog to a new address (as you can hopefully see). It’s much shorter, and I hope that now that I don’t have to type a very long address every time I want to post something, I might blog a little more (say, once every 4 months instead of 6).

Once I’ve made sure everything works well in the new home, I’ll post some news – so stay tuned!

HTML 5 Canvas Game of Life

I recently started looking into different HTML5-related technologies, one of the most exciting being the new Canvas tag and API.

As a small test, I’ve implemented a little Game of Life toy using HTML5 Canvas, which you can see in action here: http://arr.gr/playground/life/ (view the source to see the code behind it).

Game of Life in HTML5 Canvas

The algorithm is not very smart, so it’s kind of slow and CPU-intensive, but still fun to watch. It works nicely on Firefox 4.0 and the latest Chrome and Safari versions, and runs a bit slowly on Firefox 3.6. I did not test with any IE version, but I do not expect it to work in IE 6 or 7; it may work in 8, and probably will in 9.

I think Game of Life by itself is worth at least an entire post regardless of this HTML5 implementation, especially because I’m a big fan of things that bring CS and philosophy together, so I may write about it at a later point. For now, I suggest you let it run for a while (a few hundred generations) and see what you get :)
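The Canvas version is in JavaScript (again, view the page source for the real thing), but the naive algorithm itself is tiny in any language. Here is a rough PHP sketch of a single generation – visiting every cell and counting all eight of its neighbours is exactly the “not very smart” part:

  <?php
  // One generation of Conway's Game of Life. $grid is a 2D array of 0/1
  // values; the grid wraps around its edges (a torus).
  function nextGeneration(array $grid)
  {
      $rows = count($grid);
      $cols = count($grid[0]);
      $next = $grid;

      for ($y = 0; $y < $rows; $y++) {
          for ($x = 0; $x < $cols; $x++) {
              // Count the 8 neighbours of the current cell
              $neighbours = 0;
              for ($dy = -1; $dy <= 1; $dy++) {
                  for ($dx = -1; $dx <= 1; $dx++) {
                      if ($dx === 0 && $dy === 0) {
                          continue; // skip the cell itself
                      }
                      $ny = ($y + $dy + $rows) % $rows;
                      $nx = ($x + $dx + $cols) % $cols;
                      $neighbours += $grid[$ny][$nx];
                  }
              }
              if ($grid[$y][$x]) {
                  // A live cell survives with 2 or 3 neighbours
                  $next[$y][$x] = ($neighbours == 2 || $neighbours == 3) ? 1 : 0;
              } else {
                  // A dead cell comes to life with exactly 3 neighbours
                  $next[$y][$x] = ($neighbours == 3) ? 1 : 0;
              }
          }
      }

      return $next;
  }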

ZendCon 10 talk: Amazon Services in Zend Framework

Wow, I haven’t posted in a while… I’m still at ZendCon in Santa Clara, and have just finished my last talk, which was about the different Zend Framework components that can be used to work with Amazon’s cloud services, S3 and EC2.

The presentation went pretty well, although I had to hurry up at the end and skip some of the last slides.

The slides are now up on SlideShare, and can be downloaded or viewed online.
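For a taste of what the talk covered, here is a minimal sketch of working with S3 through Zend Framework 1’s Zend_Service_Amazon_S3 component – the credentials and bucket name are placeholders, and this is an illustration rather than code from the slides:

  <?php
  // Minimal S3 usage with Zend Framework 1's Zend_Service_Amazon_S3.
  require_once 'Zend/Service/Amazon/S3.php';

  $s3 = new Zend_Service_Amazon_S3('MY-ACCESS-KEY', 'MY-SECRET-KEY');

  // Create a bucket to hold our objects
  $s3->createBucket('my-demo-bucket');

  // Upload an object, making it publicly readable
  $s3->putObject(
      'my-demo-bucket/hello.txt',
      'Hello from Zend Framework!',
      array(Zend_Service_Amazon_S3::S3_ACL_HEADER =>
            Zend_Service_Amazon_S3::S3_ACL_PUBLIC_READ)
  );

  // Read it back
  echo $s3->getObject('my-demo-bucket/hello.txt');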