<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Pseudo Random Bytes</title>
	<atom:link href="http://arr.gr/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://arr.gr/blog</link>
	<description>Shahar writes about the Web, programming and other stuff</description>
	<lastBuildDate>Sat, 02 Feb 2013 13:55:31 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5</generator>
		<item>
		<title>Generators in PHP 5.5</title>
		<link>http://arr.gr/blog/2013/02/generators-in-php-5-5/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=generators-in-php-5-5</link>
		<comments>http://arr.gr/blog/2013/02/generators-in-php-5-5/#comments</comments>
		<pubDate>Sat, 02 Feb 2013 13:55:31 +0000</pubDate>
		<dc:creator>shahar</dc:creator>
				<category><![CDATA[PHP & Web Technologies]]></category>
		<category><![CDATA[generators]]></category>
		<category><![CDATA[iterators]]></category>
		<category><![CDATA[php55]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://arr.gr/blog/?p=252</guid>
		<description><![CDATA[Now that PHP 5.5 alpha versions are being released, I decided to grab the latest PHP source from GitHub, build it and give the new Generators feature a spin. I have used generators in the past in Python, and was excited to hear they are coming to PHP. While they are useful mostly in advanced [...]]]></description>
				<content:encoded><![CDATA[<p>Now that PHP 5.5 alpha versions are being released, I decided to grab the latest PHP source from GitHub, build it and give the new <a href="http://php.net/manual/en/language.generators.overview.php">Generators</a> feature a spin. I have used generators in the past in <a href="http://wiki.python.org/moin/Generators">Python</a>, and was excited to hear they are coming to PHP. While they are mostly useful in advanced use cases, they can also make a lot of simple use cases much more efficient, and I think they are a handy addition to the advanced PHP programmer&#8217;s toolbox.</p>
<h2>What are Generators?</h2>
<p>I like to describe Generators as special functions which are iterable and maintain state. Think of a function that, instead of returning once and destroying its state (local variables) after returning, can return multiple times while keeping its local variables intact, so that you can iterate over the values it produces. In fact, a call to a generator function creates a special <em>Generator</em> object which can be iterated. The object maintains the internal state of the generator, and on each iteration generates a new value. The same result can be achieved by writing a class that implements the <em>Iterator</em> interface, but a generator needs much less code.</p>
<p>This is very different from the way we are used to thinking of functions, so maybe an example is the best way to demonstrate this. I will use a simplified example based on the one given in the documentation:</p>
<pre class="brush: php; title: ; notranslate">

function xrange($start, $end, $step = 1)
{
  for ($i = $start; $i &lt;= $end; $i += $step) {
    yield $i;
  }
}

$start = microtime(true);
foreach (xrange(0, 1000000) as $i) {
  // do nothing
}
$end = microtime(true);

echo &quot;Total time: &quot; . ($end - $start) . &quot; sec\n&quot;;
echo &quot;Peak memory usage: &quot; . memory_get_peak_usage() . &quot; bytes\n&quot;;

</pre>
<p>In the example above, <em>xrange</em> is a generator that works like a simplified version of PHP&#8217;s <a href="http://php.net/range"><em>range()</em></a> function (just like in Python!). The main thing to notice is the <em>yield</em> keyword &#8211; it tells the function to <em>yield</em> a value, which means the value is &#8220;returned&#8221; to the caller but the state of the generator is maintained.</p>
<p>When iterating over a generator function, as you can see in the <em>foreach</em> loop, iteration continues as long as a value is yielded. Once the function returns without yielding (as xrange in our example would do once the inner <em>for</em> loop is done), iteration stops. We get a behaviour which is (almost) equivalent to <em>range</em> in the sense that it allows us to iterate over numbers &#8211; but, without allocating the entire array of numbers in advance. In our example, we save a lot of memory and in fact execution is faster when a generator is used.</p>
<p>To demonstrate, here is the output of the script above (ok, I added some formatting to the output, but the results are real!):</p>
<pre>$ /usr/local/bin/php /tmp/with-generators.php
Total time: 0.20149302482605 sec
Peak memory usage: 234,256 bytes</pre>
<p>This is on a one-million-integer &#8220;array&#8221; (unlike <em>range</em>, no real array is allocated so we can&#8217;t do random access on members, but during iteration it behaves just like an array).</p>
<p>By comparison, executing the same code with <em>range</em>() instead of <em>xrange()</em>, results in the following:</p>
<pre>$ /usr/local/bin/php /tmp/without-generators.php
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 32 bytes) in /private/tmp/generators.php on line 12</pre>
<p>Ok, we reach our memory limit. Let&#8217;s try to go crazy (not a good idea in production):</p>
<pre>$ /usr/local/bin/php -d memory_limit=200M /tmp/without-generators.php
Total time: 0.31754398345947 sec
Peak memory usage: 144,617,256 bytes</pre>
<p>After increasing the memory limit to 200 MB, the script runs, but it takes longer (honestly, to my surprise) and consumes roughly 600 times more memory (about 144.6 MB versus 234 KB).</p>
<p>Pretty cool, huh?</p>
<p>Just to demonstrate, calling <em>var_dump</em> on a generator would result in this:</p>
<pre class="brush: php; title: ; notranslate">

var_dump(xrange(0, 100));
// Output:
// object(Generator)#2 (0) {
// }

</pre>
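<p>Because the <em>Generator</em> object implements PHP&#8217;s <em>Iterator</em> interface, it can also be stepped by hand instead of with <em>foreach</em>. The following is a minimal sketch (not part of the benchmark above) that reuses the <em>xrange</em> generator; the <em>$seen</em> array is only there for illustration:</p>

```php
<?php
function xrange($start, $end, $step = 1)
{
    for ($i = $start; $i <= $end; $i += $step) {
        yield $i;
    }
}

// Calling the function does not run its body; it only creates the object
$gen = xrange(1, 3);

$seen = array();
while ($gen->valid()) {          // runs the body up to the next yield
    $seen[] = $gen->current();   // the value passed to yield
    $gen->next();                // resume right after the yield
}

echo implode(', ', $seen), "\n";
```

<p>Each call to <em>next()</em> resumes the function body right after the last <em>yield</em>, with all local variables intact, which is the &#8220;maintained state&#8221; described earlier.</p>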
<h2>But I can do the same thing with Iterator interfaces, no?</h2>
<p>Yes! Pretty much anything you can do with Generators can be done by creating a class which implements either the <em>Iterator</em> or <em>IteratorAggregate</em> interface. But in many cases, a lot of boilerplate code can be removed if a Generator is used instead. For example, a class equivalent to the <em>xrange</em> generator above would look like this:</p>
<pre class="brush: php; title: ; notranslate">

class XrangeObject implements Iterator
{
  private $value = 0;
  private $start = 0;
  private $end   = 0;
  private $step  = 1;

  public function __construct($start, $end, $step = 1)
  {
    $this-&gt;value = (int) $start;
    $this-&gt;start = (int) $start;
    $this-&gt;end   = (int) $end;
    $this-&gt;step  = (int) $step;
  }

  public function rewind()
  {
    $this-&gt;value = $this-&gt;start;
  }

  public function current()
  {
    return $this-&gt;value;
  }

  public function key()
  {
    return $this-&gt;value;
  }

  public function next()
  {
    return ($this-&gt;value += $this-&gt;step);
  }

  public function valid()
  {
    return $this-&gt;value &lt;= $this-&gt;end;
  }
}

$start = microtime(true);
$xrange = new XrangeObject(0, 1000000);
foreach ($xrange as $i) {
  // do nothing
}
$end = microtime(true);

echo &quot;Total time: &quot; . ($end - $start) . &quot; sec\n&quot;;
echo &quot;Peak memory usage: &quot; . memory_get_peak_usage() . &quot; bytes\n&quot;;
</pre>
<p>Wow, that&#8217;s much more code for something we achieved very simply with a generator. BTW, the results are:</p>
<pre class="brush: plain; title: ; notranslate">

$ /usr/local/bin/php /tmp/with-iterator.php
Total time: 0.61971187591553 sec
Peak memory usage: 240,968 bytes

</pre>
<p>As you can see, memory usage is comparable to that of the generator. Run time is more than 3 times slower, but in most realistic use cases this difference is negligible &#8211; unless we had seen an order of magnitude of difference, performance is not a major issue here. The interesting thing really is the amount of boilerplate code we had to write to create an iterator &#8211; most of it is generic, boring stuff and not what we really care about. With Generators, the implementation is much shorter.</p>
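<p>The two approaches can also be combined. As a rough sketch, assuming PHP 5.5 or later, a class can implement <em>IteratorAggregate</em> and write <em>getIterator()</em> as a generator, which keeps the object-oriented wrapper but drops most of the boilerplate. The class name <em>XrangeAggregate</em> is made up for this example:</p>

```php
<?php
class XrangeAggregate implements IteratorAggregate
{
    private $start;
    private $end;
    private $step;

    public function __construct($start, $end, $step = 1)
    {
        $this->start = (int) $start;
        $this->end   = (int) $end;
        $this->step  = (int) $step;
    }

    // The next line is a plain comment before PHP 8 and an attribute
    // afterwards; it only silences a return type notice on PHP 8.1+.
    #[\ReturnTypeWillChange]
    public function getIterator()
    {
        // Using yield here makes getIterator() return a Generator,
        // which is itself an Iterator
        for ($i = $this->start; $i <= $this->end; $i += $this->step) {
            yield $i;
        }
    }
}

$values = array();
foreach (new XrangeAggregate(0, 10, 5) as $i) {
    $values[] = $i;
}

echo implode(', ', $values), "\n";
```

<p>Unlike a bare generator, such an object can be iterated more than once, because every <em>foreach</em> asks <em>getIterator()</em> for a fresh generator.</p>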
<h2>How about a realistic use case?</h2>
<p>Ok, so we have used a generator to iterate over numbers. Woopti-doo. We can just drop the generator and use the <em>for</em> loop inside it to achieve the same thing. How about a more realistic use case?</p>
<p>Take a look at the following example, which I believe can be pretty useful and still has fairly straightforward code: a generator which combines the efficiency of <em>XMLReader</em> with the simple API of <em>SimpleXML</em> to bring you an efficient yet easy to use XML reader function for possibly large XML streams with repeating structure &#8211; for example, RSS or Atom feeds.</p>
<pre class="brush: php; title: ; notranslate">

function xml_stream_reader($url, $element)
{
  $reader = new XMLReader();
  if (! $reader-&gt;open($url)) {
    throw new RuntimeException(&quot;Unable to open XML stream: $url&quot;);
  }

  while (true) {
    // Skip to next element
    while (! ($reader-&gt;nodeType == XMLReader::ELEMENT &amp;&amp; $reader-&gt;name == $element)) {
      if (! $reader-&gt;read()) break(2);
    }

    if ($reader-&gt;nodeType == XMLReader::ELEMENT &amp;&amp; $reader-&gt;name == $element) {
      yield simplexml_load_string($reader-&gt;readOuterXml());
      $reader-&gt;next();
    }
  }
}
</pre>
<p>The <em>xml_stream_reader()</em> generator defined above will use XMLReader to open and read from an XML stream. Unlike PHP&#8217;s <em>SimpleXML</em> or <em>DOM</em> extensions, it will not read an entire XML document into memory, thus avoiding potential blowups on very large XML files. To keep things simple for the user however, whenever it encounters the XML element the user is searching for (e.g. the <em>item</em> element in RSS feeds), it will read the entire element into memory (assume each item is small but there are potentially thousands of items) and yield it as a <em>SimpleXMLElement</em> object &#8211; thus still providing the ease of use of <em>SimpleXML</em> for the consumer.</p>
<p>Here is how it can be used:</p>
<pre class="brush: php; title: ; notranslate">

$feed = xml_stream_reader('http://news.google.com/?output=rss&amp;num=100', 'item');
foreach($feed as $itemXml) {
  echo $itemXml-&gt;title . &quot;\n&quot;;
}

</pre>
<p>While I couldn&#8217;t find a large-enough XML file to test this on, even with 2 MB files this can be much more efficient than DOM or SimpleXML, without much more code.</p>
<p>So I&#8217;m really happy about the addition of generators &#8211; it&#8217;s a cool feature. Not one you&#8217;d use every day, but in some places where complex Iterators had to be implemented (and where OO features such as polymorphism are not required), generators can be a really neat, concise and maintainable solution.</p>
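<p>To try the generator without network access, here is a small self-contained sketch. It repeats the function (slightly condensed) so the snippet runs on its own, and feeds it a tiny made-up RSS document written to a temporary file. The sample feed and the <em>$titles</em> array are assumptions for the demo, and the <em>xmlreader</em> and <em>simplexml</em> extensions are assumed to be enabled:</p>

```php
<?php
function xml_stream_reader($url, $element)
{
    $reader = new XMLReader();
    if (! $reader->open($url)) {
        throw new RuntimeException("Unable to open XML stream: $url");
    }

    while (true) {
        // Skip forward until we are positioned on a matching element
        while (! ($reader->nodeType == XMLReader::ELEMENT && $reader->name == $element)) {
            if (! $reader->read()) break 2;
        }

        // Only this one element is expanded into memory
        yield simplexml_load_string($reader->readOuterXml());

        // Move past the element's subtree
        $reader->next();
    }
}

// A made-up two-item feed, written to a temp file for the demo
$file = tempnam(sys_get_temp_dir(), 'feed');
file_put_contents($file, '<rss><channel><title>Demo</title>'
    . '<item><title>First</title></item>'
    . '<item><title>Second</title></item>'
    . '</channel></rss>');

$titles = array();
foreach (xml_stream_reader($file, 'item') as $itemXml) {
    $titles[] = (string) $itemXml->title;
}
unlink($file);

echo implode(', ', $titles), "\n";
```

<p>Note that the channel&#8217;s own <em>title</em> element is skipped, since only <em>item</em> elements are yielded.</p>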
]]></content:encoded>
			<wfw:commentRss>http://arr.gr/blog/2013/02/generators-in-php-5-5/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Serving ZF apps with the PHP 5.4 built-in Web Server</title>
		<link>http://arr.gr/blog/2012/08/serving-zf-apps-with-the-php-54-built-in-web-server/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=serving-zf-apps-with-the-php-54-built-in-web-server</link>
		<comments>http://arr.gr/blog/2012/08/serving-zf-apps-with-the-php-54-built-in-web-server/#comments</comments>
		<pubDate>Sat, 18 Aug 2012 09:36:45 +0000</pubDate>
		<dc:creator>shahar</dc:creator>
				<category><![CDATA[PHP & Web Technologies]]></category>
		<category><![CDATA[built-in-ws]]></category>
		<category><![CDATA[learnsomething]]></category>
		<category><![CDATA[PHP 5.4]]></category>
		<category><![CDATA[zf]]></category>
		<category><![CDATA[zf2]]></category>

		<guid isPermaLink="false">http://arr.gr/blog/?p=246</guid>
		<description><![CDATA[When teaching PHP to newcomers, I have found that (honestly to my surprise) one of the biggest barriers you have to cross is setting the stack up to serve PHP files properly, especially when it comes to Zend Framework apps and other rewrite rule based MVC applications. Even with strong development background, the idea of [...]]]></description>
				<content:encoded><![CDATA[<p>When teaching PHP to newcomers, I have found that (honestly to my surprise) one of the biggest barriers you have to cross is setting the stack up to serve PHP files properly, especially when it comes to Zend Framework apps and other rewrite rule based MVC applications. Even for people with a strong development background, the idea of setting up Web server configuration to get things working seems foreign to many.</p>
<p>Even as an experienced developer with good knowledge of the LAMP stack setup, setting up new vhosts and other configuration for each new project is sometimes a pain in the ass.</p>
<p>There is, of course, good news &#8211; starting from PHP 5.4, the Command Line Interface (CLI) version of PHP comes with a built-in Web server that can be used to serve PHP apps in development. This Web server is very easy to use &#8211; you just fire it up in the right place and it works, serving your PHP files. While it is by no means a viable production solution (it is a sequential, no-concurrency server, meaning it will only serve one request at a time), it is very convenient for development purposes.</p>
<p>While it &#8220;just works&#8221; for simple &#8220;1-to-1 URL &lt;-&gt; File&#8221; apps, it can work almost as easily for rewrite based MVC apps, including Zend Framework 1.x and 2.x apps and probably for other frameworks as well.</p>
<p><span id="more-246"></span>As a start, here is how you launch the Built-in Web server (assuming you have PHP 5.4 in your PATH):</p>
<pre class="brush: plain; title: ; notranslate">
$ php -S localhost:8080

</pre>
<p>The -S flag means &#8220;start the built-in server and listen on localhost port 8080&#8221; (which of course can be changed, but usually in development you want it listening to local requests only). You can always hit Ctrl+C to quit the server.</p>
<p>This tells PHP to treat the current directory as the document root, and serve files as they are requested &#8211; basically if $_SERVER['REQUEST_URI'] matches a local file it will be served (static file extensions are identified and sent as-is, while other files are parsed as PHP files and are executed). If the request URI does not match a local file, a 404 error will be returned.</p>
<p>This works well for simple no-rewrite apps &#8211; but what about more complex MVC apps? That&#8217;s easy too &#8211; all you have to do is give the Web server a file which will be used as a catch-all &#8220;router&#8221; script, so in a ZF app for example, you can do the following:</p>
<pre class="brush: plain; title: ; notranslate">
$ php -S localhost:8080 public/index.php

</pre>
<p>This means all requests will now go through public/index.php. If you try it with your ZF app, you will see it works almost perfectly &#8211; ZF views are rendered very nicely. The only problem is that static file references (images, CSS files and JavaScript mostly) do not work. That&#8217;s because these requests are routed through index.php as well &#8211; and it does not know how to handle those.</p>
<p>Of course, you could write a complex plugin to intercept such requests and handle them from within ZF, serving static content using, for example, PHP&#8217;s readfile() function. But this seems like a huge overhead for something which is only needed in development environments (in production, you will have the proper rewrite rules set up for your production-grade web server of choice, of course).</p>
<p>My solution came from reading a little bit more about the built-in Web server. It turns out that if the router script returns FALSE, the web server will fall back to serving the file directly, as if no router script was specified in the command line. So I wrote this tiny little wrapper PHP script that I now use in my ZF app. You can even include it in the source tree as it is a virtually harmless file to have even in a production environment:</p>
<p><script src="https://gist.github.com/3385625.js"> </script></p>
<p>Save this file in your project as public/builtin-ws-wrapper.php and then run the built-in Web server like so:</p>
<pre class="brush: plain; title: ; notranslate">
$ php -S localhost:8080 public/builtin-ws-wrapper.php

</pre>
<p>In a way very similar to how the recommended ZF rewrite rules work, the script checks if the file being requested is a real file (and is under the document root) and if so returns FALSE &#8211; this tells the built-in Web server to handle it from here. Otherwise, it simply includes the framework&#8217;s main entry file, index.php, and lets Zend Framework take care of the rest. </p>
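<p>The embedded gist does not show up in every feed reader, so here is a rough sketch of what such a wrapper could look like. This is not the gist itself but an illustration of the logic described above; the helper name <em>is_static_file_request()</em> is made up, and the script is assumed to live in <em>public/</em> next to <em>index.php</em>:</p>

```php
<?php
// Hypothetical helper: does the request map to a real file that lives
// under the document root?
function is_static_file_request($docRoot, $requestUri)
{
    // Strip the query string, keep only the path part
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (! is_string($path)) {
        return false;
    }

    $root = realpath($docRoot);
    $file = realpath($docRoot . '/' . ltrim(rawurldecode($path), '/'));

    // realpath() resolves ".." segments, so the prefix check rejects
    // anything that points outside the document root
    return $root !== false
        && $file !== false
        && is_file($file)
        && strpos($file, $root . DIRECTORY_SEPARATOR) === 0;
}

// Only act as a router when running under the built-in Web server
if (php_sapi_name() === 'cli-server') {
    if (is_static_file_request(__DIR__, $_SERVER['REQUEST_URI'])) {
        return false; // let the built-in server send the file as-is
    }
    require __DIR__ . '/index.php';
}
```

<p>Saved as public/builtin-ws-wrapper.php, a script along these lines would be started with the same command shown above.</p>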
]]></content:encoded>
			<wfw:commentRss>http://arr.gr/blog/2012/08/serving-zf-apps-with-the-php-54-built-in-web-server/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>On PHP Extensions</title>
		<link>http://arr.gr/blog/2012/06/on-php-extensions/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=on-php-extensions</link>
		<comments>http://arr.gr/blog/2012/06/on-php-extensions/#comments</comments>
		<pubDate>Sat, 16 Jun 2012 21:33:46 +0000</pubDate>
		<dc:creator>shahar</dc:creator>
				<category><![CDATA[Linux & FOSS]]></category>
		<category><![CDATA[PHP & Web Technologies]]></category>
		<category><![CDATA[c]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[extensions]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://arr.gr/blog/?p=241</guid>
		<description><![CDATA[While teaching PHP I mention the term &#8220;extension&#8221; quite a lot &#8211; but I have realized that this may very well be a confusing term for non-PHPers. While most PHP training courses focus on code, I believe getting to know the PHP engine and environment is almost as important as learning to use the language. [...]]]></description>
				<content:encoded><![CDATA[<p>While teaching PHP I mention the term &#8220;extension&#8221; quite a lot &#8211; but I have realized that this may very well be a confusing term for non-PHPers. While most PHP training courses focus on code, I believe getting to know the PHP engine and environment is almost as important as learning to use the language. Extensions are a big and important part of PHP, but there seems to be a big knowledge gap about them, and Googling for articles that explain what extensions are, and how they are used and installed, hardly returns good results. So, as part of my attempt to blog about more basic PHP topics, I&#8217;ve decided to try and come up with an overview of PHP extensions.</p>
<h2>So what are PHP extensions?</h2>
<p>An <em>extension</em> in PHP is in fact a module providing some functionality to the PHP Engine. While the term makes it sound like extensions provide some kind of special functionality, in reality many of the language&#8217;s most basic functions and classes are provided in extensions. Many extensions are shipped as part of the default PHP distribution, and some are in fact compiled into PHP in such a way that they cannot even be unloaded. Come to think of it, perhaps it is best to think of PHP extensions as &#8220;language modules&#8221;.</p>
<p><span id="more-241"></span></p>
<p>Another important aspect of Extensions (in the PHP world) is that like PHP they are written in C (and on rare occasions C++) and are compiled and loaded into PHP as a shared object or DLL, or compiled statically into the PHP binary. This is in contrast with libraries, components and frameworks which in most cases are written and distributed as PHP code.</p>
<p>Some examples you are most likely familiar with are the <em>mysql</em> extension providing all the <em>mysql_* </em>functions, or the <em>gd</em> extension providing image processing functions. Both of these extensions are almost always included with the default PHP distribution. More examples include the <em>session </em>and <em>SPL </em>extensions, which are hardly separable from PHP but are considered extensions because mechanically they use PHP&#8217;s internal <em>Extension API</em>.</p>
<p>Most extensions provide new user-space API (PHP functions, classes and constants). Some extensions may not provide new APIs directly, but interact with other extensions or with PHP&#8217;s core to provide added functionality &#8211; for example, the PDO extension provides new API, but the PDO_MySQL and PDO_OCI extensions do not &#8211; they merely add capabilities to the PDO extension.</p>
<h2>How are extensions shipped?</h2>
<p>As mentioned above, PHP extensions are written in C or C++, which means they are usually shipped as C/C++ source code and sometimes as pre-compiled, OS and CPU architecture specific binaries. Many extensions are shipped with PHP &#8211; and if they are not already enabled, all you need to do is enable them in order to use them (more on that later).</p>
<p>There are also many extensions which are not shipped with PHP but are developed in PHP&#8217;s official extension repository, called <a href="http://pecl.php.net/">PECL</a> (the PHP Extension Community Library). PECL extensions can be downloaded, compiled, and added to your PHP installation provided that you are running the right version of PHP. This can be done automatically using a command line tool called &#8216;<em>pecl&#8217;</em>, or can be done manually using a set of standard build tools. For Windows users, you may compile extensions manually but pre-built extension DLLs can sometimes be found around the Internet for many of the popular extensions.</p>
<p>If you have the right build tool chain installed, installing an extension using <em>pecl </em>is usually dead simple:</p>
<pre class="brush: plain; title: ; notranslate">
$ pecl install apc
</pre>
<p>And that&#8217;s more or less it. If your extension builds on a 3rd party C library (as do many extensions), you will need to have that library installed as well.</p>
<p>In order to be accepted into PECL, extensions must adhere to certain standards set by the PHP development team. This means you can usually trust PECL extensions to work well, provided that they are actively maintained and are considered stable by their development teams.</p>
<p>For example, the APC, gnupg and sphinx extensions are all relatively popular extensions provided through PECL.</p>
<p>PHP extensions may also be developed outside of PECL &#8211; sometimes due to legal / licensing reasons, and sometimes for other reasons. It is recommended to be more cautious with such extensions, as they may not adhere to the standards of the PHP development team.</p>
<h2>Enabling and Disabling PHP extensions</h2>
<p>Once you have an extension installed (this usually means you have this extension&#8217;s DLL or .so file placed in your PHP&#8217;s extension directory), you will need to enable it in order to use it.</p>
<p>First, check the output of <em>phpinfo()</em> to see whether the extension is already enabled.</p>
<p>If it is not enabled, you will need to edit your <em>php.ini</em> and make sure that the following is set:</p>
<pre class="brush: plain; title: ; notranslate">
extension_dir=&lt;path to the directory containing your extensions&gt;
extension=myext.so
</pre>
<p>Where &#8220;myext.so&#8221; is the file name, without the path, of the extension&#8217;s .so or DLL file. You will notice your php.ini probably includes multiple <em>extension= </em>lines. This is fine &#8211; there should be one for each extension you want to load. You can of course comment out lines referring to extensions you don&#8217;t need by adding a semicolon (;) at the beginning of the line.</p>
<p>After making changes to your php.ini, you need to restart your Web server (or your PHP process pool if using FPM / FastCGI) to apply the changes.</p>
<p>To ensure your new extension is loaded, check <em>phpinfo()</em> again. Check your PHP error log (or Web server error log) for any errors during restart which might indicate a problem loading the extension.</p>
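<p>Besides reading through <em>phpinfo()</em>, the same check can be scripted. The sketch below uses the standard <em>php_ini_loaded_file()</em> and <em>extension_loaded()</em> functions; &#8220;myext&#8221; is just the placeholder name from the php.ini example above, and the <em>$report</em> array is only there for illustration:</p>

```php
<?php
// Which php.ini was actually read? Returns false when none was loaded.
$ini = php_ini_loaded_file();
echo 'Loaded php.ini: ', ($ini !== false ? $ini : '(none)'), "\n";

// 'standard' is always present; 'myext' is a placeholder name
$report = array();
foreach (array('standard', 'myext') as $name) {
    $report[$name] = extension_loaded($name);
    printf("%-10s %s\n", $name, $report[$name] ? 'loaded' : 'not loaded');
}
```

<p>Keep in mind that the CLI and your Web server may read different php.ini files, so run the check in the environment you actually care about. On the command line, <em>php -m</em> prints the list of loaded modules.</p>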
<p>For production servers, it is recommended that you disable any extension which you do not use. This may save you a little bit of server memory and CPU cycles, and most of all will reduce the chances of unknown bugs or security issues in those extensions affecting your server.</p>
<p>Keep in mind that some extensions may be built statically into PHP. You will see these listed as extensions in <em>phpinfo()</em> but you will not be able to disable them, and in most cases you will not find an <em>extension=</em> line referring to them in php.ini or a matching .so / DLL file. Removing statically compiled extensions requires recompiling PHP itself, and in most cases this is hardly needed as most statically compiled extensions tend to include core functionality which rarely needs to be removed.</p>
<h2>Creating Extensions</h2>
<p>If you are a C/C++ programmer and want to create your own extension, you will need to learn about the Zend Engine 2 API. Unfortunately, there isn&#8217;t a lot of documentation on the subject, and what is available as part of the PHP manual has long been outdated.</p>
<p>It is recommended you check out <a href="http://blog.golemon.com/">Sara Golemon&#8217;s</a> excellent book <a href="http://www.amazon.com/Extending-Embedding-PHP-Sara-Golemon/dp/067232704X">Extending and Embedding PHP</a>. It has been around for quite a few years now and does not cover the most recent APIs, but it is definitely a good starting point. In addition, you should look for presentations on creating extensions from recent years&#8217; PHP conferences &#8211; almost every conference has one of these by someone from the PHP Core team. Of course, looking at other extensions&#8217; source code is always the best place to learn :)</p>
]]></content:encoded>
			<wfw:commentRss>http://arr.gr/blog/2012/06/on-php-extensions/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Quickly Creating a New Admin User on Ubuntu</title>
		<link>http://arr.gr/blog/2012/03/quickly-creating-a-new-admin-user-on-ubuntu/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=quickly-creating-a-new-admin-user-on-ubuntu</link>
		<comments>http://arr.gr/blog/2012/03/quickly-creating-a-new-admin-user-on-ubuntu/#comments</comments>
		<pubDate>Thu, 08 Mar 2012 10:11:31 +0000</pubDate>
		<dc:creator>shahar</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Linux & FOSS]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://arr.gr/blog/?p=237</guid>
		<description><![CDATA[Working quite a lot on the Ubuntu Server EC2 images, I am often faced with a need to create one or more additional admin users, which have the same permissions as the first user (&#8220;ubuntu&#8221; in these images). I did a little bit of searching but didn&#8217;t find any way to easily add a new [...]]]></description>
				<content:encoded><![CDATA[<p>Working quite a lot on the Ubuntu Server EC2 images, I am often faced with a need to create one or more additional admin users, which have the same permissions as the first user (&#8220;ubuntu&#8221; in these images). I did a little bit of searching but didn&#8217;t find any way to easily add a new user with the same groups as the Ubuntu user, so I crafted a little command. I&#8217;m pasting it here mostly for future self reference, and also in hope this helps someone:</p>
<pre class="brush: bash; title: ; notranslate">
sudo useradd -m -G `groups ubuntu | cut -d&quot; &quot; -f4- | sed 's/ /,/g'` -s/bin/bash newuser
</pre>
<p>Of course &#8216;ubuntu&#8217; is the user you want to copy, and &#8216;newuser&#8217; is the name of the new user.</p>
<p>Note that the new user will be in the admin group but will still require a password when using sudo (that&#8217;s because in the EC2 images &#8216;ubuntu&#8217; is the only user with NOPASSWD privileges). I personally believe this is a good thing, but if you want you can always add NOPASSWD on the admin group in /etc/sudoers.</p>
]]></content:encoded>
			<wfw:commentRss>http://arr.gr/blog/2012/03/quickly-creating-a-new-admin-user-on-ubuntu/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Password hashing revisited</title>
		<link>http://arr.gr/blog/2012/02/password-hashing-revisited/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=password-hashing-revisited</link>
		<comments>http://arr.gr/blog/2012/02/password-hashing-revisited/#comments</comments>
		<pubDate>Fri, 24 Feb 2012 13:49:14 +0000</pubDate>
		<dc:creator>shahar</dc:creator>
				<category><![CDATA[Linux & FOSS]]></category>
		<category><![CDATA[PHP & Web Technologies]]></category>
		<category><![CDATA[bcrypt]]></category>
		<category><![CDATA[crypt]]></category>
		<category><![CDATA[hasing]]></category>
		<category><![CDATA[passwords]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://arr.gr/blog/?p=232</guid>
		<description><![CDATA[From user comments on my recent password hashing post, I&#8217;ve learned about a better solution for password hashing &#8211; rather than using hashing algorithms designed to be fast such as SHA-1 and SHA-256, use slower, and more important future-adaptable algorithms such as bcrypt. I have to say this is one of the reasons I love [...]]]></description>
				<content:encoded><![CDATA[<p>From user comments on my recent <a title="Storing Passwords the Right Way" href="http://arr.gr/blog/2012/01/storing-passwords-the-right-way/">password hashing post</a>, I&#8217;ve learned about a better solution for password hashing &#8211; rather than using hashing algorithms designed to be fast such as SHA-1 and SHA-256, use slower and, more importantly, future-adaptable algorithms such as <a href="http://en.wikipedia.org/wiki/Bcrypt">bcrypt</a>. I have to say this is one of the reasons I love this community &#8211; you always learn new things.</p>
<p>I won&#8217;t repeat the reasons why methods such as bcrypt are preferred (read the comments on the previous post to learn why). However, I will note that starting from PHP 5.3 bcrypt is in fact built into PHP &#8211; so if you do not require portability to older versions of PHP, bcrypt hashing can be done very easily, using the useful but slightly enigmatic <a href="http://php.net/crypt"><em>crypt</em></a> function:</p>
<p><span id="more-232"></span></p>
<pre class="brush: php; title: ; notranslate">
/**
 * Hash a string using bcrypt with specified complexity
 *
 * @param  string $password input string
 * @param  integer $complexity bcrypt exponential cost
 * @return string
 */
function bcrypt_hash($password, $complexity = 12)
{
  if ($complexity &lt; 4 || $complexity &gt; 31) {
    throw new InvalidArgumentException(&quot;BCrypt complexity must be between 4 and 31&quot;);
  }

  // CRYPT_BLOWFISH salts are 22 characters from the alphabet ./0-9A-Za-z;
  // plain alphanumeric characters are a valid subset of it
  $random = get_random_alnum_salt(22);

  // The crypt function decides which algorithm to use (we need Blowfish) based on
  // the format of the salt parameter
  $salt = sprintf('$2a$%02d$%s', $complexity, $random);

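  $hash = crypt($password, $salt);
  if (strlen($hash) != 60) {
    // On failure crypt returns a string shorter than 13 characters
    throw new RuntimeException(&quot;BCrypt hashing failed&quot;);
  }

  return $hash;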
}

/**
 * This generates a random alphanumeric string of length $length.
 *
 * This may not be a cryptographic grade random string generation
 * function - but it is good enough for our example
 *
 * @param  integer $length random string length
 * @return string
 */
function get_random_alnum_salt($length)
{
  static $chars = null;
  if (! $chars) {
    $chars = implode('', array_merge(range('a', 'z'), range('A', 'Z'), range(0, 9)));
  }

  $salt = '';
  for ($i = 0; $i &lt; $length; $i++) {
    $salt .= $chars[mt_rand(0, 61)];
  }
  return $salt;
}

/**
 * Check a password against a hashed string
 *
 * @param  string $password cleartext password
 * @param  string $hashed bcrypt format hashed string
 * @return boolean
 */
function bcrypt_check_hash($password, $hashed)
{
  // Do some quick validation that $hashed is indeed a bcrypt hash
  if (strlen($hashed) != 60 || ! preg_match('/^\$2a\$\d{2}\$/', $hashed)) {
    throw new InvalidArgumentException(&quot;Provided hash is not a bcrypt string hash&quot;);
  }
  return (crypt($password, $hashed) === $hashed);
}
</pre>
<p>As you can see, the code is pretty simple &#8211; hashing is done internally by the <em>crypt</em> function. It is interesting to note that the returned hash string has a very particular format &#8211; it already contains the complexity and salt parts. This makes comparing a cleartext password to a hashed string easy as well, and <em>crypt</em> handles it nicely &#8211; you give it the hashed string as the salt parameter, and it takes care to only take the complexity and salt prefix of the entire hashed string into account. If the password is the same (after all, the same complexity and salt were used), you will get the exact same hashed string back. Nice!</p>
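<p>To see that format for yourself, here is a minimal sketch that calls <em>crypt</em> directly. The fixed salt and the sample passwords are made up for illustration only (real code must use a fresh random salt, as <em>bcrypt_hash</em> above does), and a low cost factor of 4 is used just to keep the example fast:</p>
<pre class="brush: php; title: ; notranslate">
// Hypothetical fixed salt, for illustration only: never reuse a salt in real code
$salt = '$2a$04$abcdefghijklmnopqrstuv';
$hash = crypt('correct horse', $salt);

// A bcrypt hash string is always 60 characters long
echo strlen($hash), PHP_EOL;        // prints 60

// The first 7 characters identify the algorithm and the cost factor
echo substr($hash, 0, 7), PHP_EOL;  // prints $2a$04$

// Passing the full hash as the salt reproduces it only for the right password
var_dump(crypt('correct horse', $hash) === $hash);  // bool(true)
var_dump(crypt('wrong horse', $hash) === $hash);    // bool(false)
</pre>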
<p>It&#8217;s important to note that <em>bcrypt</em> is <strong>intentionally slow</strong> &#8211; on my laptop it hashes a medium-sized password with cost factor 12 in about 0.3 seconds. This makes it a very good solution for password hashing (because you only do it once in a while at user registration and login but not in the critical path of any frequently repeating action), but not so useful for general purpose string hashing &#8211; for such uses, you can still use MD5, SHA-1 and friends.</p>
<p>Also keep in mind that the cost factor is exponential: each increment roughly doubles the time it takes to hash a string. So as computing power becomes cheaper, you can simply increase the cost factor to stay future-proof.</p>
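<p>To get a feel for this on your own hardware, a rough timing loop along these lines can help. It is only a sketch: the fixed salt is a made-up value for illustration, the cost factors are arbitrary, and the absolute numbers will differ from machine to machine. Since the list goes up in steps of two, each line should take roughly four times longer than the previous one:</p>
<pre class="brush: php; title: ; notranslate">
// Hypothetical fixed 22 character salt, for illustration only
$random = 'abcdefghijklmnopqrstuv';
$timings = array();

foreach (array(6, 8, 10, 12) as $cost) {
  // Same salt format as in bcrypt_hash: algorithm, cost factor, salt
  $salt = sprintf('$2a$%02d$%s', $cost, $random);

  $start = microtime(true);
  crypt('my password', $salt);
  $timings[$cost] = microtime(true) - $start;

  printf('cost %2d: %.4f seconds%s', $cost, $timings[$cost], PHP_EOL);
}
</pre>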
]]></content:encoded>
			<wfw:commentRss>http://arr.gr/blog/2012/02/password-hashing-revisited/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Generating ZF Autoloader Classmaps with Phing</title>
		<link>http://arr.gr/blog/2012/02/generating-zf-autoloader-classmaps-with-phing/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=generating-zf-autoloader-classmaps-with-phing</link>
		<comments>http://arr.gr/blog/2012/02/generating-zf-autoloader-classmaps-with-phing/#comments</comments>
		<pubDate>Tue, 21 Feb 2012 09:08:41 +0000</pubDate>
		<dc:creator>shahar</dc:creator>
				<category><![CDATA[Linux & FOSS]]></category>
		<category><![CDATA[PHP & Web Technologies]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[phing]]></category>
		<category><![CDATA[shoppimon]]></category>
		<category><![CDATA[zend framework]]></category>
		<category><![CDATA[zf2]]></category>

		<guid isPermaLink="false">http://arr.gr/blog/?p=223</guid>
		<description><![CDATA[One of the things I&#8217;ve quickly discovered when working on Shoppimon is that we need a build process for our PHP frontend app. While the PHP files themselves do not require any traditional &#8220;build&#8221; step such as processing or compilation, there are a lot of other tasks that need to happen when taking a version [...]]]></description>
				<content:encoded><![CDATA[<p>One of the things I&#8217;ve quickly discovered when working on <a href="http://www.shoppimon.com/">Shoppimon</a> is that we need a build process for our PHP frontend app. While the PHP files themselves do not require any traditional &#8220;build&#8221; step such as processing or compilation, there are a lot of other tasks that need to happen when taking a version from the development environment to staging and to production: among other things, our build process picks up and packages only the files needed by the app (leaving out things like documentation, unit tests and local configuration overrides), minifies and pre-compresses CSS and JavaScript files,  and performs other helpful optimizations on the app, making it ready for production.</p>
<p>Since Shoppimon is based on Zend Framework 2.0, it also heavily relies on the ZF2.0 autoloader stack. Class autoloading is convenient, and was shown to greatly improve performance over using <em>require_once </em>calls. However, different autoloading strategies have pros and cons: while PSR-0 based autoloading (the so called Standard Autoloader from ZF1 days) works automatically and doesn&#8217;t require updating any mapping code for each new class added or renamed, it has a significant performance impact compared to classmap based autoloading.</p>
<p>Fortunately, using ZF2&#8217;s autoloader stack and Phing, we can enjoy both worlds: while in development, standard PSR-0 autoloading is used and the developer can work smoothly without worrying about updating class maps. As we push code towards production, our build system takes care of updating class map files, ensuring super-fast autoloading in production using the ClassMapAutoloader. How is this done? Read on to learn.</p>
<p><span id="more-223"></span></p>
<p>Before auto-generating class map files, you need to make sure your code supports both using a class map based autoloader, and falling back to the standard autoloader. This is how it is done in the main Shoppimon bootstrap file:</p>
<pre class="brush: php; title: ; notranslate">
// Set up the autoloader
require_once 'Zend/Loader/AutoloaderFactory.php';
Zend\Loader\AutoloaderFactory::factory(array(
  'Zend\Loader\ClassMapAutoloader' =&gt; array(
    include 'library/Shoplift/autoload_classmap.php'
  ),
  'Zend\Loader\StandardAutoloader' =&gt; array(
    'namespaces' =&gt; array(
      'Shoplift' =&gt; realpath('library/Shoplift')
    )
  ),
));
</pre>
<p>Note that &#8216;Shoplift&#8217; is the namespace for our in-house library used in addition to Zend Framework 2.0. We have similar code in the Module.php file of each one of our modules:</p>
<pre class="brush: php; title: ; notranslate">
public function getAutoloaderConfig()
{
  return array(
    'Zend\Loader\ClassMapAutoloader' =&gt; array(
      __DIR__ . '/autoload_classmap.php'
    ),
    'Zend\Loader\StandardAutoloader' =&gt; array(
      'namespaces' =&gt; array(
        __NAMESPACE__ =&gt; __DIR__ . '/src'
      )
    ),
  );
}
</pre>
<p>Both code segments reference a class map file named autoload_classmap.php &#8211; these exist for our common library and for each one of our modules &#8211; and this is what they look like:</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php

/**
 * Autoloader classmap
 *
 * Keep the classmap empty - it is auto-generated at build time
 */

return array();
</pre>
<p>Yes &#8211; an empty classmap array is returned. This is how this file is committed to SCM, and we even add it to our .gitignore file to ensure it is not mistakenly overwritten with actual data. Returning an empty class map ensures that the first ClassMapAutoloader always fails, and that the StandardAutoloader is always used. That is, unless a classmap file is properly generated at build time.</p>
<p>We&#8217;ve decided to use <a href="http://www.phing.info/trac/">Phing</a> for building our PHP apps &#8211; it wasn&#8217;t a hard decision, and was based on the fact that Phing is well known and widely used, it is very similar in concepts and syntax to Ant which we were already using for some of our non-PHP stuff (yes, we have some Java and Python as well, not everything is PHP you know&#8230;) and, unlike Ant, we can easily extend it based on our needs using PHP &#8211; which is exactly what we&#8217;ve done in order to ensure class maps are populated at build time.</p>
<p>Here is a segment from our Phing build.xml file responsible for generating class maps:</p>
<pre class="brush: xml; title: ; notranslate">
&lt;target name=&quot;generate-classmap&quot; depends=&quot;copyfiles&quot;&gt;
  &lt;taskdef classname=&quot;phing.tasks.ZfClassmapTask&quot; name=&quot;classmap&quot; /&gt;
  &lt;classmap outputFile=&quot;${builddir}/data/library/Shoplift/autoload_classmap.php&quot;&gt;
    &lt;dirset dir=&quot;${builddir}/data/library&quot;&gt;
      &lt;include name=&quot;Shoplift&quot; /&gt;
    &lt;/dirset&gt;
  &lt;/classmap&gt;
  &lt;classmap outputFile=&quot;${builddir}/data/module/Frontend/autoload_classmap.php&quot;&gt;
    &lt;dirset dir=&quot;${builddir}/data/module&quot; includes=&quot;Frontend&quot; /&gt;
  &lt;/classmap&gt;
  &lt;classmap outputFile=&quot;${builddir}/data/module/InternalApi/autoload_classmap.php&quot;&gt;
    &lt;dirset dir=&quot;${builddir}/data/module&quot; includes=&quot;InternalApi&quot; /&gt;
  &lt;/classmap&gt;
&lt;/target&gt;
</pre>
<p>The &#8220;generate-classmap&#8221; build target takes a set of files (previously copied to a working directory in the &#8220;copyfiles&#8221; target), scans them and generates several classmap files &#8211; one for the library/Shoplift directory, and one for each one of our modules &#8211; &#8220;Frontend&#8221; and &#8220;InternalApi&#8221;. This is done after defining a new task type: the &#8220;classmap&#8221; task, defined in the phing/tasks/ZfClassmapTask.php file, which looks like this (this is where the magic is):</p>
<p><script src="https://gist.github.com/1590399.js?file=ZfClassmapTask.php"></script></p>
<p>We place this file in the &#8216;phing/tasks&#8217; directory under our source tree, but it can actually be placed anywhere in the include path (see comment inside the file) as long as the <em>classname=&#8221;&#8230;&#8221;</em> attribute of the <em>&lt;taskdef&gt;</em> element is adjusted accordingly.</p>
<p>This is a Phing task class, and was built by adapting the classmap_generator.php tool included in ZF2 to Phing. It scans the directories mentioned in the <em>&lt;dirset&gt;</em> elements, finds PHP classes, and adds them to the generated class map file. This is done automatically at build time, takes a couple of seconds, and allows us to enjoy uninterrupted development as well as good production performance.</p>
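<p>For illustration, a generated class map file is expected to look roughly like the following. The class names and paths here are hypothetical, and the exact layout depends on the generator:</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php

// Hypothetical generated class map: maps fully qualified class names to files
return array(
  'Shoplift\Http\Client'  =&gt; __DIR__ . '/Http/Client.php',
  'Shoplift\Store\Report' =&gt; __DIR__ . '/Store/Report.php',
);
</pre>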
<p>BTW, as you can see, the Phing class was posted as a Gist on GitHub &#8211; so feel free to fork it, fix it and improve it &#8211; and please share the results.</p>
]]></content:encoded>
			<wfw:commentRss>http://arr.gr/blog/2012/02/generating-zf-autoloader-classmaps-with-phing/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Say Hi to Shoppimon &#8211; Magento Monitoring for &#8220;Normal&#8221; People</title>
		<link>http://arr.gr/blog/2012/02/say-hi-to-shoppimon-e-commerce-monitoring-for-normal-people/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=say-hi-to-shoppimon-e-commerce-monitoring-for-normal-people</link>
		<comments>http://arr.gr/blog/2012/02/say-hi-to-shoppimon-e-commerce-monitoring-for-normal-people/#comments</comments>
		<pubDate>Tue, 07 Feb 2012 10:49:52 +0000</pubDate>
		<dc:creator>shahar</dc:creator>
				<category><![CDATA[Thoughts & Possibilities]]></category>
		<category><![CDATA[PHP & Web Technologies]]></category>
		<category><![CDATA[shoppimon]]></category>
		<category><![CDATA[zend framework]]></category>
		<category><![CDATA[zf2]]></category>

		<guid isPermaLink="false">http://arr.gr/blog/?p=213</guid>
		<description><![CDATA[For a while now I have been telling people I am &#8220;working on a small project&#8221; &#8211; and now is the time to unveil the mystery and introduce Shoppimon &#8211; a new start-up which I founded together with a small group of friends, and am currently spending most of my time around. The idea of [...]]]></description>
				<content:encoded><![CDATA[<p>For a while now I have been telling people I am &#8220;working on a small project&#8221; &#8211; and now is the time to unveil the mystery and introduce <a href="http://www.shoppimon.com/">Shoppimon</a> &#8211; a new start-up which I founded together with a small group of friends, and am currently spending most of my time around.</p>
<p>The idea of Shoppimon is simple &#8211; we want to provide Web monitoring and availability analysis which will be usable by, and useful to, &#8220;normal&#8221; people &#8211; not only the tech guy, the programmer or the IT specialist, but the site owner, the business owner or even the marketing guy &#8211; in other words the real stakeholder.</p>
<p><span id="more-213"></span></p>
<p>Shoppimon focuses on synthetic monitoring &#8211; it simulates real users going through the store and logs their &#8220;experience&#8221; &#8211; any errors encountered, overall time to complete certain actions, etc. As such, it complements the real-user monitoring that products like Zend Server provide. In comparison to existing synthetic Web monitoring services, we want Shoppimon to be a snap to get started with and an inexpensive solution that would be useful to even the smallest commercial site owners. Most importantly, using it should not require one to be a tech-savvy person.</p>
<p>Shoppimon is focusing on <a href="http://www.magentocommerce.com/">Magento</a> &#8211; a popular PHP / Zend Framework based eCommerce solution. Magento has proved very successful, and has grown a rich ecosystem of developers, service providers and users around it. But as most PHP programmers know, managing Magento&#8217;s availability and performance isn&#8217;t easy. It&#8217;s a heavy application, one of the heaviest PHP apps ever built, and the codebase is somewhat complex. Our goal is to help Magento store owners find (and fix) problems in their stores, and show them how their store compares to others.</p>
<p>This is why we decided to focus Shoppimon on Magento and provide objective scientific data (which we present in an easy-to-digest way) that would help Magento store owners (and in turn their developers and hosting providers) to enjoy smooth sailing.</p>
<p>Technically, Shoppimon is pretty interesting (hey, otherwise I wouldn&#8217;t have done it) &#8211; it has parts written in different programming languages. It is entirely based on Cloud technologies. The front-end runs on <a href="http://framework.zend.com/">Zend Framework 2.0</a> (it is possibly the first commercial app to be out there running on ZF 2.0) and on <a href="http://www.zend.com/en/products/server/">Zend Server</a>. The backend parts mix PHP, Java and Python (a long story&#8230;) and use some very interesting technology to simulate real shoppers browsing around on the Magento sites we test.</p>
<p>In the next few weeks I&#8217;ll probably be posting more about how stuff works at Shoppimon under the hood. Stay tuned!</p>
]]></content:encoded>
			<wfw:commentRss>http://arr.gr/blog/2012/02/say-hi-to-shoppimon-e-commerce-monitoring-for-normal-people/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Storing Passwords the Right Way</title>
		<link>http://arr.gr/blog/2012/01/storing-passwords-the-right-way/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=storing-passwords-the-right-way</link>
		<comments>http://arr.gr/blog/2012/01/storing-passwords-the-right-way/#comments</comments>
		<pubDate>Tue, 31 Jan 2012 06:04:47 +0000</pubDate>
		<dc:creator>shahar</dc:creator>
				<category><![CDATA[PHP & Web Technologies]]></category>
		<category><![CDATA[101]]></category>
		<category><![CDATA[hashing]]></category>
		<category><![CDATA[md5]]></category>
		<category><![CDATA[passwords]]></category>
		<category><![CDATA[salt]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[sha1]]></category>

		<guid isPermaLink="false">http://arr.gr/blog/?p=188</guid>
		<description><![CDATA[I consider this post a bit of an experiment in writing about what I consider &#8220;beginner&#8221; material. Not that it is necessarily simple or easy stuff anyone should know, but simply because this is not a &#8220;new discovery&#8221; as far as I am concerned. Also, I usually try not to write about security related material, [...]]]></description>
				<content:encoded><![CDATA[<p>I consider this post a bit of an experiment in writing about what I consider &#8220;beginner&#8221; material. Not that it is necessarily simple or easy stuff anyone should know, but simply because this is not a &#8220;new discovery&#8221; as far as I am concerned. Also, I usually try not to write about security related material, as I do not consider myself a security expert. However, since I&#8217;m starting to teach a &#8220;PHP 101&#8221; course soon (maybe I&#8217;ll post more about it in the next few weeks), and since I was asked a few times about this topic recently, I&#8217;ve decided to write up my experience on this topic and test the reactions.</p>
<p>So, the topic in question is &#8220;what is the right way to store user passwords in my DB&#8221;. To be clear, I am talking specifically about the passwords users will use to log in to your application, not some 3rd party password you need to store for whatever reason. This is something almost any application out there requires &#8211; unless you interface with some external authentication mechanism (OAuth, OpenID, your office LDAP or Kerberos server), there&#8217;s a very high chance you&#8217;ll need to authenticate users against a self-stored user name and password.</p>
<p><span id="more-188"></span></p>
<p>In order to figure out what is the best solution, let&#8217;s start by going over the problems we might face if we simply take the naive approach and store passwords in their original, clear text form:</p>
<ul>
<li>If our database gets hacked (for example if we are exposed to an SQL injection attack through some 3rd party app we have installed on our server), passwords could get stolen. Clear-text passwords could easily be used to hijack our users&#8217; accounts. In many cases, even if we ignore the risk of a hacked database &#8211; a clear-text password can be stolen by a disgruntled worker with access to the DB.</li>
<li>Moreover, users tend to use the same passwords for all sorts of different services. If I know someone&#8217;s password to one service, chances are I can use the same password or a similar one to impersonate that user in other sites as well. Most users do not consider the fact (but you should!) that when they type in a password for your silly site with funny kitten pictures, there&#8217;s a good chance they are entrusting the password to their bank or PayPal account in your hands.</li>
</ul>
<p>Given the points above, it is our responsibility as web developers to ensure that nobody, not even us, could get a clear-text version of the user&#8217;s password. So what&#8217;s the right way to do that?</p>
<h2>Best solution: avoid the problem altogether</h2>
<p>The best way is, of course, not to store passwords at all. If functional and business terms allow, you should consider using some other authentication mechanism like OAuth or OpenID &#8211; basically let someone else worry about storing sensitive data. Many users have Google or Facebook profiles, and you can find several examples of pretty big websites that simply let users authenticate through these providers instead of through their own identity management service. This is an elegant solution, but clearly it does not work in all cases. If you still need to store passwords, read on.</p>
<h2>Password Hashing</h2>
<p>One aspect of passwords is that really, you have no use for their actual value. A password can be anything, and as long as the user repeats the same initial password when asked to sign in, you do not care what the password&#8217;s textual value is. This is important, because it means we don&#8217;t need to store the original value of the password: it is safe to store a product of the password, and as long as we know how to reproduce this product from a typed-in password, we can compare the products and not the password itself.</p>
<p>Sounds confusing? Here is a simple example: assume that instead of passwords, our site uses numbers to authenticate users. Each user has to provide a user name and a random number as a password. Before we store this &#8220;password&#8221; in our DB, we pass it through this simple mathematical function:</p>
<pre>    f(x) = x * 5</pre>
<p>So if a user types in &#8220;4709&#8221; as a password, we multiply that number by 5 and store the value &#8220;23545&#8221; in our DB. When the user attempts to sign in again, we pass the value typed in as password through the same function. If we get &#8220;23545&#8221; as the product of the typed in password passed through our function, we know the user typed in the right password.</p>
<p>This is good, because now the value stored in our DB is not the actual password, but an obscure value (well, sort of). If someone steals our DB, they can type in &#8220;23545&#8221; into the sign-in form all day &#8211; but that&#8217;s not the right password (remember, we multiply whatever is typed in by 5 before comparing it!).</p>
<p>Unfortunately, things are not that easy. First, it&#8217;s enough for someone to know that we multiply numbers by 5 to obfuscate them, and they can easily reverse engineer passwords by dividing the data stored in the DB by 5 &#8211; our function is <strong>reversible</strong>. Second, a smart enough hacker looking at our list of stored password values would probably notice that all of them are multiples of 5 &#8211; so even without knowing what we do in advance, reverse engineering our &#8220;security&#8221; method is quite easy.</p>
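<p>A minimal sketch of this toy scheme shows how little work the attacker has to do. The function name is made up for illustration:</p>
<pre class="brush: php; title: ; notranslate">
// Hypothetical toy hash function from the example above: multiply by 5
function toy_hash($number)
{
  return $number * 5;
}

// What we would store in the DB for the password 4709
$stored = toy_hash(4709);
echo $stored, PHP_EOL;        // prints 23545

// An attacker who knows (or guesses) the function simply reverses it
echo $stored / 5, PHP_EOL;    // prints 4709

// Every stored value is a multiple of 5, which gives the scheme away
var_dump($stored % 5 === 0);  // bool(true)
</pre>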
<p>As it turns out, the approach is right but the function we are using is too simple. What we need is a function that:</p>
<ul>
<li>Is irreversible, or at least is impossible to reverse in the real world. In other words, even if one knows both the result and the function used, computing the input value will be impossible or impractical.</li>
<li>Is indeed a function in the mathematical sense &#8211; that is given the same input value, only a single outcome is possible, and the same outcome is always produced for the same input. For example, a computer function which multiplies the input value by a random number is no good for us.</li>
<li>Produces a distinct, single value for each input value &#8211; or at least provides a very low risk of producing the same result for two input values. This is important because we want to make sure that <strong>only</strong> one typed-in password matches the obscured value stored in the DB.</li>
</ul>
<p>Of course, most of us are not mathematicians &#8211; so you&#8217;d be happy to know that such functions exist, and are usually referred to as &#8220;hash functions&#8221;. Hash functions are used extensively by programmers for all sorts of purposes, and cryptography is most definitely one of them. A few popular and useful cryptographic-grade hash functions are <a href="http://en.wikipedia.org/wiki/MD5">MD5</a>, <a href="http://en.wikipedia.org/wiki/SHA-1">SHA-1</a> and <a href="http://en.wikipedia.org/wiki/SHA-2">SHA-2</a>. We will not go into the mathematical definitions of these functions (I have very little knowledge of how these functions actually work!) &#8211; it&#8217;s enough to say pretty much every popular programming language out there has at least one implementation of these functions.</p>
<p>As an example, the MD5 function produces a 128-bit &#8220;hash value&#8221; for an input value provided to it. The hash is always 128 bits long, regardless of the input size. It will always produce the same hash value for the same input. The most successful known collision attack (that is, an attack producing the same hash value for a modified input value) on MD5 took about 2 million execution attempts, which makes it quite bad for validating SSL certificates (which it used to be used for), but still sort of OK (although not great) for password hashing.</p>
<p>To compute an MD5 hash value in PHP, you can do the following:</p>
<pre>    php &gt; echo md5("my name is Inigo Montoya");
    d9937edae7d26a399d41dda16f137e42</pre>
<p>As you can see, the MD5 value of the string &#8220;my name is Inigo Montoya&#8221; is &#8220;d9937edae7d26a399d41dda16f137e42&#8221; (this is in fact a hexadecimal representation of the MD5 value, which is a 128 bit number &#8211; this is the standard way to present hash values of various functions). On the other hand:</p>
<pre>    php &gt; echo md5("my name is In<strong>d</strong>igo Montoya");
    ae7cd5e68c73f9f44df66030cc9d1c06</pre>
<p>Even a slight change in the input text produces a completely different MD5 hash.</p>
<h2>It&#8217;s time to stop using MD5 for cryptographic purposes</h2>
<p>While MD5 was the de-facto standard for storing hashed passwords for some time, it is now becoming clear that it may not be suitable for cryptographic purposes (it is definitely suitable for other things). In 2009 it was shown that producing collisions for MD5 can be done within seconds or minutes on commodity hardware &#8211; this does not mean it is easy to reverse engineer password values stored in your DB as MD5 hash values, but it does mean that if a highly skilled hacker wants to specifically target your site, they have a better chance of succeeding in doing so. In addition, it is safe to assume additional vulnerabilities will be detected in the future.</p>
<p>Unless you are somehow limited (not if you&#8217;re a PHP developer!), switching to stronger hash functions such as SHA-1 or SHA-256 is highly recommended.</p>
<p>Throughout the rest of this article I will use SHA-1 in examples. SHA-1 is not collision-free, but so far the best known <strong>theoretical</strong> collision attack on SHA-1 requires 2 to the power of 51 attempts to perform (that&#8217;s a number with 16 digits!), and until now nobody has been able to show an actual successful attempt to do so. SHA-1 produces a 160-bit digest value, and can be computed in PHP like so:</p>
<pre>    php &gt; echo sha1("my name is Inigo Montoya");
    b208946a9c3c4b26a4d6bb87c3f630f996146ee</pre>
<h2>So, I should just store the password hash in the DB?</h2>
<p>Well, yes and no.</p>
<p>Yes &#8211; because that&#8217;s the first step. By storing a SHA-1 hashed version of the password in your DB you ensure nobody can <strong>compute</strong> the password by simply stealing your users DB table data. When a user types in their password, you compare the stored hash to the SHA-1 hash of the typed in string, and if they match, you grant access. Simple and effective.</p>
<p>But wait&#8230; that&#8217;s still not good enough.</p>
<h2>Stupid but Effective: Dictionary Attacks</h2>
<p>All hash functions are vulnerable to a type of attack sometimes referred to as a <strong>rainbow table attack</strong> or a <strong>dictionary attack</strong>.</p>
<p>These attacks take advantage of the fact that in most cases, humans are humans &#8211; and the passwords they use are of limited size (how many people can you think of that use 12 or even 10 character long passwords?) and are composed of a limited set of characters (remember that even power users that use punctuation, numbers and mixed-case characters in their passwords are still confined to the ~75 characters or so on their keyboards).</p>
<p>Dictionary attacks are stupid but effective: the idea is to create a dictionary (basically a big key -&gt; value table) of predictable passwords (dictionary words, expected combinations of key strokes, all permutations of what&#8217;s on your keyboard up to 8 characters long) and their MD5 or SHA-1 values. Once such a table exists (creating it may take several hours on commodity hardware, but this is a one-time effort), you can search for an original password using its hash value.</p>
<p>A dictionary attack allows me to reverse-engineer the original password from its hash value not by smart computation (which, given a good hash function, is impossible or impractical), but through a simple query to a ready-made &#8220;dictionary&#8221; mapping hash values to original strings.</p>
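<p>In code, the whole attack boils down to an array lookup. Here is a tiny sketch, with a made-up word list standing in for a real dictionary of millions of entries:</p>
<pre class="brush: php; title: ; notranslate">
// Hypothetical word list: a real dictionary would hold millions of candidates
$candidates = array('123456', 'password', 'letmein', 'inigo2001');

// One-time effort: map each hash value back to the password that produced it
$dictionary = array();
foreach ($candidates as $word) {
  $dictionary[sha1($word)] = $word;
}

// A hash value found in a stolen users table (unsalted SHA-1)
$stolen = sha1('inigo2001');

// Cracking the password is a single lookup, no clever math needed
echo isset($dictionary[$stolen]) ? $dictionary[$stolen] : 'not found';  // prints inigo2001
</pre>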
<p>But it&#8217;s even easier than that: nowadays there are <a href="http://md5.rednoize.com/">services</a> that offer such dictionary lookups in their existing databases. It&#8217;s not even required to do the work of building the dictionary.</p>
<p>One good solution to dictionary attacks is forcing your users to mix punctuation, upper and lower-case characters and numbers in passwords that are at least 12 characters long. However, we all know that in many cases this means expecting too much from your users.</p>
<p>The practical solution to dictionary attacks is quite simple, and is called <strong>salting</strong>.</p>
<h2>Just Add Salt</h2>
<p>Salting is a simple yet effective method to improve the security of stored passwords and prevent dictionary attacks. The idea is that instead of expecting a long, random password from the user, you take whatever password the user provides and add additional random noise (referred to as &#8220;salt&#8221;) to it yourself. You store that random noise next to the password, and use it to compute the hash when checking passwords.</p>
<p>Once a long enough and random enough salt is added, comparing hash values stored in your DB to a dictionary becomes very hard: an attacker will need to build an entire database of hash values for each different salt + password combination, effectively requiring the creation of a table with hundreds of billions of records to crack a single password.</p>
<p>Make sure a different random salt is added to each password: otherwise a single DB of salted hash values can be created for your application &#8211; it won&#8217;t be useful for other apps, but if someone wants to target your app they can definitely achieve their goals.</p>
<p>As an example, let&#8217;s assume a user whose password is &#8216;inigo2001&#8217;. Here is how this user&#8217;s password will be stored in the DB without salting:</p>
<pre> +-------------------+------------------------------------------+
 | user              | password_hash                            |
 +-------------------+------------------------------------------+
 | inigo@montoya.com | e40900c950cc6011297b2b392b42c29688b33ac7 |
 +-------------------+------------------------------------------+</pre>
<p>An attacker with a good dictionary can figure out that the password_hash value is in fact the SHA-1 digest of &#8220;inigo2001&#8221;. However, if we add salt:</p>
<pre> +-------------------+------------------------------------------+------------------------------+
 | user              | password_hash                            | password_salt                |
 +-------------------+------------------------------------------+------------------------------+
 | inigo@montoya.com | fe66f3eb9c0afc8c935dc9f3f26dbea68d48ccc1 | 9ljYI+xMaVOSloDwt9ahzTpqMHA= |
 +-------------------+------------------------------------------+------------------------------+</pre>
<p>Guessing that password_hash is the SHA-1 digest of &#8220;inigo20019ljYI+xMaVOSloDwt9ahzTpqMHA=&#8221; is quite hard &#8211; one would need to build a huge dictionary just to figure out this one password, assuming they also have insight into our code and have figured out that we have concatenated the password_salt value after the original password value and passed that through SHA-1.</p>
<p>Note that the password_salt value in this case is a base-64 encoded string of 20 random bytes &#8211; using a random enough and long enough salt is important, otherwise there&#8217;s a good chance your password + salt value happens to already exist in the attacker&#8217;s DB.</p>
<h2>Example Time</h2>
<p>To summarize things, here is an actual example of a few PHP functions that store user information in the DB in a secure manner and verify passwords against that stored information.</p>
<p>Our users table in the database is assumed to look something like:</p>
<pre>mysql&gt; DESCRIBE users;
+---------------+---------------------+------+-----+---------+----------------+
| Field         | Type                | Null | Key | Default | Extra          |
+---------------+---------------------+------+-----+---------+----------------+
| id            | int(10) unsigned    | NO   | PRI | NULL    | auto_increment |
| email         | varchar(50)         | NO   | UNI | NULL    |                |
| password      | char(40)            | NO   |     | NULL    |                |
| password_salt | binary(16)          | YES  |     | NULL    |                |
+---------------+---------------------+------+-----+---------+----------------+</pre>
<p>The password field is a 40-byte CHAR (SHA-1 hashes in hexadecimal representation are always 40 bytes long). The password_salt field is a 16-byte BINARY field &#8211; it will contain random bytes with no particular encoding, so it shouldn&#8217;t be a CHAR or VARCHAR field.</p>
<p>As new users register, the following functions are used to set the user&#8217;s password in the DB:</p>
<pre class="brush: php; title: ; notranslate">
class User
{
  /**
   * This will contain a hashed version of the user's password
   *
   * @var string
   */
  protected $password = null;

  /**
   * This will contain the salt value used to add noise to the password hash
   *
   * @var string
   */
  protected $password_salt = null;

  /**
   * Set the user's password
   *
   * @param string $password
   */
  public function setPassword($password)
  {
    // Test that password is at least 6 characters long and mixes lowercase
    // letters, uppercase letters and digits
    if (! preg_match('/^.*(?=.{6,})(?=.*[a-z])(?=.*[A-Z])(?=.*\d).*$/', $password)) {
        throw new \ErrorException(&quot;Password is not strong enough&quot;);
    }

    $this-&gt;password_salt = $this-&gt;generateRandomSalt();
    $this-&gt;password = sha1($password . $this-&gt;password_salt);
  }

  /**
   * Generate a random salt value, 16 bytes long
   *
   * This relies on OpenSSL being available. If it is not available, any
   * cryptographic-grade random string generation function would work. On
   * UNIX machines, you can just read 16 bytes from /dev/urandom.
   *
   * @return string
   */
  protected function generateRandomSalt()
  {
    return openssl_random_pseudo_bytes(16);
  }
}
</pre>
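<p>The docblock above mentions reading from /dev/urandom when OpenSSL is not available. A rough sketch of such a fallback could look like the following. Note that randomBytes() is a hypothetical stand-alone helper and not part of the class above, and that the /dev/urandom branch only applies to UNIX-like machines:</p>
<pre class="brush: php; title: ; notranslate">
/**
 * Return $length random bytes, preferring OpenSSL and falling back to
 * /dev/urandom (hypothetical helper, for illustration only)
 *
 * @param  integer $length
 * @return string
 */
function randomBytes($length = 16)
{
  if (function_exists('openssl_random_pseudo_bytes')) {
    // $strong is set by OpenSSL to tell us if a strong algorithm was used
    $bytes = openssl_random_pseudo_bytes($length, $strong);
    if ($bytes !== false) {
      if ($strong) {
        return $bytes;
      }
    }
  }

  // UNIX fallback: read raw bytes directly from the kernel's random device
  if (is_readable('/dev/urandom')) {
    $fh = fopen('/dev/urandom', 'rb');
    if ($fh !== false) {
      $bytes = fread($fh, $length);
      fclose($fh);
      if ($bytes !== false) {
        if (strlen($bytes) === $length) {
          return $bytes;
        }
      }
    }
  }

  throw new RuntimeException('No cryptographic-grade random source found');
}
</pre>
<p>With something like this in place, generateRandomSalt() could simply return randomBytes(16). Failing loudly when no good random source is found is a deliberate choice here: silently falling back to a weak generator would defeat the purpose of the salt.</p>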
<p>To check a given password, we add the following function to the same class (assume that the protected values are populated from values fetched from the DB):</p>
<pre class="brush: php; title: ; notranslate">
  /**
   * Check if password is correct
   *
   * @param  string $password
   * @return boolean
   */
  public function checkPassword($password)
  {
    $hashed = sha1($password . $this-&gt;password_salt);
    return ($hashed === $this-&gt;password);
  }
</pre>
<p>As you can see, this class (assuming a working database access layer) will do the work of properly salting and hashing passwords before saving them, and of comparing given clear-text passwords to a salted, hashed value stored in the DB. With a unique random salt per user, precomputed dictionaries of hashes become practically useless against a stolen copy of your users table.</p>
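<p>The database access layer itself is out of scope here, but as a rough sketch, storing and fetching the two values could look something like this. It assumes PDO and the users table described above; saveCredentials() and loadCredentials() are hypothetical names, not part of the class:</p>
<pre class="brush: php; title: ; notranslate">
/**
 * Store a hashed password and its salt for a new user
 * (hypothetical helper, assumes a PDO connection)
 */
function saveCredentials(PDO $db, $email, $passwordHash, $passwordSalt)
{
  $stmt = $db-&gt;prepare(
    'INSERT INTO users (email, password, password_salt) ' .
    'VALUES (:email, :password, :salt)'
  );
  $stmt-&gt;bindValue(':email', $email, PDO::PARAM_STR);
  $stmt-&gt;bindValue(':password', $passwordHash, PDO::PARAM_STR);
  // The salt is raw binary data, so it is bound as a LOB here
  $stmt-&gt;bindValue(':salt', $passwordSalt, PDO::PARAM_LOB);
  $stmt-&gt;execute();
}

/**
 * Fetch the stored hash and salt for a user, or null if there is no
 * such user (hypothetical helper, assumes a PDO connection)
 */
function loadCredentials(PDO $db, $email)
{
  $stmt = $db-&gt;prepare(
    'SELECT password, password_salt FROM users WHERE email = :email'
  );
  $stmt-&gt;execute(array(':email' =&gt; $email));
  $row = $stmt-&gt;fetch(PDO::FETCH_ASSOC);

  return ($row === false) ? null : $row;
}
</pre>
<p>Using prepared statements keeps the raw salt bytes out of the SQL string, so there is no need to escape or encode them by hand.</p>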
<h2>What&#8217;s next?</h2>
<p>I hope that this article pointed out some good practices in storing passwords in the DB. It is important to remember that while your site may not be very interesting to hack into, hacking into your users&#8217; accounts could be a first step towards identity theft or the hijacking of accounts on another site holding much more sensitive data. Implementing the measures described here would mean you are at least treating your users&#8217; passwords with the right care.</p>
<p>There are additional aspects to password security which you should look into: using proper security on the transport channel when asking for passwords (HTTPS with a valid certificate), requiring strong enough passwords from your users, avoiding session fixation attacks (<a href="http://php.net/session_regenerate_id">session_regenerate_id</a> at login) and more. I also did not touch on procedures for replacing lost passwords, which are a common weak point vulnerable to phishing attacks. There is quite a lot of material out there on these topics, and if I see there is enough interest I might cover some of them myself in the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://arr.gr/blog/2012/01/storing-passwords-the-right-way/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>My PHP Streams API article was published by php&#124;architect</title>
		<link>http://arr.gr/blog/2011/12/my-php-streams-api-article-was-published-by-phparchitect/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=my-php-streams-api-article-was-published-by-phparchitect</link>
		<comments>http://arr.gr/blog/2011/12/my-php-streams-api-article-was-published-by-phparchitect/#comments</comments>
		<pubDate>Sat, 31 Dec 2011 10:30:14 +0000</pubDate>
		<dc:creator>shahar</dc:creator>
				<category><![CDATA[PHP & Web Technologies]]></category>
		<category><![CDATA[article]]></category>
		<category><![CDATA[magazine]]></category>
		<category><![CDATA[php|architect]]></category>

		<guid isPermaLink="false">http://arr.gr/blog/?p=184</guid>
		<description><![CDATA[php&#124;architect, one of the most prominent professional PHP magazines in the world, has published an article I wrote about PHP&#8217;s user-space Streams API in its December 2011 issue: Go with the Flow: PHP’s Userspace Streams API Almost every PHP application out there needs to read data from files or write data to files – or [...]]]></description>
				<content:encoded><![CDATA[<p>php|architect, one of the most prominent professional PHP magazines in the world, has published an article I wrote about PHP&#8217;s user-space Streams API in its <a href="http://www.phparch.com/magazine/2011-2/december/">December 2011</a> issue:</p>
<blockquote>
<h2>Go with the Flow: PHP’s Userspace Streams API</h2>
<div>Almost every PHP application out there needs to read data from files or write data to files – or things that look like files but are not quite files – these unstructured blobs of data are commonly referred to as “streams”. Stream functions allow a scalable, portable and memory efficient way to handle data, and pretty much any PHP developer out there knows how to read data from or write data to a stream. The best part is that you don’t have to be an extension author in order to provide access to any data source as if it was just a regular file. PHP’s userspace streams API allows you to do exactly that, and this article will show you how.</div>
</blockquote>
<div>If you&#8217;re a subscriber, feel free to read the article and send me your feedback. If not, go ahead and buy the issue <img src='http://arr.gr/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </div>
]]></content:encoded>
			<wfw:commentRss>http://arr.gr/blog/2011/12/my-php-streams-api-article-was-published-by-phparchitect/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Replacing a lost SSH key on an Amazon EC2 machine</title>
		<link>http://arr.gr/blog/2011/11/replacing-a-lost-ssh-key-on-an-amazon-ec2-machine/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=replacing-a-lost-ssh-key-on-an-amazon-ec2-machine</link>
		<comments>http://arr.gr/blog/2011/11/replacing-a-lost-ssh-key-on-an-amazon-ec2-machine/#comments</comments>
		<pubDate>Wed, 23 Nov 2011 18:18:11 +0000</pubDate>
		<dc:creator>shahar</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Linux & FOSS]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[keypair]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[oops]]></category>
		<category><![CDATA[ssh]]></category>
		<category><![CDATA[tip]]></category>

		<guid isPermaLink="false">http://arr.gr/blog/?p=180</guid>
		<description><![CDATA[Due to an unfortunate shmelting accident (read: poor backup practices), I lost the SSH private key granting me the only way to access one of my EC2 hosted servers. Being unable to access the server, and unable to easily set a new public key through Amazon&#8217;s interfaces, I panicked for a few seconds. Then I [...]]]></description>
				<content:encoded><![CDATA[<p>Due to <a href="http://www.youtube.com/watch?v=sr0gNJ090JA" target="_blank">an unfortunate shmelting accident</a> (read: poor backup practices), I lost the SSH private key granting me the only way to access one of my EC2 hosted servers. Being unable to access the server, and unable to easily set a new public key through Amazon&#8217;s interfaces, I panicked for a few seconds. Then I started trying to hack my way in, and eventually found a way to set a new public key to my user. Here is what I did.</p>
<p>First, know that I was lucky: for this method to work properly, you need a few things:</p>
<ul>
<li>The machine must be EBS based</li>
<li>You need to be able to afford a couple of minutes of downtime</li>
<li>You need to be able to withstand the effects of restarting the machine &#8211; for example, if you do not have an Elastic IP address associated with the machine, its public address will change. In some situations this is not acceptable.</li>
</ul>
<p>After trying some different approaches, what worked for me was to do the following:</p>
<ol>
<li>Generate a new keypair for yourself, and import the public key to your EC2 account</li>
<li>Start a new, clean, cheap machine (this will only be needed to do very simple things, so I recommend using a <em>tiny</em> machine) in the same availability zone as the affected machine</li>
<li>Stop the affected machine (do not terminate, STOP it &#8211; this is only possible with EBS machines)</li>
<li>Detach the root device from the affected machine (by default attached as /dev/sda1)</li>
<li>Attach the detached device to the new clean machine</li>
<li>SSH into the clean machine and mount the affected machine&#8217;s root filesystem somewhere (e.g. in /mnt/fs)</li>
<li>Now you can edit /mnt/fs/root/.ssh/authorized_keys (or on official Ubuntu machines /home/ubuntu/.ssh/authorized_keys) and add your new public key to it</li>
<li>Unmount the volume and terminate the clean machine &#8211; you no longer need it</li>
<li>Re-attach the root device to the affected machine (which should be stopped) &#8211; make sure to attach it as the same device it was before (e.g. /dev/sda1)</li>
<li>Re-start your old machine &#8211; you should now be able to use your new key!</li>
</ol>
<p>Another approach, which might work but which I gave up on after a couple of attempts (I think it really depends on the init scripts of the machine you are using), is to stop the machine, change its User Data to a shell script that sets a new public key in the right place, and then start it again.</p>
<p>And really, you should back up your keys!</p>
]]></content:encoded>
			<wfw:commentRss>http://arr.gr/blog/2011/11/replacing-a-lost-ssh-key-on-an-amazon-ec2-machine/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
