PHP Memory Usage & Unnecessary String Concatenation

As PHP developers, especially if, like me, you don’t come from a hard-core Comp Sci background, we are initially trained not to worry about memory. We do not allocate it, do not release it – in fact we rarely even worry about closing files and DB connections, and we hardly ever care, as many C programmers would, about the real-memory size of the different variable types we use. That’s a good thing too – in Web environments, these things tend to be negligible and putting time into optimizing them would be a wasteful micro-optimization.

However, it is also important to be mindful of the fact that most PHP servers are memory-bound. That is, a Web server running Apache or nginx and a pool of PHP processes (whether these are `php-fpm` processes or mod_php Apache forks) is most likely limited by how much memory is available to spawn more PHP processes. The number of concurrent PHP processes directly determines the number of concurrent requests a server can handle. On its own, each PHP process takes a few (I would guess 5-15) MB of private memory, but very often the application code requires it to allocate many, many more – just try running your app with a low memory_limit setting (the default is 128M) and see what happens. Memory allocations have a lot of impact on PHP speed (as recent phpng benchmarks show), but speed aside, it’s important to remember that memory hogging directly impacts your app’s hardware requirements.

For that reason, I think it’s a good idea to come up with a list of good memory utilization practices for PHP – we already have such “checklists” for security and for speed optimization, and I think that while micro-optimizations are usually worthless in the real world, following good practices when writing new code can save you the occasional meltdown. One good practice I’m going to suggest today is being mindful about where it is appropriate to use the oh-so-common operation of string concatenation.
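To make the concern measurable, here is a minimal sketch that compares building one large string by concatenation with writing each piece out as soon as it is produced. It is an illustration under stated assumptions, not the post's own benchmark: the names generateRows, buildByConcatenation and writeIncrementally and the row count are made up, a temporary file stands in for the real output destination, and PHP 7.1 or later is assumed.

```php
<?php
// Hypothetical data source: yields one small string at a time.
function generateRows(int $count): Generator
{
    for ($i = 0; $i < $count; $i++) {
        yield "row number $i\n";
    }
}

// Approach 1: concatenate everything into one string before using it.
// Returns [bytes produced, extra memory held while the string is alive].
function buildByConcatenation(int $count): array
{
    $before = memory_get_usage();
    $buffer = '';
    foreach (generateRows($count) as $row) {
        $buffer .= $row;
    }
    return [strlen($buffer), memory_get_usage() - $before];
}

// Approach 2: write each piece to its destination as soon as it exists.
// Returns [bytes written, extra memory held at the end of the loop].
function writeIncrementally(int $count, $stream): array
{
    $before = memory_get_usage();
    $written = 0;
    foreach (generateRows($count) as $row) {
        $written += fwrite($stream, $row);
    }
    return [$written, memory_get_usage() - $before];
}

$stream = tmpfile(); // stand-in for a socket, a file or php://output
[$concatLength, $concatMemory] = buildByConcatenation(100000);
[$streamLength, $streamMemory] = writeIncrementally(100000, $stream);
fclose($stream);

printf("concatenation: %d bytes produced, %d bytes of extra memory\n", $concatLength, $concatMemory);
printf("incremental:   %d bytes produced, %d bytes of extra memory\n", $streamLength, $streamMemory);
```

Both approaches produce the same bytes; the difference is that the first one has to hold all of them in the PHP process at once.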

Serving ZF apps with the PHP 5.4 built-in Web Server

When teaching PHP to newcomers, I have found that (honestly to my surprise) one of the biggest barriers you have to cross is setting the stack up to serve PHP files properly, especially when it comes to Zend Framework apps and other rewrite rule based MVC applications. Even for people with a strong development background, the idea of setting up Web server configuration to get things working seems foreign.

Even as an experienced developer with good knowledge of the LAMP stack setup, setting up new vhosts and other configuration for each new project is sometimes a pain in the ass.

There is of course good news – starting from PHP 5.4, the Command Line Interface (CLI) version of PHP comes with a built-in Web server that can be used to serve PHP apps in development. This Web server is very easy to use – you just fire it up in the right place and it works, serving your PHP files. While it is by no means a viable production solution (it is a sequential, no-concurrency server, meaning it will only serve one request at a time), it is very convenient for development purposes.

While it “just works” for simple “1-to-1 URL <-> File” apps, it can work almost as easily for rewrite based MVC apps, including Zend Framework 1.x and 2.x apps and probably for other frameworks as well.
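As a minimal sketch of how a rewrite based app can be served this way, the built-in server accepts an optional router script that runs for every request. The file name router.php, the helper isStaticFile and the public/index.php front controller location are assumptions for illustration; adjust them to your project layout.

```php
<?php
// router.php (hypothetical file name), placed in the project root.
// Assumes the front controller lives at public/index.php.

// Decide whether a request maps to a real file under the document root.
function isStaticFile(string $docRoot, string $requestUri): bool
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    return is_string($path) && $path !== '/' && is_file($docRoot . $path);
}

// Only act as a router when running under the built-in Web server.
if (PHP_SAPI === 'cli-server') {
    if (isStaticFile(__DIR__ . '/public', $_SERVER['REQUEST_URI'])) {
        return false; // let the built-in server serve the file as-is
    }
    // Everything else goes to the front controller, like a rewrite rule would
    require __DIR__ . '/public/index.php';
}
```

With that file in place, running php -S localhost:8080 -t public router.php from the project root should serve the app on port 8080 (the host and port here are arbitrary).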


On PHP Extensions

While teaching PHP I mention the term “extension” quite a lot – but I have realized that this may very well be a confusing term for non-PHPers. While most PHP training courses focus on code, I believe getting to know the PHP engine and environment is almost as important as learning to use the language. Extensions are a big and important part of PHP, but there seems to be a big knowledge gap about them, and Googling for articles that explain what extensions are, and how they are used and installed, hardly returns good results. So, as part of my attempt to blog about more basic PHP topics, I’ve decided to try and come up with an overview of PHP extensions.

So what are PHP extensions?

An extension in PHP is in fact a module providing some functionality to the PHP Engine. While the term makes it sound like extensions provide some kind of special functionality, in reality many of the language’s most basic functions and classes are provided by extensions. Many extensions are shipped as part of the default PHP distribution, and some are in fact compiled into PHP in such a way that they cannot even be unloaded. Come to think of it, perhaps it is best to think of PHP extensions as “language modules”.
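A quick way to see this for yourself is to ask PHP which extensions it has loaded and what they provide. The sketch below uses only functions that ship with PHP; the choice of json and str_replace() as examples is arbitrary.

```php
<?php
// List every extension compiled into, or loaded by, this PHP binary.
$loaded = get_loaded_extensions();
echo count($loaded) . " extensions loaded\n";

// Check for a specific extension before relying on it.
$hasJson = extension_loaded('json');
echo 'json extension is ' . ($hasJson ? 'available' : 'missing') . "\n";

// Even very basic functions come from an extension: str_replace() is
// provided by the always-present "standard" extension.
$standardFunctions = get_extension_funcs('standard');
$provided = in_array('str_replace', $standardFunctions, true);
echo 'str_replace() comes from "standard": ' . ($provided ? 'yes' : 'no') . "\n";
```

On the command line, php -m prints the same list of loaded extensions.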


Password hashing revisited

From user comments on my recent password hashing post, I’ve learned about a better solution for password hashing – rather than using hashing algorithms designed to be fast, such as SHA-1 and SHA-256, use slower and, more importantly, future-adaptable algorithms such as bcrypt. I have to say this is one of the reasons I love this community – you always learn new things.

I won’t repeat the reasons why methods such as bcrypt are preferred (read the comments on the previous post to learn why). However, I will note that starting from PHP 5.3 bcrypt is in fact built into PHP – so if you do not require portability to older versions of PHP, bcrypt hashing can be done very easily, using the useful but somewhat enigmatic crypt function:
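A minimal sketch of what such a crypt() call can look like follows; it is not the post's own code. The helper names bcryptHash and bcryptVerify are made up, the "$2y$" prefix assumes PHP 5.3.7 or later, and random_bytes() and hash_equals() assume PHP 7 or later.

```php
<?php
// Hypothetical helper: hash a password with bcrypt through crypt().
function bcryptHash(string $password, int $cost = 10): string
{
    // bcrypt salts are 22 characters from the alphabet ./A-Za-z0-9
    $salt = substr(strtr(base64_encode(random_bytes(16)), '+', '.'), 0, 22);

    // "$2y$" selects bcrypt; the two-digit cost is the base-2 logarithm of
    // the iteration count, so raising it by one doubles the work
    return crypt($password, sprintf('$2y$%02d$%s', $cost, $salt));
}

// Hypothetical helper: check a password against a stored bcrypt hash.
function bcryptVerify(string $password, string $hash): bool
{
    // Passing the stored hash as the salt makes crypt() reuse the same
    // algorithm, cost and salt; hash_equals() compares in constant time
    return hash_equals($hash, crypt($password, $hash));
}

$hash = bcryptHash('correct horse battery staple');
var_dump(bcryptVerify('correct horse battery staple', $hash));
var_dump(bcryptVerify('wrong password', $hash));
```

On PHP 5.5 and later, password_hash() and password_verify() wrap the same mechanism and are usually the simpler choice.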


Generating ZF Autoloader Classmaps with Phing

One of the things I’ve quickly discovered when working on Shoppimon is that we need a build process for our PHP frontend app. While the PHP files themselves do not require any traditional “build” step such as processing or compilation, there are a lot of other tasks that need to happen when taking a version from the development environment to staging and to production: among other things, our build process picks up and packages only the files needed by the app (leaving out things like documentation, unit tests and local configuration overrides), minifies and pre-compresses CSS and JavaScript files,  and performs other helpful optimizations on the app, making it ready for production.

Since Shoppimon is based on Zend Framework 2.0, it also heavily relies on the ZF 2.0 autoloader stack. Class autoloading is convenient, and was shown to greatly improve performance over using require_once calls. However, different autoloading strategies have pros and cons: while PSR-0 based autoloading (the so-called Standard Autoloader from ZF1 days) works automatically and doesn’t require updating any mapping code for each new class added or renamed, it has a significant performance cost compared to classmap based autoloading.

Fortunately, using ZF2’s autoloader stack and Phing, we can enjoy both worlds: while in development, standard PSR-0 autoloading is used and the developer can work smoothly without worrying about updating class maps. As we push code towards production, our build system takes care of updating class map files, ensuring super-fast autoloading in production using the ClassMapAutoloader. How is this done? Read on to learn.
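To show where the difference between the two strategies comes from, here is a minimal plain-PHP sketch. It is not ZF2's StandardAutoloader or ClassMapAutoloader code; psr0Path and registerClassMap are hypothetical names, and the map is assumed to be a simple array of class name to file path, such as a build step could generate.

```php
<?php
// PSR-0 style lookup: every miss costs string manipulation plus a file
// system check, because the path is derived from the class name.
function psr0Path(string $baseDir, string $class): string
{
    $class = ltrim($class, '\\');
    $namespace = '';
    $pos = strrpos($class, '\\');
    if ($pos !== false) {
        // Namespace separators map to directory separators
        $namespace = str_replace('\\', '/', substr($class, 0, $pos)) . '/';
        $class = substr($class, $pos + 1);
    }
    // Underscores in the class name (not the namespace) map to directories too
    return $baseDir . '/' . $namespace . str_replace('_', '/', $class) . '.php';
}

// Class map lookup: a single array access, no guessing and no probing.
function registerClassMap(array $map): void
{
    spl_autoload_register(function (string $class) use ($map): void {
        if (isset($map[$class])) {
            require $map[$class];
        }
    });
}

echo psr0Path('/path/to/library', 'Zend\\Loader\\ClassMapAutoloader'), "\n";
```

The price of the class map is that it has to be regenerated whenever a class is added, moved or renamed, which is why generating it at build time is attractive.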


Say Hi to Shoppimon – Magento Monitoring for “Normal” People

For a while now I have been telling people I am “working on a small project” – and now is the time to unveil the mystery and introduce Shoppimon – a new start-up which I founded together with a small group of friends, and am currently spending most of my time around.

The idea of Shoppimon is simple – we want to provide Web monitoring and availability analysis which will be usable by, and useful to, “normal” people – not only the tech guy, the programmer or the IT specialist, but the site owner, the business owner or even the marketing guy – in other words, the real stakeholder.


Storing Passwords the Right Way

I consider this post a bit of an experiment in writing about what I consider “beginner” material. Not that it is necessarily simple or easy stuff anyone should know, but simply because this is not a “new discovery” as far as I am concerned. Also, I usually try not to write about security related material, as I do not consider myself a security expert. However, since I’m starting to teach a “PHP 101” course soon (maybe I’ll post more about it in the next few weeks), and since I was asked a few times about this topic recently, I’ve decided to write up my experience on this topic and test the reactions.

So, the topic in question is “what is the right way to store user passwords in my DB”. To be clear, I am talking specifically about the passwords users will use to log in to your application, not some 3rd party password you need to store for whatever reason. This is something almost any application out there requires – unless you interface with some external authentication mechanism (OAuth, OpenID, your office LDAP or Kerberos server), there’s a very high chance you’ll need to authenticate users against a self-stored user name and password.


My PHP Streams API article was published by php|architect

php|architect, one of the most prominent professional PHP magazines in the world, has published an article I wrote about PHP’s user-space Streams API in its December 2011 issue:

Go with the Flow: PHP’s Userspace Streams API

Almost every PHP application out there needs to read data from files or write data to files – or things that look like files but are not quite files – these unstructured blobs of data are commonly referred to as “streams”. Stream functions allow a scalable, portable and memory efficient way to handle data, and pretty much any PHP developer out there knows how to read data from or write data to a stream. The best part is that you don’t have to be an extension author in order to provide access to any data source as if it was just a regular file. PHP’s userspace streams API allows you to do exactly that, and this article will show you how.
If you’re a subscriber, feel free to read the article and send me your feedback. If not, go ahead and buy the issue :)
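For a taste of what that looks like, here is a minimal sketch of a read-only userspace stream wrapper. It is not taken from the article: the upper:// scheme and the UpperCaseStream class are invented for illustration, and the wrapper implements only the handful of methods that reading needs.

```php
<?php
// Hypothetical wrapper: "upper://some text" reads back as "SOME TEXT".
class UpperCaseStream
{
    /** @var resource|null Set by PHP when a stream context is passed */
    public $context;

    private $data = '';
    private $position = 0;

    // Called by fopen(), file_get_contents() and friends
    public function stream_open($path, $mode, $options, &$openedPath)
    {
        // Everything after the scheme is treated as the payload
        $this->data = strtoupper((string) substr($path, strlen('upper://')));
        $this->position = 0;
        return true;
    }

    // Called repeatedly until stream_eof() reports true
    public function stream_read($count)
    {
        $chunk = (string) substr($this->data, $this->position, $count);
        $this->position += strlen($chunk);
        return $chunk;
    }

    public function stream_eof()
    {
        return $this->position >= strlen($this->data);
    }

    // Some file functions ask for stat information before reading
    public function stream_stat()
    {
        return ['size' => strlen($this->data)];
    }
}

stream_wrapper_register('upper', UpperCaseStream::class);

// From here on, regular file functions work on the new scheme
echo file_get_contents('upper://hello streams'), "\n";
```

A real wrapper would typically add stream_write(), stream_seek() and stream_close(), and fetch its data from an actual source rather than from the URL itself.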

XPath regular expression matching in PHP 5.3

Recently I needed to do some text pattern matching in an XML XPath query, and XPath’s built-in sub-string matching capabilities were not good enough.

While XPath 2.0 defines regular expression matching capabilities, it is still not widely implemented and in most available tools there is no easy way to do complex pattern matching on XML nodes.

Or is there?

In his blog Thomas Weinert recently gave an intro to using DOM and its XPath capabilities in PHP, but one of the cool features of DOM’s XPath, available starting from PHP 5.3.0 (have you upgraded yet?), is that the DOM extension supports registering pretty much any PHP function with the XPath engine, and using it inside XPath queries.

Here is a quick example showing usage of PHP’s own preg_match() in an XPath query, to find all the external links in Wikipedia’s PHP article:

// Suppress XML parsing errors (this is needed to parse Wikipedia's XHTML)
libxml_use_internal_errors(true);

// Load the PHP Wikipedia article
$domDoc = new DOMDocument();
$domDoc->load('http://en.wikipedia.org/wiki/PHP');

// Create XPath object and register the XHTML namespace
$xPath = new DOMXPath($domDoc);
$xPath->registerNamespace('html', 'http://www.w3.org/1999/xhtml');

// Register the PHP namespace if you want to call PHP functions
$xPath->registerNamespace('php', 'http://php.net/xpath');

// Register preg_match to be available in XPath queries 
//
// You can also pass an array to register multiple functions, or call 
// registerPhpFunctions() with no parameters to register all PHP functions
$xPath->registerPhpFunctions('preg_match');

// Find all external links in the article  
$regex = '@^http://[^/]+(?<!wikipedia.org)/@';
$links = $xPath->query("//html:a[ php:functionString('preg_match', '$regex', @href) > 0 ]");

// Print out matched entries
echo "Found " . (int) $links->length . " external links\n\n";
foreach($links as $linkDom) { /* @var $linkDom DOMElement */
    $link = simplexml_import_dom($linkDom);
    $desc = (string) $link;
    $href = (string) $link['href'];
    
    echo " - ";
    if ($desc && $desc != $href) {
        echo "$desc: ";
    } 
    echo "$href\n";
}

Note the use of php:functionString() as an XPath function, calling preg_match(). functionString() will pass XML entities such as @href as a string into the function, which is different from calling php:function() which, as far as I have seen, will pass parameters without casting them to a string first (however I am not sure what exactly they are passed as… maybe someone who knows can elaborate?).
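As far as I can tell, php:function() hands node sets to the callback as arrays of DOM node objects (DOMAttr objects in the case of @href), so the callback has to extract the value itself. A minimal sketch under that assumption follows; the callback name is_external_link and the inline XML are made up for illustration.

```php
<?php
// Hypothetical callback: receives the @href node set as an array of DOMAttr
function is_external_link(array $nodes): bool
{
    if (!isset($nodes[0])) {
        return false; // the element has no href attribute
    }
    $href = $nodes[0]->nodeValue;

    // External means: an http(s) URL whose host is not under wikipedia.org
    return preg_match('@^https?://(?![^/]*wikipedia\.org)@', $href) === 1;
}

$doc = new DOMDocument();
$doc->loadXML(
    '<p><a href="http://en.wikipedia.org/wiki/Perl">Perl</a>'
    . '<a href="http://php.net/">PHP</a><a>no link</a></p>'
);

$xPath = new DOMXPath($doc);
$xPath->registerNamespace('php', 'http://php.net/xpath');
$xPath->registerPhpFunctions('is_external_link');

// A boolean returned from PHP is used directly as the XPath predicate
$external = $xPath->query("//a[php:function('is_external_link', @href)]");

foreach ($external as $link) {
    echo $link->getAttribute('href'), "\n";
}
```

Working with the node objects is more verbose than functionString(), but it gives the callback access to everything DOM offers, such as the parent element or other attributes.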

Pretty useful huh?

Debugging CLI PHP with Zend Server and PDT on Linux and Mac

I’m working on a small PHP application, and a big part of it is a set of CLI scripts which will be executed in the background. Some of these scripts are quite complex, and I got to a point where I need to use a debugger in order to figure out what’s going on.

I started hacking around with my locally-installed Zend Server CE and Zend Studio. I always knew how to manually start CLI debug sessions with Zend Studio (well, I knew, but forgot ;-) ), but then I figured, why not write a small shell script to automate the process, and learn a little about the Zend Debugger protocol on the way?

Here is what I did:

First, create the following shell script. I placed it at /usr/local/zend/bin/php-dbg (alongside the other Zend Server executables, which if you use Mac OS X will be at /Applications/ZendServer/bin):

#!/bin/sh

# Wrapper script for debugging PHP CLI scripts with Zend Studio
# Tested with Zend Server 4.0.0 Beta and Zend Studio for Eclipse 6.1.1
# Shahar Evron [shahar.e at zend], 2009-02-20

# Defaults
DFLT_PORT="10137"
DFLT_HOST="127.0.0.1"
DFLT_PARAMS="debug_fastfile=1&use_tunneling=0"

# Load Zend Server environment variables
. /etc/zce.rc

# Did the user specify the debug host / port?
if test "x$DEBUG_HOST" != "x"; then
  if test "x$DEBUG_PORT" != "x"; then
    QUERY_STRING="&debug_port=$DEBUG_PORT"
  else
    QUERY_STRING="&debug_port=$DFLT_PORT"
  fi

  QUERY_STRING="$QUERY_STRING&debug_host=$DEBUG_HOST&$DFLT_PARAMS"

# If no host/port were specified, try to auto-detect
else
  QUERY_STRING=`wget http://localhost:20080/ -O - 2> /dev/null`
  if test $? -ne 0; then
    # Fall back to defaults
    echo "Unable to auto-detect Zend Studio settings, using defaults" >&2
    QUERY_STRING="&debug_port=$DFLT_PORT&debug_host=$DFLT_HOST&$DFLT_PARAMS"
  fi
fi

DBG_SESS_ID=`date +%s`
QUERY_STRING="start_debug=1&debug_stop=1$QUERY_STRING&debug_session_id=$DBG_SESS_ID" 

# Quote "$@" so that script arguments containing spaces are passed through intact
QUERY_STRING="$QUERY_STRING" "$ZCE_PREFIX/bin/php" -c "$ZCE_PREFIX/etc/php.ini" "$@"

Going over this code might teach you some surprising things about how Zend Debugger and Zend Studio talk to each other ;) I’m not going to go into the details now, but if you have questions feel free to ask.
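The short version is that the wrapper script above hands the debugger its settings through the QUERY_STRING environment variable, as if they were request parameters. A minimal PHP sketch of the same string assembly follows; the function name buildDebugQueryString is made up, while the parameter names and defaults are copied from the script.

```php
<?php
// Hypothetical helper mirroring what the shell wrapper assembles.
function buildDebugQueryString(
    string $host = '127.0.0.1',
    int $port = 10137,
    ?int $sessionId = null
): string {
    return http_build_query([
        'start_debug'      => 1,
        'debug_stop'       => 1,
        'debug_port'       => $port,
        'debug_host'       => $host,
        'debug_fastfile'   => 1,
        'use_tunneling'    => 0,
        // The shell script uses the current Unix timestamp (date +%s)
        'debug_session_id' => $sessionId ?? time(),
    ]);
}

echo buildDebugQueryString('10.1.2.3'), "\n";
```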

Next, make this script executable – just run ‘chmod +x <path-to-script>’ – and you’re good to go.

Here is how to use it:

  • If you have PDT or Zend Studio running locally (on the same machine as the server), just run:

    # /usr/local/zend/bin/php-dbg <script you want to debug>

    That would just work in most cases – if it works you can stop reading now ;-)
  • If you are running the script on a server, but your PDT / Zend Studio is on a different machine (in the same LAN – no NAT or firewall!) you can simply specify the IP address or host name of the machine that runs PDT / Zend Studio as the DEBUG_HOST environment variable. For example:

    # DEBUG_HOST=10.1.2.3 /usr/local/zend/bin/php-dbg <script you want to debug>
  • If you are running the script on a remote machine (as above) and your Zend Studio listens on a port other than 10137, you can also pass the DEBUG_PORT environment variable to override the default port.
  • Also, don’t forget to make sure that the machine that runs your Zend Studio is in the list of allowed debugging clients. You can check it at the Zend Server GUI on Server Setup -> Debugger.
  • If you are running the script on a remote host and there’s a firewall / NAT between you and the server (e.g. you are in an office LAN, trying to debug a script on a remote production machine which is not in your subnet) you’ll probably need to use SSH remote port forwarding to forward connections to your PDT / Zend Studio. I won’t get into how to do it right here – unless you insist.
  • If you want to type only ‘php-dbg’ instead of the full path, you can place the file in your $PATH (e.g. in /usr/local/bin) or, even better, add /usr/local/zend/bin (or /Applications/ZendServer/bin) to your $PATH – you can do that by adding the following line to ~/.bashrc:

    PATH=$PATH:/usr/local/zend/bin

Upon running the script, a debug session should simply pop up in your PDT / Studio and you’ll be able to debug. How cool is that?

BTW: This has been tested with Zend Server 4.0.0 beta1 and Zend Studio 6.1.1. It should work with other versions of Studio as well. In fact, it can also work without Zend Server as long as you have Zend Debugger installed – but why ruin a perfectly good plug?

If you improve the script or find bugs, let me know! Also, if you know how to get the same thing going with Xdebug, let me know and I’ll add it to the script.