PHP Memory Usage & Unnecessary String Concatenation

As PHP developers, especially if like me you don’t come from a hard-core Comp Sci background, we are initially trained not to worry about memory. We do not allocate it and do not release it – in fact we rarely even worry about closing files and DB connections, and we hardly ever care, as many C programmers would, about the real in-memory size of the different variable types we use. That’s a good thing too – in Web environments these things tend to be negligible, and putting time into optimizing them would be a wasteful micro-optimization.

However, it is also important to be mindful of the fact that most PHP servers are memory-bound. That is, a Web server running Apache or nginx and a pool of PHP processes (whether these are `php-fpm` processes or mod_php Apache forks) is most likely limited by how much memory is available to spawn more PHP processes. The number of concurrent PHP processes directly determines the number of concurrent requests a server can handle. On its own, each PHP process takes a few (I would guess 5-15) MB of private memory, but very often the application code requires it to allocate many, many more – just try to run your app with a low `memory_limit` setting (the default is 128M) and see what happens. Memory allocations have a lot of impact on PHP speed (as recent phpng benchmarks show), but speed aside, it’s important to remember that memory hogging directly impacts your app’s hardware requirements.
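One way to try this without editing php.ini is to lower the limit at runtime, at the top of a script, and print what PHP reports. This is only a rough sketch: the `32M` value is an arbitrary choice of mine, and the rest of your application code is assumed to run after these lines.

```php
<?php
// Lower the limit for this run only; '32M' is an arbitrary, illustrative value.
ini_set('memory_limit', '32M');

// Report the effective limit and how much memory the script uses so far.
printf("memory_limit:  %s\n", ini_get('memory_limit'));
printf("current usage: %s bytes\n", number_format(memory_get_usage()));
printf("peak usage:    %s bytes\n", number_format(memory_get_peak_usage()));
```

If the application then needs more than the limit, PHP stops the script with a fatal "Allowed memory size ... exhausted" error, which is a quick way to find out how much it really allocates.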

For that reason, I think it’s a good idea to come up with a list of good practices for memory utilization in PHP – we already have such “checklists” for security and for speed optimization, and I think that while micro-optimizations are usually worthless in the real world, following good practices when writing new code can save you the occasional meltdown. One good practice I’m going to suggest today is being mindful about where it is appropriate to use the oh-so-common operation of string concatenation.

I’d like to start with an example:

The script reads 10 MB of data from a file, then writes more or less the same data back to a different file. The key problem is in line 18, where the data is written to the output file (the same applies if just using `echo` or `print()`): it is very common in PHP code to use concatenation when writing or printing strings. You wouldn’t expect the script to allocate much more than 10 MB of PHP memory for that purpose, but here is its output:

nanook:~ shahar$ php /tmp/test.php
Peak memory usage: 21,204,792 bytes

Hmm... double the expected memory. Why is that? Well, what happens in line 18 is that a new (temporary) variable is created to contain the result of concatenating $data and `"\n"`. $data is not modified in place of course, so now we have the original $data and a copy of it with an added LF byte, which is used very briefly to be written out to a file and then freed. That’s 10 MB of copied data, allocated for a very brief time and a very limited need. Of course, if we spend a moment of thought on this code we realize that the concatenation is completely unnecessary, and with a very small change we can do this:

nanook:~ shahar$ php /tmp/test.php
Peak memory usage: 10,727,320 bytes

So, one more line of code, but half the memory usage. Not having to allocate, copy and de-allocate a second $data also means faster execution in this case. And all thanks to avoiding the unnecessary concatenation.
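To make the pattern concrete, here is a minimal sketch of the two approaches side by side. It is not the test script whose output is shown above: it assumes a 10 MB string built with `str_repeat()` in place of the input file and a temporary output file created with `tempnam()`, and the variable names are mine. Because `memory_get_peak_usage()` never goes down, the separate writes run first and the concatenated write second.

```php
<?php
// Illustrative sketch; the payload size and temp file are assumptions.
$data = str_repeat('x', 10 * 1024 * 1024);   // stand-in for data read from a file
$path = tempnam(sys_get_temp_dir(), 'concat');
$out  = fopen($path, 'w');

// Variant A: two writes, no temporary string is built.
fwrite($out, $data);
fwrite($out, "\n");
$peakSeparate = memory_get_peak_usage();

// Variant B: the concatenation builds a second string of about 10 MB
// before fwrite() is even called, and frees it right afterwards.
fwrite($out, $data . "\n");
$peakConcat = memory_get_peak_usage();

fclose($out);
unlink($path);

printf("Peak after separate writes: %s bytes\n", number_format($peakSeparate));
printf("Peak after concatenation:   %s bytes\n", number_format($peakConcat));
```

The exact figures depend on the PHP version and build, but the second peak should be higher than the first by roughly the size of the payload.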

There are many similar patterns which are very common in PHP code, and I’ve made many such mistakes in the past. Most of the time they are negligible, but every now and then, when dealing with potentially large data (like the contents of files or network / HTTP responses), they are not negligible at all. While I don’t recommend combing your existing code for such patterns, being mindful of unnecessary concatenations while writing new code can be a good idea.

It’s also a good idea to be aware of a small and somewhat lesser-known feature of PHP’s `echo` language construct: you can give it multiple, comma-separated arguments to print out. In fact, the vast majority of uses of the concatenation operator within `echo` arguments are wasteful. Line 10 in the above code is a good example: it could have been written as a single concatenated string, but there’s no need to. The arguments are echoed one by one to the output and no memory is copied, all thanks to a single character change.
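As a small sketch of the difference (the message is modelled on the output shown earlier, and the output buffering is there only so the two results can be compared), both forms below print exactly the same bytes, but only the first one builds a combined string in memory before printing:

```php
<?php
// Sketch only: $peak is just a convenient number to print.
$peak = memory_get_peak_usage();

// Concatenation: PHP first builds one combined string, then echoes it.
ob_start();
echo 'Peak memory usage: ' . number_format($peak) . " bytes\n";
$concatenated = ob_get_clean();

// Comma-separated arguments: each piece is echoed in turn, nothing is joined.
ob_start();
echo 'Peak memory usage: ', number_format($peak), " bytes\n";
$separate = ob_get_clean();

// Compare the captured output of the two forms.
var_dump($concatenated === $separate);
```

With three short strings the saving is of course invisible; it only starts to matter when one of the arguments is large.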
