Lossless compression of PHP files

19 Jul 2019 - by 'Maurits van der Schee'

How small can a PHP file get? I was wondering about this while building PHP-CRUD-API, a full-featured API in a single PHP file. PHP has a really nice feature called '__halt_compiler', which allows you to store arbitrary data, such as compressed content, in your PHP file. In the code below I'm applying that technique to PHP code itself. This allows you to reduce the size of a PHP file without losing any of its functionality.

Why would you do this?

The file gets smaller, but executing it will be slower, as it needs to be uncompressed before it can be executed. I also think the opcode cache may have more trouble optimizing your code when you uncompress it on-the-fly. Despite these downsides, some people use similar techniques for obfuscating the source code they publish. So I guess you could use this script to prevent people from easily reading your source code. I have not found any other (good) use case (yet).

How it works

The PHP code contains a '__halt_compiler' statement. This is the separator between the code and the data part of the file. The code part is able to read the data (from its own file) using the '__FILE__' constant. This is possible because there is a convenient '__COMPILER_HALT_OFFSET__' constant, which indicates the byte position in the file where the data starts, directly after the '__halt_compiler' statement.
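
To see the separator at work without any compression involved, here is a minimal sketch. It assumes a throwaway file created with 'tempnam' and a plain-text payload (both are just for illustration and not part of the script below): the generated file returns everything that follows its own '__halt_compiler' statement.

```php
<?php
// Illustrative payload: any bytes may follow the __halt_compiler statement.
$payload = "any bytes can go here";

// Code part of the generated file: read our own file, starting at the data offset.
$stub = '<?php return file_get_contents(__FILE__, false, null, __COMPILER_HALT_OFFSET__);'
    . "\n" . '__halt_compiler();';

// Write code part + data part to a throwaway file (hypothetical location).
$path = tempnam(sys_get_temp_dir(), 'halt');
file_put_contents($path, $stub . $payload);

// Including the file runs the code part, which returns the data part.
$data = include $path;
unlink($path);

echo $data, "\n";
```

The parser stops at the statement, so the payload is never interpreted as PHP, no matter what it contains.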

The data part of the file is actually the original PHP source code, compressed with 'gzdeflate'. The code part is responsible for uncompressing and executing it. To execute the code, it gets written to a temporary file and then executed using an 'include' statement. This method is more robust than calling 'eval' in terms of error handling, and it also means that we don't have to make any changes to the code (such as stripping PHP open/close tags).
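
The uncompress-and-include step can be tried in isolation. This is a minimal sketch, assuming the zlib extension is loaded and using a made-up one-line source string; note that the source keeps its opening PHP tag, which 'eval' would not accept as-is.

```php
<?php
// Made-up "original" source, including its opening tag.
$source = "<?php\nreturn 6 * 7;\n";
$compressed = gzdeflate($source);

// Write the uncompressed source to a temporary file and look up its path.
$tmp = tmpfile();
$path = stream_get_meta_data($tmp)['uri'];
fwrite($tmp, gzinflate($compressed));

// Execute it like any other PHP file; closing the handle removes the file.
$result = include $path;
fclose($tmp);

echo $result, "\n";
```

Because the code runs from a real file, errors are reported with a file name and line number, just like for a normal include.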

How to use it

Copy the code below to a file called 'compress.php' and then on the command line run it like this:

$ php compress.php api.php 
compressed 'api.php' from 259 to 41 kbyte (16%)

As you can see, compressing 'api.php' reduces its file size to 16% of the original, which is typical for PHP code and the 'deflate' algorithm. And now run it again:

$ php compress.php api.php 
uncompressed 'api.php' from 41 to 259 kbyte (624%)

The script detects that it is operating on an already compressed file (it looks for the '__halt_compiler' statement) and automatically switches to decompression mode. Since the compression is lossless, you will end up with the exact same file you started with.
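
The lossless round trip and the percentage calculation are easy to check on their own. A small sketch, assuming the zlib extension is loaded and using a made-up, highly repetitive input (so the ratio it reports says nothing about real PHP files):

```php
<?php
// Made-up input: repetitive text compresses far better than real code.
$original = str_repeat("if (\$x) { return \$y; }\n", 200);

$compressed = gzdeflate($original);
$restored = gzinflate($compressed);

// Same calculation the script uses for its report.
$percentage = (int) ((strlen($compressed) * 100) / strlen($original));

echo "compressed to $percentage% of the original\n";
var_dump($restored === $original);
```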

The code

It is a small script (less than 40 lines), and it should be easy to copy/paste and adjust to your needs:

<?php
if (!isset($argv[1])) {
    echo "Usage: php {$argv[0]} [filename]\n";
    exit(1);
}
$filename = $argv[1];
if (!file_exists($filename)) {
    echo "ERROR: file '$filename' not found\n";
    exit(1);
}
$content = file_get_contents($filename);
$before = strlen($content);
if (strpos($content, '__halt_compiler();') !== false) {
    $action = 'uncompressed';
    // split only once: the compressed bytes may contain the marker too
    $content = explode('__halt_compiler();', $content, 2)[1];
    $content = gzinflate($content);
} else {
    $action = 'compressed';
    $content = gzdeflate($content);
    $start = <<<'EOF'
$f = fopen(__FILE__, 'r');
fseek($f, __COMPILER_HALT_OFFSET__);
$t = tmpfile();
$u = stream_get_meta_data($t)['uri'];
fwrite($t, gzinflate(stream_get_contents($f)));
include($u);
fclose($t);
__halt_compiler();
EOF;
    $content = '<?php ' . str_replace([' ', "\n"], '', $start) . $content;
}
$after = strlen($content);
file_put_contents($filename, $content);
$percentage = (int) (($after * 100) / $before);
$before = (int) ($before / 1024);
$after = (int) ($after / 1024);
echo "$action '$filename' from $before to $after kbyte ($percentage%)\n";

You can also find the code on GitHub:

https://github.com/mevdschee/compress.php

Enjoy!
