efficient JS minification using PHP

Here’s a method to create minified JS files on-the-fly, without incurring the time cost of the minification process.

To explain, minification is the act of removing comments and unnecessary whitespace from JavaScript so the file can be downloaded and parsed as quickly as possible.

A useful side effect of minification is that while compiling your minified source, you can also pull in other JavaScript files and compile them all into one single source. The major advantage is that there is only one file to download. This is very important because even when the individual files are small, the lag involved in sending multiple HTTP requests can sometimes be worse than downloading a single file twice the total size of all the small files (which is why good web devs use CSS sprites when possible).
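To put rough numbers on that claim, here is a back-of-the-envelope sketch. The latency (100 ms per request), bandwidth (1000 KB/s), file sizes, and the function name are all assumptions chosen for illustration, and the model assumes requests are made one after another. It uses PHP 7 type declarations.

```php
<?php
// Rough model: each request pays one round trip of latency, plus the time
// needed to transfer the bytes. Every number here is an assumption.
function estimated_download_seconds(int $requests, float $totalKb, float $latencySeconds, float $kbPerSecond): float
{
    return $requests * $latencySeconds + $totalKb / $kbPerSecond;
}

// Seven 10 KB files versus one 140 KB file (twice the combined size).
$separate = estimated_download_seconds(7, 70.0, 0.1, 1000.0);
$combined = estimated_download_seconds(1, 140.0, 0.1, 1000.0);
printf("separate: %.2fs, combined: %.2fs\n", $separate, $combined);
// prints "separate: 0.77s, combined: 0.24s"
```

Under these assumptions the single larger file still wins, because the per-request latency dominates the transfer time.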

Right. That’s the explanation of /why/ to do it. Now how?

First off, here’s a very simple piece of source to show it (in a file called “all.php”).

  $js=file_get_contents('jquery-1.2.3.min.js');
  $js.=file_get_contents('jquery.dimensions.pack.js');
  $js.=file_get_contents('jquery.impromptu.js');
  $js.=file_get_contents('jquery.iutil.pack.js');
  $js.=file_get_contents('jquery.idrag.js');
  $js.=file_get_contents('jquery.grid.columnSizing.js');
  $js.=file_get_contents('jquery.tablesorter.js');
  require 'jsmin-1.1.1.php';
  $js=JSMin::minify($js);
  echo $js;

You can get jsmin-php from here. The other files are jQuery files, used only for illustrative purposes.
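The repeated file_get_contents() calls could also be folded into a small helper. This is only a sketch: the function name concat_scripts() and the ";\n" separator are my own additions, not part of the script above. The separator is a common safeguard against a file that ends without a semicolon running into the next one. It assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: join several script files into one string.
// The ";\n" between files is an added safeguard, not part of the original script.
function concat_scripts(array $files, string $dir = './'): string
{
    $parts = [];
    foreach ($files as $file) {
        $source = file_get_contents($dir . $file);
        if ($source === false) {
            throw new RuntimeException("Cannot read $dir$file");
        }
        $parts[] = $source;
    }
    return implode(";\n", $parts);
}

// Usage, with the file names from the listing above:
// $js = concat_scripts(['jquery-1.2.3.min.js', 'jquery.dimensions.pack.js']);
```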

Anyone spot the problem? It does the job, yes, but the minify() script takes a few seconds to run, ruining the advantage it created with the minification.

The next step towards perfection is to cache the file. The obvious solution is to save the minified output to a file on disk. If that file exists the next time the scripts are requested, serve the cached version instead of going through the minification process again.
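A minimal sketch of that naive cache might look like this. The function name cached_output() and the idea of passing the build step as a callable are assumptions made for illustration; it assumes PHP 7 or later.

```php
<?php
// Hypothetical sketch of the naive cache: build once, then reuse the saved copy.
// $build is any callable that returns the finished JavaScript as a string.
function cached_output(string $cacheFile, callable $build): string
{
    if (file_exists($cacheFile)) {
        return file_get_contents($cacheFile);
    }
    $js = $build();
    file_put_contents($cacheFile, $js);
    return $js;
}

// Usage (the cache path is a placeholder):
// echo cached_output('/path/to/caches/all.js', function () {
//     require 'jsmin-1.1.1.php';
//     return JSMin::minify(file_get_contents('jquery.tablesorter.js'));
// });
```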

That’s not good enough, though. What if a file in the list was changed? You’d never know because the cached file will always be served.

The solution is to save the cache in a file named using an MD5 hash of the source files’ last-modified times (genius idea – I’d love to shake the author’s hand). Changing any file changes the hash, so a stale cache is simply never looked up. That way the cache will always be correct, and will automatically update itself when a file is changed.

$writabledir='/path/to/caches/';

function md5_of_dir($folder) {
  $dircontent = scandir($folder);
  $ret='';
  foreach($dircontent as $filename) {
    if ($filename != '.' && $filename != '..') {
      if (filemtime($folder.$filename) === false) return false;
      $ret.=date("YmdHis", filemtime($folder.$filename)).$filename;
    }
  }
  return md5($ret);
}

$name=md5_of_dir('./');
if(file_exists($writabledir.$name))readfile($writabledir.$name);
else{
  $js=file_get_contents('jquery-1.2.3.min.js');
  $js.=file_get_contents('jquery.dimensions.pack.js');
  $js.=file_get_contents('jquery.impromptu.js');
  $js.=file_get_contents('jquery.iutil.pack.js');
  $js.=file_get_contents('jquery.idrag.js');
  $js.=file_get_contents('jquery.grid.columnSizing.js');
  $js.=file_get_contents('jquery.tablesorter.js');
  require 'jsmin-1.1.1.php';
  $js=JSMin::minify($js);
  file_put_contents($writabledir.$name,$js);
  echo $js;
}

Better. This time, the delay is only on the first load. All subsequent loads will have an instant download of the cached file.
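Note that md5_of_dir() hashes every file in the directory, so touching an unrelated file also invalidates the cache. If that matters, one possible variant hashes only the bundled files. The name cache_key_for_files() is hypothetical, and this is a sketch rather than a drop-in replacement.

```php
<?php
// Hypothetical variant of md5_of_dir(): hash only the files in the bundle,
// so unrelated files in the directory do not invalidate the cache.
function cache_key_for_files(array $files, string $dir = './')
{
    $stamp = '';
    foreach ($files as $file) {
        $mtime = filemtime($dir . $file);
        if ($mtime === false) {
            return false; // same failure signal as md5_of_dir()
        }
        $stamp .= $mtime . $file;
    }
    return md5($stamp);
}

// Usage:
// $name = cache_key_for_files(['jquery-1.2.3.min.js', 'jquery.tablesorter.js']);
```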

However, if you’re a developer editing those files, then almost every reload changes the MD5 and triggers the minification process again – you’d never get your work done.

A solution is to check whether the current MD5 cache exists, and if it doesn’t, send the files as a bundle without minifying them. Before sending, you tag on a little bit of JavaScript which, a few seconds after the script has reached the client, requests the script again with a “minify” parameter so that the server does the minification in the background.

Here is a complete solution, along with a little cleanup which removes caches older than an hour.

$writabledir='/path/to/caches/';

function delete_old_md5s($folder) {
  $olddate=time()-3600;
  $dircontent = scandir($folder);
  foreach($dircontent as $filename) {
    if (strlen($filename)==32 && filemtime($folder.$filename) && filemtime($folder.$filename)<$olddate) unlink($folder.$filename);
  }
}

function md5_of_dir($folder) {
  $dircontent = scandir($folder);
  $ret='';
  foreach($dircontent as $filename) {
    if ($filename != '.' && $filename != '..') {
      if (filemtime($folder.$filename) === false) return false;
      $ret.=date("YmdHis", filemtime($folder.$filename)).$filename;
    }
  }
  return md5($ret);
}

header('Content-type: text/javascript');
header('Expires: '.gmdate("D, d M Y H:i:s", time() + 3600*24*365).' GMT');

$name=md5_of_dir('./');
if(file_exists($writabledir.$name))readfile($writabledir.$name);
else{
  $js=file_get_contents('jquery-1.2.3.min.js');
  $js.=file_get_contents('jquery.dimensions.pack.js');
  $js.=file_get_contents('jquery.impromptu.js');
  $js.=file_get_contents('jquery.iutil.pack.js');
  $js.=file_get_contents('jquery.idrag.js');
  $js.=file_get_contents('jquery.grid.columnSizing.js');
  $js.=file_get_contents('jquery.tablesorter.js');
  if(isset($_REQUEST['minify'])){
    require 'jsmin-1.1.1.php';
    $js=JSMin::minify($js);
    file_put_contents($writabledir.$name,$js);
    delete_old_md5s($writabledir);
    exit;
  }
  else{
    $js.="setTimeout(function(){var a=document.createElement('img');a.src='all.php?minify=1';a.style.display='none';document.body.appendChild(a);},5000);";
  }
  echo $js;
}

A pseudo-code version of the above would be:

if file named after md5 of scripts directory exists, echo it and exit.
else{
  concatenate requested scripts into one large string.
  if request URI does not have "minify" as a parameter{
    add a javascript instruction to the string to load this script again with the minify parameter in 5 seconds
    print the string and exit.
  }
  else{
    minify the string.
    save the string to a file named after the MD5
    exit without printing anything (no reason to send this to the client)
  }
}
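The same branching can be written as a small pure function, which makes it easy to check each case without touching the file system. The function name and the three action labels are hypothetical, used only for this sketch; it assumes PHP 7 or later.

```php
<?php
// Hypothetical helper that mirrors the pseudo-code as a pure decision function.
function choose_action(bool $cacheExists, bool $minifyRequested): string
{
    if ($cacheExists) {
        return 'serve_cache';
    }
    return $minifyRequested ? 'minify_and_save' : 'serve_bundle_with_trigger';
}

// Usage, with the variables from the complete solution:
// $action = choose_action(file_exists($writabledir.$name), isset($_REQUEST['minify']));
```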

6 Comments.

  1. Interesting idea. I’m going to have to play around with this and see how it goes.

  2. We’ve been doing something very similar to this (including the use of a checksum to name the cached file) in our CMS for a while. Works great. We use YUI Compressor instead of JSMin (it’s a little bit safer), and we run our CSS files through it as well.