scriptygoddess

28 Mar, 2003

Compressing Webpages for Fun and Profit

Posted by: Christine In: How to's

(Written by the Guest Goddess, Photo Matt. Please note: You need to have PHP on your server to do this. No PHP? Won't work.)

So your page is now totally pimped out. You have gads of content in your sidebar, you've used ScriptyGoddess know-how to have comments and extended entries pop out like magic, and you even have some entries to take up space between all the gadgets. The problem? The code on your page is now weighing in at half a meg, and you can practically hear people cry when they load your site over a modem. You start to think about what features you could take out, maybe cutting entries from the front page. But what if I told you that you can cut your page weight to a third, easily, with no work on your part whatsoever? It sounds like a pitch I might get in a lovely unsolicited email, but the secret lies in the fact that every major browser of the past five years supports transparently decompressing content on the fly. There are three ways to do it—easy, right, and weird—and we'll cover all three here. Before we even get started, you should check whether your pages are already being compressed, because if they are, it's probably best not to fix what ain't broken.

Easy

<?php ob_start("ob_gzhandler"); ?>

I hate to be anti-climactic, but that's it. Put that at the very top of the PHP-parsed page you want to compress, and you're done. The only thing to watch for is that it really does have to be at the top, or the sky will fall. Actually, before you call me Chicken Little, you'll probably just get a cryptic "headers already sent" error, but you can never be too careful. Basically, what this magical line of code does is start an output buffer that takes all your content, checks whether the client can receive compressed content, and if it can, zips up the buffer and sends it on its merry way. This can be a great technique to curb your bandwidth usage too; I've seen it save gigabytes on content-heavy sites.
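If you want to be a little defensive about it, you can check that the zlib extension is actually available before asking for compression. This is just a sketch of that idea; the extra checks around `ob_gzhandler` are my own suggestion, not something the one-liner above requires:

```php
<?php
// Defensive variant of the one-liner above (a sketch).
// ob_gzhandler needs the zlib extension, and it should not be combined
// with zlib.output_compression, so check for both before using it.
if (extension_loaded('zlib') && !ini_get('zlib.output_compression')) {
    ob_start('ob_gzhandler'); // compresses only if the client accepts gzip
} else {
    ob_start(); // plain buffering, page still works, just uncompressed
}
?>
```

Either branch is harmless: `ob_gzhandler` itself looks at the client's Accept-Encoding header and quietly sends plain content when gzip isn't welcome.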

Right

While the overhead associated with the above is minimal, if you'd like to see the benefits of compressed content on a larger scale, mod_gzip is the way to go. Mod_gzip is an Apache module that will compress files whether they are CGI scripts, pages processed by PHP, static HTML or text files, whatever it can. It is completely transparent to both the user and the client, and it supports sophisticated configuration that can be tweaked to your heart's content. However, if you don't have permission on your box to compile modules and modify httpd.conf, this option is unavailable, but don't let that stop you from bugging your host to include it, as there is really no good reason not to. It's always faster to send a smaller file. If you're interested in writing your own Apache module, studying mod_gzip is a great way to learn, as it has extremely informative debug code.
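To give you a feel for what the configuration looks like, here's a minimal httpd.conf sketch. The directive names are mod_gzip's own, but the module path and the particular include/exclude rules are illustrative; adjust them for your build and your content:

```apache
# Load the module (path varies by how it was built)
LoadModule gzip_module modules/mod_gzip.so

<IfModule mod_gzip.c>
  mod_gzip_on             Yes
  mod_gzip_dechunk        Yes
  # Compress HTML files and anything served as text/*
  mod_gzip_item_include   file  \.html$
  mod_gzip_item_include   mime  ^text/.*
  # Leave already-compressed content alone
  mod_gzip_item_exclude   mime  ^image/.*
  mod_gzip_item_exclude   file  \.pdf$
</IfModule>
```

The exclude rules matter: as the Geek Notes below mention, images and PDFs don't benefit from a second round of compression.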

Alternative

There are certain circumstances where output buffering, which by definition has to wait for everything to finish processing before it sends anything to the browser, can cause a perceivable delay with scripts that take a while to run. With mod_gzip this isn't a problem, because it streams content as it comes to it, and with PHP it doesn't have to be a problem either, because PHP offers an alternative method of compressing and sending content called zlib output compression. Instead of waiting until everything is finished, zlib output compression can take the content in chunks and send them as it goes. It's a little trickier to enable, though, because there is no good way to turn it on or off from straight PHP code, so what we're going to do here is use .htaccess to modify the php.ini configuration. Here's what you need to put in your .htaccess file:

<FilesMatch "\.(php|html?)$">
php_value zlib.output_compression 4096
</FilesMatch>

Basically, what this code says is: if the file ends in php, htm, or html, turn zlib output compression on and stream it out every 4 kilobytes. It's common to see a 2K buffer suggested on the web, but I've found the overhead with that is higher, and 4K is a nice balance. You should know that this is the slowest of the three methods, but by slow I mean it adds .003 seconds instead of .001, so it's not really that big a deal.
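If you do have access to php.ini itself, the same setting can live there instead of in .htaccess, and it then applies to every PHP page the server parses (a sketch; this is the standard ini syntax for the same `zlib.output_compression` value used above):

```ini
; php.ini equivalent of the .htaccess lines above:
; enable zlib output compression with a 4K streaming buffer
zlib.output_compression = 4096
```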

So now you have a faster site that's more fun to visit, and you're saving money on bandwidth. You can sit back now and wait for the love letters to pour in from your readers saying how much faster everything is loading. Enjoy!

Geek Notes

  • Like with so many other things, Netscape 4 really screws up gzip encoding in a lot of ways, but you can avoid 99% of its problems simply by making sure you don't gzip any linked JS or CSS files. On a more technical level, early versions of Netscape 4 try to use the browser cache to store compressed content before decompressing it, which works unless you have the browser's cache turned off, in which case it will do something crazy. Note that this behavior even varies from version to version of Netscape 4, so overall I wouldn't worry about it.
  • If you're doing things over SSL and you want to use mod_gzip as well, you have a little hacking to do.
  • PHP.net documentation on ob_gzhandler and zlib output compression (they recommend using zlib).
  • Things like images, zip files, and Florida ballots are already highly compressed so trying to compress them again might actually make them bigger. And then you have to recount.
  • Avoid compressing PDF files as well because sometimes Internet Explorer on Windows (the 900-pound gorilla) forgets to decompress them before the Acrobat plugin takes over.
  • According to the RFC, technically compressed content should be sent using transfer encoding rather than content encoding, since technically that's what is going on. One browser engine supports this, can you guess which one?
  • Internet Explorer on Mac doesn't support any sort of content compression like the methods described above, but that's okay because all of the above methods intelligently look for the HTTP header that signals the client can accept gzip encoding, and if it isn't there—like in IE Mac, handheld browsers, whatever—they just sit idly by.
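That last point is worth seeing in code: all three methods boil down to a peek at the Accept-Encoding request header before compressing anything. Here's a hand-rolled sketch of that check for illustration; the helper name is made up, and real negotiation also honors q-values (e.g. `gzip;q=0`), which this deliberately skips:

```php
<?php
// Hypothetical helper: true if an Accept-Encoding header lists gzip.
function accepts_gzip($accept_encoding)
{
    foreach (explode(',', $accept_encoding) as $token) {
        // Drop any parameters like ";q=0.8" and surrounding whitespace.
        $name = strtolower(trim(strtok($token, ';')));
        if ($name === 'gzip' || $name === 'x-gzip') {
            return true;
        }
    }
    return false;
}

// IE on Mac and most handheld browsers simply omit gzip here,
// so they get the plain, uncompressed page.
var_dump(accepts_gzip('gzip, deflate')); // bool(true)
var_dump(accepts_gzip('identity'));      // bool(false)
?>
```

In a live script you'd feed it `$_SERVER['HTTP_ACCEPT_ENCODING']`, but `ob_gzhandler`, mod_gzip, and zlib output compression all do this for you, which is why none of the methods above break older browsers.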

56 Responses to "Compressing Webpages for Fun and Profit"

1 | Mike's QuickLinks

August 15th, 2003 at 11:51 pm


Compress Your Web Pages
I just used this very simple method of compressing PHP pages to get my other blog's front page down to 18K from 80K. Your dial-up readers will love you for doing this, and you'll cut down on your bandwidth usage. Thanks to Charles (http://charles.gag

2 | /bin/true

September 23rd, 2003 at 7:28 am


Compressing web pages
Scriptygoddess.com is an amazing resource for bloggers and people working with the web. Lots of little scripts and hacks to get your job done. A while back, I tried installing mod_gzip to compress my web pages and aid the children…

3 | David

November 26th, 2004 at 5:22 am


I used the "Alternative" method and it worked fine until I realised it prevents one of my pages from displaying. The reason for this is the page uses a virtual() command to insert the output of a CGI script. So whilst the page itself has a .php extension, and thus passes through the FilesMatch filter, it falls over at the line containing the virtual() command when it is compressed.

Can anyone recommend a way to prevent this particular file from being compressed, whilst allowing everything else to be compressed? [Its location is http://www.domain.com/search/index.php if that helps.]

4 | evariste

February 8th, 2005 at 4:16 am


This is probably too late to help either of you but maybe it'll help a future person who runs across this. Greg, you probably have a virtual() call in your php. David, use include() instead of virtual(). It does the same thing, except that virtual() starts php inside an Apache module and include() invokes The Real PHP. For some reason this is better when you're using gzip output buffering. Caution: the preceding explanation is probably completely wrong technically, because I don't actually understand how this works, but it's what I've deduced.

5 | Fitzy

February 17th, 2005 at 12:24 am


Wow…. I needed to compress a JS array of 290KB on a form of mine, and I found this page, and your one line of code, and bam, compressed to 45K :)

I didn't put it at the top of the page either, just at the beginning of the bit I needed compressing from. (OO style site).

So thank you for saving me a lot of time :)

6 | Ajay DSouza

February 27th, 2005 at 8:42 am


I used the zlib compression method you posted above. It worked fine for my php files, however the html files don't get compressed :(
