Compressing Webpages for Fun and Profit
(Written by the Guest Goddess, Photo Matt. Please note: You need to have PHP on your server to do this. No PHP? Won't work.)
So your page is now totally pimped out. You have gads of content on your sidebar, you've used ScriptyGoddess know-how to have comments and extended entries pop out like magic, and you even have some entries to take up some space between all the gadgets. The problem? The code on your page is now weighing in at half a meg and you can actually hear people cry when they load your site with a modem. You start to think about what features you could take out, maybe cutting out entries on the front page, but what if I told you that you can easily cut your page size to a third with no work on your part whatsoever? It sounds like a pitch I might get in a lovely unsolicited email.
The secret lies in the fact that every major browser of the past 5 years supports transparently decompressing content on the fly. There are three ways to do it (easy, right, and alternative) and we'll cover all three here. Before we even get started you should check whether your pages are already being compressed, because if they are, it's probably best to not fix what ain't broken.
Easy
<?php ob_start("ob_gzhandler"); ?>
I hate to be anticlimactic, but that's it. Put that line at the very top of any PHP-parsed page you want to compress and you're done. The only thing to watch for is that it really does have to be at the top, or the sky will fall. Actually, before you call me Chicken Little, you'll probably just get a cryptic "headers already sent" error, but you can never be too careful. Basically what this magical line of code does is start an output buffer which takes all your content, checks if the client can receive compressed content, and if it can, zips up the buffer and sends it on its merry way. This can be a great technique to curb your bandwidth usage too; I've seen it save gigabytes on content-heavy sites.
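To make the "checks if the client can receive compressed content" step a little more concrete, here is a rough sketch of that decision in plain PHP. It is not how ob_gzhandler is actually implemented; the function name sketch_gzip_response is made up for illustration, the Accept-Encoding check is deliberately naive, and it assumes the zlib extension is available so that gzencode() exists.

```php
<?php
// Hypothetical helper, not part of PHP: a rough illustration of the choice
// ob_gzhandler makes for you. The real handler also understands "deflate"
// and sends the response headers itself.
function sketch_gzip_response($body, $acceptEncoding)
{
    // Naive check: only compress when the client advertised gzip support.
    // (A careful version would also honor things like "gzip;q=0".)
    if (stripos($acceptEncoding, 'gzip') === false) {
        return array('headers' => array(), 'body' => $body);
    }

    return array(
        'headers' => array('Content-Encoding: gzip', 'Vary: Accept-Encoding'),
        'body'    => gzencode($body),
    );
}

// Repetitive markup, like a busy sidebar, compresses very well.
$page   = str_repeat("<p>Some repetitive sidebar markup</p>\n", 500);
$plain  = sketch_gzip_response($page, '');              // client sent no Accept-Encoding
$zipped = sketch_gzip_response($page, 'gzip, deflate'); // typical browser

echo strlen($plain['body']), " bytes sent to a client without gzip support\n";
echo strlen($zipped['body']), " bytes sent to a client with gzip support\n";
```

In a real page you would still just use the one-liner above; the sketch only shows why clients that don't ask for gzip keep getting the ordinary page.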
Right
While the overhead associated with the above is minimal, if you'd like to see the benefits of compressed content on a larger scale, mod_gzip is the way to go. Mod_gzip is an Apache module which will compress files whether they are CGI scripts, processed by PHP, static HTML or text files, whatever it can. It is completely transparent to both the user and the browser, and it supports sophisticated configuration to allow it to be tweaked to your heart's content. However, if you don't have permissions on your box to compile modules and modify httpd.conf, this option is unavailable. Don't let that stop you from bugging your host to install it, though, as there is really no good reason not to: it's always faster to send a smaller file. If you're interested in writing your own Apache module, studying mod_gzip is a great way to learn, as it has extremely informative debug code.
Alternative
There are certain circumstances where output buffering, which by definition has to wait for everything to process before it sends anything to the browser, can cause a perceivable delay in viewing scripts that take a while to run. With mod_gzip this isn't a problem because it streams content as it comes to it, and with PHP it doesn't have to be a problem either, because PHP offers an alternative method of compressing and sending content called zlib output compression. Instead of waiting until everything is finished, zlib output compression can take the content in chunks and send them as they come. It's a little trickier to enable though, because there is no good way to enable or disable it with straight PHP code, so the way we're going to do it here is to use .htaccess to override the php.ini configuration. Here's what you need to put in your .htaccess file:
<FilesMatch "\.(php|html?)$">
php_value zlib.output_compression 4096
</FilesMatch>
Basically what this code says is: if the file ends in php, htm, or html, turn zlib output compression on and stream it out every 4 kilobytes. Note that the setting only affects output that PHP itself generates, so .htm and .html files are compressed only if your server is set up to run them through PHP. It's common to see a 2K buffer suggested on the web but I've found the overhead with that is higher, and this is a nice balance. You should know that this is the slowest of the three methods, but by slow I mean it adds .003 seconds instead of .001, so it's not really that big of a deal.
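If one directory misbehaves when compressed (a page that pulls in CGI output, say), you can try switching compression back off just for that directory. The fragment below is a sketch rather than a tested recipe: it assumes PHP runs as an Apache module (otherwise php_flag and php_value are not allowed in .htaccess at all), and it repeats the same FilesMatch pattern on the assumption that this is needed for the setting to override the one inherited from the parent directory.

```apache
# .htaccess inside the one directory that should NOT be compressed.
# Assumes mod_php; uses the same pattern as the parent .htaccess so this
# setting is applied to the same files and wins over the inherited one.
<FilesMatch "\.(php|html?)$">
php_flag zlib.output_compression Off
</FilesMatch>
```

After changing it, re-check the page with a compression tester to confirm the directory is really being served uncompressed.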
So now you have a faster site that's more fun to visit, and you're saving money on bandwidth. You can sit back now and wait for the love letters to pour in from your readers saying how much faster everything is loading. Enjoy!
Geek Notes
- Like with so many other things, Netscape 4 really screws up gzip encoding in a lot of ways, but you can avoid 99% of its problems simply by making sure that you don't gzip any linked JS or CSS files and you should be alright. On a more technical level, early versions of Netscape 4 try to use the browser cache to store compressed content before decompressing it, which works unless you have your browser's cache turned off, and then it will do something crazy. Note that this behavior even varies from version to version of Netscape 4, so overall I wouldn't worry about it.
- If you're doing things over SSL and you want to use mod_gzip as well, you have a little hacking to do.
- PHP.net documentation on ob_gzhandler and zlib output compression (they recommend using zlib).
- Things like images, zip files, and Florida ballots are already highly compressed so trying to compress them again might actually make them bigger. And then you have to recount.
- Avoid compressing PDF files as well because sometimes Internet Explorer on Windows (the 900-pound gorilla) forgets to decompress them before the Acrobat plugin takes over.
- According to the RFC, technically compressed content should be sent using transfer encoding rather than content encoding, since technically that's what is going on. One browser engine supports this, can you guess which one?
- Internet Explorer on Mac doesn't support any sort of content compression like the methods described above, but that's okay because all of the above methods intelligently look for the HTTP header that signals the client can accept gzip encoding, and if it isn't there—like in IE Mac, handheld browsers, whatever—they just sit idly by.
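The note above about images and zip files is easy to see for yourself. The sketch below is only an illustration: it uses random bytes as a stand-in for an already-compressed file (both lack the repeated patterns gzip feeds on), and it assumes the zlib extension plus PHP 7 or later for random_bytes().

```php
<?php
// Assumes the zlib extension (gzencode) and PHP 7+ (random_bytes).
// Random bytes stand in for an image or zip file: like already-compressed
// data, they contain no patterns for gzip to exploit.
$text   = str_repeat("<li>entry title and some markup</li>\n", 300);
$binary = random_bytes(strlen($text));

$textZipped   = gzencode($text);
$binaryZipped = gzencode($binary);

// The markup shrinks dramatically; the random data comes out slightly
// larger than it went in because of the gzip header and block overhead.
printf("markup:      %d -> %d bytes\n", strlen($text), strlen($textZipped));
printf("random data: %d -> %d bytes\n", strlen($binary), strlen($binaryZipped));
```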
September 23rd, 2003 at 7:28 am
Scriptygoddess.com is an amazing resource for bloggers and people working with the web. Lots of little scripts and hacks to get your job done. A while back, I tried installing mod_gzip to compress my web pages and aid the children…
August 15th, 2003 at 11:51 pm
I just used this very simple method of compressing PHP pages to get my other blog's front page down to 18K from 80K. Your dial-up readers will love you for doing this, and you'll cut down on your bandwidth usage. Thanks to Charles (http://charles.gag…
July 12th, 2003 at 10:58 am
Andrea and Tim's bandwidth scare drew my attention to the concerns of page obesity. Apparently, SR kicks out at around 200k per serving. Supersize it, baby. At this rate, it's only a matter of time before the sharks circling the…
April 8th, 2003 at 4:28 pm
I gzipped my front page. It should make things load more quickly. Can anyone say that it has?…
July 3rd, 2003 at 9:34 am
scriptygoddess What a helpful read, thanks! Just got mod_gzip setup on my site, and it cut the page sizes by about 70% according to the logs. WOW! I really wish …
April 7th, 2003 at 8:09 pm
Following M.'s (anything but ordinary) lead, I have gzipped my front page in the hopes of making it load faster. If you notice a difference in load-time, could you let…
April 8th, 2003 at 2:12 pm
I changed two things. So hopefully you should see some improvements on how fast the front pages loads. I used…
April 2nd, 2003 at 4:18 pm
Photo Matt has written an article called Compressing Webpages for Fun and Profit (Posted by Christine)….
April 7th, 2003 at 11:58 am
Lisa just reminded me that I neglected to tell you all about the site changes I made this weekend. I'm actually still working on some things, but I wanted to let you know what I've done so far. The Zonkboard…
March 31st, 2003 at 9:24 pm
I added a little snip of PHP code to the beginning of all your index pages, in some cases I had…
March 31st, 2003 at 12:06 pm
Just last week I mentioned using GZip to compress my website. Now it seems that scientists are asking File Compression:…
March 30th, 2003 at 8:46 am
I've enabled compression on my .php and .html pages so that hopefully they will load a little faster and
March 31st, 2003 at 7:10 am
I've put some code on my page to speed it up. Let me know if it seems any faster. Want…
March 29th, 2003 at 6:02 pm
For all those who, whenever they load this lovely site clock how long it takes, you'll have noticed a sudden
March 29th, 2003 at 11:58 pm
Does the site load any faster for you guys? I used that little hack over at ScriptyGoddess…
March 30th, 2003 at 3:41 am
Remember how earlier this week I was praising Matt for optimizing my site. He is finally ready to share his secrets. You too can speed up your site, with one
March 29th, 2003 at 10:19 am
Totally amped to get this loaded into our websites and ((hopefully)) speed up our weblogs for those of you with
March 29th, 2003 at 5:56 am
Well, somehow I feel a little caught out: So your page is now totally pimped out. You have gads of
March 29th, 2003 at 3:26 am
One of the new 'features' of my host account is a monthly transfer limit of 2GB – f2s had no formal limit but if you sucked up excessive bandwidth you would be charged extra. Up until the old host account…
March 29th, 2003 at 12:05 am
Compress that site! Watch yourself!
March 28th, 2003 at 11:50 pm
ScriptyGoddess has provided me with the know-how to compress TooMuchSexy and make everyone's web-browsing that much faster. All it requires you to do is add one line of code and have PHP. Unfortunately it does not work in MacIE or Safari. The code is a…
March 28th, 2003 at 9:29 pm
"So your page is now totally pimped out. You have gads of content on your sidebar, you've used ScriptyGoddess know-how to have comments and extended entries pop out like magic, and you even have some entries to take up some space between all the gadget…
March 28th, 2003 at 10:17 pm
I'm gzipped now, are you? (Link via Christine and Scripty Goddess.)
March 28th, 2003 at 11:41 pm
Thanks to Photo Matt and Christine my site should be loading between 3 and 20 times faster (or so say the mod_gzip docs). All it took was a little…
March 28th, 2003 at 8:59 pm
Remember when I said that PhotoMatt implemented voodoo code on my site to make it faster? Did you notice how…
March 16th, 2004 at 5:35 pm
Ohh, I know that gzip is working; I used the port80 soft website to check. But a CGI file included in one of my pages causes the page to show up blank when I use this HTTP compression method.
March 16th, 2004 at 5:05 pm
A quick way to test whether your mod_gzip is working is via a telnet session:
$> telnet <your host> 80
GET / HTTP/1.1
Host: <your host>
Accept-Encoding: gzip
Connection: close
where you replace <your host> with your server's name. If you see a load of gibberish returned, like this:
HTTP/1.1 200 OK
Date: Tue, 16 Mar 2004 21:57:13 GMT
Server: Apache/1.3.29 (Unix) mod_gzip/1.3.26.1a PHP/4.3.4 mod_ssl/2.8.16 OpenSSL/0.9.7c
X-Powered-By: PHP/4.3.4
Connection: close
Content-Type: text/html
Content-Encoding: gzip
Content-Length: 1024
W]s£6}¿B«vv@;Íî?ªxÂ$*oÍx¥i;'F:þ]°wQm2ý§PÊø4õ°TO) ¥
Ö^°¾lúÄ=ÖGÐÞ¹^³³¬ZÐL½+5æ?¼ªÂÊx®5CO?¼ìW>ùh{æã¿ÜÛ ¶nQP²{Xèý:l#)Â
gð~ý&ÍÕuÉÚôÜÙWõëQY¾Í
then you know your gzip encoding is working
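The same check can be done without eyeballing the gibberish, by looking for the Content-Encoding header in the response. Here is a minimal sketch in PHP; the helper name response_is_gzipped is made up for illustration, and it works on the raw response text (for example, what the telnet session above prints) rather than making the request itself.

```php
<?php
// Hypothetical helper: given the raw text of an HTTP response, report
// whether the server labelled the body as gzip-encoded.
function response_is_gzipped($rawResponse)
{
    // Headers end at the first blank line; the (binary) body is ignored.
    $parts = preg_split("/\r?\n\r?\n/", $rawResponse, 2);

    foreach (preg_split("/\r?\n/", $parts[0]) as $line) {
        // Header names are case-insensitive, hence the /i flag.
        if (preg_match('/^Content-Encoding:\s*(.*)$/i', $line, $m)) {
            return stripos($m[1], 'gzip') !== false;
        }
    }

    return false; // no Content-Encoding header at all
}

// A shortened version of the response shown above.
$sample = "HTTP/1.1 200 OK\r\n"
        . "Content-Type: text/html\r\n"
        . "Content-Encoding: gzip\r\n"
        . "Content-Length: 1024\r\n"
        . "\r\n"
        . "...binary gibberish...";

var_dump(response_is_gzipped($sample));
```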
March 16th, 2004 at 7:51 am
The script is great, but it has a problem with my CGI include. Is there a way to fix this? I need that include, but I also need to compress the page; it's way too big.
September 26th, 2003 at 10:45 pm
Hi! I am trying desperately to figure out how to turn off mod_gzip in my .htaccess file. My webhost refuses to turn it off for me on his end and one of my syndication sites is receiving a parsing error because of this.
I know this article is about enabling it, but does anyone know how to disable it? Thanks!
Mandy
April 1st, 2003 at 4:23 pm
Sekimori, in addition to the bandwidth conserved that Adam mentioned you'll find that especially when communicating to clients over thin pipes mod_gzip will actually reduce server load because that'll be anywhere from 50-85% less time the server's TCP/IP subsystem has to remain open for the request. Plus if you're still concerned about it increasing load you can tweak the gzip settings to be lower than the default of 6, look for the line
gz1->level = 6
and play with it to your heart's content. My personal tests with Apachebench plus my research on the matter have made it a no-brainer to enable mod_gzip on the servers I administer.
April 1st, 2003 at 3:33 pm
Sekimori… but wouldn't that slight increase in server usage be rather offset by the dramatic DECREASE in bandwidth? Using gzip on my site (http://blog.smilezone.com/) has cut my bandwidth usage literally in half.
April 1st, 2003 at 3:28 pm
Actually, there is a good reason some hosts won't install the mod_gzip module…if it's available, people will use it, and use it a lot, resulting in heavier loads on servers. In a shared hosting environment, that's A Bad Thing. Never hurts to ask though, just know this is why we say no.
March 31st, 2003 at 5:28 pm
I used the easy method, but didn't notice a change. Probably because I'm on a high speed connection.
However, some people noticed the decrease in load time for my site, so looks like it works =D
April 1st, 2003 at 5:16 am
Hey, I'm using the 3rd "alternative" option as I don't have access to my httpd.conf and it seems a lot easier than the easy way. It works fine for everything and it all loads faster, all but one page which includes a cgi file (my guestbook). Now this page (http://www.iamsparticus.co.uk/guestbook/index.php) works fine in Opera 7 but in MSIE (any version, it seems) it fails miserably and won't load. I thought I should be able to fix it by putting a different .htaccess file with modified code in the guestbook subdirectory. However, I still get the same problem. The code I've stuck in the .htaccess file in that directory is…
<FilesMatch "\.(html?)$">
php_value zlib.output_compression 4096
</FilesMatch>>
Any ideas on how to get it so it doesn't zip this one directory?
March 29th, 2003 at 8:00 pm
BTW- I am using skins, but I did put the php line above the cookiecheck like you said…
March 29th, 2003 at 7:51 pm
I used the first option, and I don't see any difference whatsoever… I am using a T1 and my page loads at .36 seconds so of course I wouldn't care, but it takes 10 seconds to load with a 56k modem and that's… plain evil. Like you said, I put the php code first line of the page, and I used your link to check again and it still appears to take the same amount of time to load… Should I use another method or?
March 29th, 2003 at 4:02 pm
Currently the web optimization link doesn't check for gzip encoding, though it really should and might in the future. The leknor.com link should be used to see if your pages are being compressed.
Don, in your case your pages should be compressed, and your server is reporting mod_gzip is installed. I would contact your host and have them check things. If you're running your own server drop me an email and I'll take a look at it; it's probably a problem with your httpd.conf file.
March 29th, 2003 at 3:43 pm
is it possible to do this with ASP?
March 29th, 2003 at 3:54 pm
Hmm, I have the mod_gzip module installed and running…. and the weigh-in pages say that my site is not using gzip…
sigh
March 29th, 2003 at 1:46 pm
Put this above the skinning code.
March 29th, 2003 at 1:06 pm
this sounds fabulous. 'cept – i skin my site now, and have to have the cookiecheck.php file at the top. if they *both* have to be at the top, how does one juggle this?
March 29th, 2003 at 1:21 am
Thanks for posting such a useful script! I've been reading this site for a while now, and you all never cease to amaze me!
March 29th, 2003 at 2:01 am
I'm somewhat embarrassed to write a "me too!" style entry… but let me just say that when I discovered this trick on the MT forum and implemented it on my site, there were two huge collective sighs of relief: One, from me… knowing that I no longer had to agonize over the bandwidth/feature balance so regularly… and two, from my dialup visitors, who are no doubt MUCH happier to be able to zip through my blog with at least twice as much speed.
March 28th, 2003 at 11:22 pm
Thanks Christine and Matt! I was forced to resort to the 3rd option, but it appears to be working. Now I'm off to bug the folks at my hosting company to install mod_gzip.
Now about those love-letters…
March 29th, 2003 at 1:04 am
Oh – I can't take any credit for the post other than begging Matt to write it! HE gets all the credit for every word – I didn't edit it at all! MATT ROCKS!!!
March 29th, 2003 at 1:05 am
*blush*
March 28th, 2003 at 9:17 pm
Hmmm. I thought Matt had optimized me before, but I'm not any longer. I have the php line, though, but I'm not sure if it works.
March 28th, 2003 at 9:03 pm
WOW Christine. That is impressive! Thanks!
November 26th, 2004 at 5:22 am
I used the "Alternative" method and it worked fine until I realised it prevents one of my pages from displaying. The reason for this is the page uses a virtual() command to insert the output of a CGI script. So whilst the page itself has a .php extension, and thus passes through the FilesMatch filter, it falls over at the line containing the virtual() command when it is compressed.
Can anyone recommend a way to prevent this particular file from being compressed, whilst allowing everything else to be compressed? [Its location is http://www.domain.com/search/index.php if that helps.]
February 8th, 2005 at 4:16 am
This is probably too late to help either of you but maybe it'll help a future person who runs across this. Greg, you probably have a virtual() call in your php. David, use include() instead of virtual(). It does the same thing, except that virtual() starts php inside an Apache module and include() invokes The Real PHP. For some reason this is better when you're using gzip output buffering. Caution: the preceding explanation is probably completely wrong technically, because I don't actually understand how this works, but it's what I've deduced.
February 17th, 2005 at 12:24 am
Wow…. I needed to compress a JS array of 290KB on a form of mine, and I found this page, and your one line of code, and bam, compressed to 45K.
I didn't put it at the top of the page either, just at the beginning of the bit I needed compressed (OO style site).
So thank you for saving me a lot of time.
February 27th, 2005 at 8:42 am
I used the zlib compression method you posted above. It worked fine for my PHP files; however, the HTML files don't get compressed.