Compression is a simple, effective way to save bandwidth and speed up your site. I hesitated to recommend gzip compression when I wrote about speeding up your JavaScript, because of problems in older browsers.
But it’s 2007. Most of my traffic comes from modern browsers, and quite frankly, most of my users are fairly tech-savvy. I don’t want to slow everyone else down because somebody is chugging along on IE 4.0 on Windows 95. Google and Yahoo use gzip compression. A modern browser is needed to enjoy modern web content and modern web speed — so gzip encoding it is. Here’s how to set it up.
Wait, wait, wait: Why are we doing this?
Before we start, I should explain what content encoding is. When you request a file like http://www.yahoo.com/index.html, your browser talks to a web server. The conversation goes a little like this:
1. Browser: Hey, GET me /index.html
2. Server: Ok, let me see if index.html is lying around…
3. Server: Found it! Here’s your response code (200 OK) and I’m sending the file.
4. Browser: 100KB? Ouch… waiting, waiting… ok, it’s loaded.
Of course, the actual headers and protocols are much more formal (monitor them with Live HTTP Headers if you’re so inclined).
But it worked, and you got your file.
So what’s the problem?
Well, the system works, but it’s not that efficient. 100KB is a lot of text, and frankly, HTML is redundant. Every <html>, <table> and <div> tag has a closing tag that’s almost the same. Words are repeated throughout the document. Any way you slice it, HTML (and its beefy cousin, XML) is not lean.
And what’s the plan when a file’s too big? Zip it!
If we could send a .zip file to the browser (index.html.zip) instead of plain old index.html, we’d save on bandwidth and download time. The browser could download the zipped file, extract it, and then show it to the user, who’s in a good mood because the page loaded quickly. The browser-server conversation might look like this:
1. Browser: Hey, can I GET index.html? I’ll take a compressed version if you’ve got it.
2. Server: Let me find the file… yep, it’s here. And you’ll take a compressed version? Awesome.
3. Server: Ok, I’ve found index.html (200 OK), am zipping it and sending it over.
4. Browser: Great! It’s only 10KB. I’ll unzip it and show the user.
The formula is simple: Smaller file = faster download = happy user.
Don’t believe me? The HTML portion of the Yahoo home page goes from 101KB to 15KB after compression.
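You can see the same effect on your own machine. Here is a minimal Python sketch, assuming a made-up page (a table of 500 near-identical rows) as a stand-in for real HTML. The exact numbers will differ for your pages, but repetitive markup tends to shrink dramatically.

```python
import gzip

# Hypothetical page: a table with many near-identical rows, standing in for
# real-world HTML full of repeated tags and words.
rows = "".join(
    f'<tr><td class="name">Item {i}</td><td class="price">{i}.00</td></tr>\n'
    for i in range(500)
)
html = f"<html><body><table>\n{rows}</table></body></html>".encode("utf-8")

# gzip.compress produces the same format a server sends with
# "Content-Encoding: gzip".
compressed = gzip.compress(html)

print(f"original:   {len(html)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"ratio:      {len(compressed) / len(html):.0%} of original size")
```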
The (not so) hairy details
The tricky part of this exchange is the browser and server knowing it’s ok to send a zipped file over. The agreement has two parts:
- The browser sends a header telling the server it accepts compressed content (gzip and deflate are two compression schemes):
Accept-Encoding: gzip, deflate
- The server sends a response if the content is actually compressed:
Content-Encoding: gzip
If the server doesn’t send the content-encoding response header, it means the file is not compressed (the default on many servers). The “Accept-encoding” header is just a request by the browser, not a demand. If the server doesn’t want to send back compressed content, the browser has to make do with the heavy regular version.
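To make the handshake concrete, here is a rough Python sketch of the decision a server makes. The function names (accepts_gzip, build_response) are my own for illustration, not part of any server, and the parsing is simplified: it handles the optional "q=" weight but ignores the "*" wildcard.

```python
import gzip

def accepts_gzip(accept_encoding):
    """Return True if the Accept-Encoding value lists gzip with a weight above 0."""
    for part in (accept_encoding or "").split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0  # "gzip;q=0" means "do not send gzip"
            except ValueError:
                return False
        return True
    return False

def build_response(body, accept_encoding):
    """Return (headers, payload), compressing only when the browser allows it."""
    headers = {"Content-Type": "text/html"}
    if accepts_gzip(accept_encoding):
        payload = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"  # tells the browser to unzip
    else:
        payload = body  # no Content-Encoding header: plain content
    headers["Content-Length"] = str(len(payload))
    return headers, payload
```

Calling build_response(page, "gzip, deflate") returns a gzipped payload with the Content-Encoding header set, while build_response(page, None) returns the page untouched.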
Setting up the server
The “good news” is that we can’t control the browser. It either sends the Accept-Encoding: gzip, deflate header or it doesn’t.
Our job is to configure the server so it returns zipped content if the browser can handle it, saving bandwidth for everyone (and giving us a happy user).
In Apache, enabling output compression is fairly straightforward. Add the following to your .htaccess file:
# compress all text & html:
AddOutputFilterByType DEFLATE text/html text/plain text/xml
# Or, compress certain file types by extension:
<Files *.html>
SetOutputFilter DEFLATE
</Files>
Apache actually has two compression options:
- mod_deflate is easier to set up and is standard.
- mod_gzip seems more powerful: you can pre-compress content.
Deflate is quick and works, so I use it; use mod_gzip if that floats your boat. In either case, Apache checks if the browser sent the “Accept-encoding” header and returns the compressed or regular version of the file. However, some older browsers may have trouble (more below) and there are special directives you can add to correct this.
If you can’t change your .htaccess file, you can use PHP to return compressed content. Give your HTML file a .php extension and add this code to the top:
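<?php
// Compress only if the browser said it accepts gzip (the header may be missing).
$accepts = isset($_SERVER['HTTP_ACCEPT_ENCODING']) ? $_SERVER['HTTP_ACCEPT_ENCODING'] : '';
if (strpos($accepts, 'gzip') !== false) ob_start("ob_gzhandler"); else ob_start();
?>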
We check the “Accept-Encoding” header and return a gzipped version of the file (otherwise the regular version). This is almost like building your own webserver (what fun!). But really, use Apache to compress your output if you can. You don’t want to monkey with your files.
Verify Your Compression
Once you’ve configured your server, check to make sure you’re actually serving up compressed content.
- Online: Use the online gzip test to check whether your page is compressed.
- In your browser: Use Web Developer Toolbar > Information > View Document Size (like I did for Yahoo, above) to see whether the page is compressed.
- View the headers: Use Live HTTP Headers to examine the response. Look for a line that says “Content-encoding: gzip”.
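If you would rather check from a script, a small Python sketch along these lines should do it. The helper names (fetch_content_encoding, describe) are hypothetical; the only real API used is urllib.request from the standard library, which sends the request headers you give it and does not unzip the response for you.

```python
import urllib.request

def fetch_content_encoding(url):
    """Request url while advertising gzip support; return Content-Encoding or None."""
    request = urllib.request.Request(
        url, headers={"Accept-Encoding": "gzip, deflate"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Encoding")

def describe(content_encoding):
    """Turn a Content-Encoding value into a human-readable verdict."""
    if content_encoding is None:
        return "not compressed"
    encoding = content_encoding.strip().lower()
    if encoding in ("gzip", "x-gzip", "deflate"):
        return f"compressed ({encoding})"
    return f"unknown encoding ({encoding})"
```

To use it, call print(describe(fetch_content_encoding(url))) with the address of your own page.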
Be prepared to marvel at the results. The instacalc homepage shrank from 36k to 10k, a reduction of roughly 72% in size.
Try Some Examples
I’ve set up some pages and a downloadable example:
- index.html - No explicit compression (on this server, I am using compression by default).
- index.htm - Explicitly compressed with Apache .htaccess using *.htm as a rule
- index.php - Explicitly compressed using the PHP header
Feel free to download the files, put them on your server and tweak the settings.
Caveats
As exciting as it may appear, HTTP Compression isn’t all fun and games. Here’s what to watch out for:
- Older browsers: Yes, some browsers still may have trouble with compressed content (they say they can accept it, but really they can’t). If your site absolutely must work with Netscape 1.0 on Windows 95, you may not want to use HTTP Compression. Apache mod_deflate has some rules to avoid compression for older browsers.
- Already-compressed content: Most images, music and videos are already compressed. Don’t waste time compressing them again. In fact, you probably only need to compress the “big 3” (HTML, CSS and JavaScript).
- CPU-load: Compressing content on-the-fly uses CPU time and saves bandwidth. Usually this is a great tradeoff given the speed of compression. There are ways to pre-compress static content and send over the compressed versions. This requires more configuration; even if it’s not possible, compressing output may still be a net win. Using CPU cycles for a faster user experience is well worth it, given the short attention spans on the web.
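As one possible take on the last two caveats, here is a Python sketch that pre-compresses static text files once, ahead of time, and skips anything that is already compressed. The function names and the list of types are my own assumptions, and getting your server to actually serve the resulting .gz files is separate configuration that this sketch does not cover.

```python
import gzip
import mimetypes
from pathlib import Path

# Text formats worth compressing (an illustrative list). Images, music and
# video are left out because they are already compressed.
COMPRESSIBLE_TYPES = {
    "text/html", "text/css", "text/plain", "text/xml",
    "text/javascript", "application/javascript",
}

def is_compressible(path):
    """True for text-like files that are not already gzip-encoded."""
    content_type, encoding = mimetypes.guess_type(str(path))
    return encoding is None and content_type in COMPRESSIBLE_TYPES

def precompress(root):
    """Write a .gz copy next to each compressible file under root; return the new paths."""
    written = []
    # sorted() collects the file list first, so new .gz files are not revisited.
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or not is_compressible(path):
            continue
        target = path.with_name(path.name + ".gz")
        target.write_bytes(gzip.compress(path.read_bytes()))
        written.append(target)
    return written
```

Running precompress on your document root leaves the originals in place, so browsers that don’t accept gzip can still get the regular version.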
Enabling compression is one of the fastest ways to improve your site’s performance. Go forth, set it up, and let your users enjoy the benefits.