http://www.webperformanceinc.com/library/reports/moddeflate/
Summary
Enabling mod_deflate can reduce the bandwidth usage on a particular file by up to 70%, but also reduces the maximum load a server can handle and may actually reduce site performance if the site compresses large dynamic files.
Overview
This article examines the trade-off between CPU utilization and bandwidth utilization when compressing content with mod_deflate in Apache. The test uses a set of static test files in a range of sizes to simulate total page size, and measures server CPU utilization and bandwidth utilization across various traffic levels.
Methodology
The analysis is based on a series of load tests performed in our test lab. We tested five sample file sizes and ran the same test on each file twice: once without compression, and once with compression. In each case, we measured the bandwidth necessary to serve the load, CPU utilization, and hits per second. Since we are testing mod_deflate, all content is compressed on the fly and uncached. No dynamic content is used, to avoid CPU utilization by content-generation scripts; however, since no content is cached, this effectively simulates large amounts of unique dynamic content. The target web server, Apache 2.2.3 on CentOS 5.2, was rebooted between each test. For more details, see Appendix A.
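To make the quantity under test concrete, the sketch below illustrates, in isolation, the trade-off the load tests measure: CPU time spent gzip-compressing a file versus the bytes saved. This is a minimal Python illustration, not the Apache/Load Tester setup used in the tests, and the file names are hypothetical stand-ins for the test files described in Appendix A.

```python
# Minimal, stand-alone sketch (not the actual test harness) of the trade-off
# being measured: CPU time spent compressing versus bytes saved.
# The file names are hypothetical placeholders for the Appendix A test files.
import gzip
import time

TEST_FILES = ["test_10kb.html", "test_50kb.html", "test_100kb.html"]

for path in TEST_FILES:
    with open(path, "rb") as f:
        data = f.read()
    start = time.process_time()                        # CPU time, not wall-clock
    compressed = gzip.compress(data, compresslevel=6)  # level 6 = zlib default, as in Appendix A
    cpu = time.process_time() - start
    saved = 1 - len(compressed) / len(data)
    print(f"{path}: {len(data)} -> {len(compressed)} bytes "
          f"({saved:.0%} saved), {cpu * 1000:.2f} ms CPU")
```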
Analysis
Below are the results for 10KB, 50KB, and 100KB file sizes. The dotted lines are with compression turned off, and the solid lines are with compression turned on.
10KB files
The chart below shows the impact of compression on CPU utilization and bandwidth consumption while serving 10KB files. At each load level, turning on compression decreased bandwidth consumption by ~55% and increased CPU load by a factor of about 4 compared to the uncompressed equivalent.
50KB files
The chart below shows the impact of compression on CPU utilization and bandwidth consumption while serving 50KB files. At each load level, turning on compression decreased bandwidth consumption by ~65% and increased CPU load by a factor of about 10 compared to the uncompressed equivalent.
100KB files
The chart below shows the impact of compression on CPU utilization and bandwidth consumption while serving 100KB files. At this high data rate, the CPU was overwhelmed during the second ramp-up. At the first load level, turning on compression decreased bandwidth consumption by ~63% and increased CPU load by a factor of about 30 compared to the uncompressed equivalent.
Non-Linearity of CPU Usage
Since gzip is a block compression algorithm, one might expect that the amount of CPU usage versus the data rate would show a roughly linear increase. However, the measured increase in CPU utilization appears to be non-linear as file size increases; further, the slope of the curve steepens across both load and file size.
There are a number of possible factors that could influence this data. One, of course, is CPU contention and scheduling. We cannot directly compare the compression of a single large file to simultaneous compressions of many smaller files due to the operating system overhead of scheduling and switching between threads and/or processes. Such overhead would add CPU utilization as load increased. Due to Apache's design, however, such overhead is inescapable, so it must be factored into real-world performance analysis.
The gzip algorithm is also sensitive to data type and structure, so it is possible that our data was significantly more difficult to compress in the larger files. The design of the test cases should prevent that from being a factor; the book data is as uniform as possible without being repetitive across the various file sizes (see Appendix A for details). Nonetheless, this remains a possibility.
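One way to separate the algorithmic cost from the scheduling and contention effects discussed above is to time gzip on progressively larger slices of the same source text within a single process. The sketch below is such an isolated check (assuming a local copy of the source text under a hypothetical name); a roughly constant cost per kilobyte would suggest that the non-linearity seen in the load tests comes from scheduling and contention rather than from the compression algorithm itself.

```python
# Sketch: is gzip's CPU cost per kilobyte roughly constant as input size grows?
# Run in a single process to exclude the scheduling/context-switch overhead
# discussed above. "source.html" is an assumed local copy of the test text.
import gzip
import time

with open("source.html", "rb") as f:
    source = f.read()

for kb in (10, 25, 50, 75, 100):
    chunk = source[: kb * 1024]
    start = time.process_time()
    for _ in range(200):                     # repeat to get a measurable duration
        gzip.compress(chunk, compresslevel=6)
    cpu_ms = (time.process_time() - start) * 1000 / 200
    print(f"{kb:>3} KB: {cpu_ms:.3f} ms per compression, "
          f"{cpu_ms / kb * 1000:.1f} us per KB")
```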
Is It Worth It?
What are we getting for the use of these CPU resources? The gzip algorithm is known to be fast and efficient on text files, and indeed we see total bandwidth reductions of 60-70% on pure compressible content such as in our test. Clearly this has a sizable and easily quantifiable impact on the cost of the bandwidth to service the site.
Interestingly, if we measure the percentage of traffic reduction by file size, we find that Apache's zlib compression performance was not uniform; there is an apparent optimal efficiency around the 50KB file size.
While this graph is interesting, there are several possible explanations for the curve that do not directly involve the Apache compression algorithm design or performance. Since we are measuring the traffic output and not the size of the output files, the smaller files could simply be suffering from the increased overhead of many more files being compressed and the overhead of more packets being sent partially empty (i.e. less efficiently).
Another possible influence is the data itself; the gzip algorithm is sensitive to the type of data being compressed, and it may be that a section of the file is simply more easily compressed than the others. For the larger files, the sample size was smaller; we did not include the differences in bandwidth once the CPU was at 100%, because the comparison between the uncompressed bandwidth and the compressed bandwidth is no longer valid at that point.
Finally, there are the various options and settings of mod_deflate and zlib itself, which could impact the optimal file size and behavior of the compression. Thus we do not draw any conclusion from this data, other than more testing is needed to determine how the efficiency of the compression is affected by configuration across multiple types of data and file sizes.
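As a starting point for that further testing, the effect of the compression level (the knob mod_deflate exposes through its DeflateCompressionLevel directive) can be sketched in isolation. The script below is a minimal illustration, not part of the test setup, and the test file name is an assumption.

```python
# Sketch: compression ratio vs. CPU cost across the zlib compression levels
# (the same 1-9 scale exposed by mod_deflate's DeflateCompressionLevel).
# "test_50kb.html" is an assumed local test file, as in Appendix A.
import gzip
import time

with open("test_50kb.html", "rb") as f:
    data = f.read()

for level in range(1, 10):
    start = time.process_time()
    out = gzip.compress(data, compresslevel=level)
    cpu_ms = (time.process_time() - start) * 1000
    print(f"level {level}: ratio {len(out) / len(data):.2%}, {cpu_ms:.2f} ms CPU")
```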
Conclusion
The advantages of compression are lower bandwidth usage and faster data transfer for large pages, which should result in a better user experience. Steve Souders, Yahoo's Chief Performance Yahoo!, recommends compression for exactly these reasons. These results have been measured previously in a variety of tests, both casual and rigorous, and the conclusions are fairly unanimous that compression is valuable (see Appendix C). However, most of these tests look only at the bandwidth advantages of compression, not at the impact on the server, as we do in this report. Further, there is little information on how the type of file being compressed affects CPU utilization, on the impact of misconfiguration (compressing already-compressed files, or failing to compress files that could be compressed), or on how static-file compression costs can be mitigated with various caching schemes.
We recommend that dynamic websites that are already CPU-limited be cautious when enabling compression, particularly where it could affect large files. The trend in the data is clear: sites that compress large dynamic content will expend a significant amount of CPU on compression while under load. This CPU usage does not scale well and is almost certain to delay page generation, eliminating the user-experience advantages of compression while leaving only the bandwidth savings. Such a site could easily end up in a situation where the perceived performance of the site under load is considerably worse than without compression. Conversely, for a CPU-limited application where reducing bandwidth usage is a priority, more CPU power (or more servers) must be deployed to support a given load and perceived performance level. Using a hardware-based compression device, as found on some load balancers, is another option for offloading compression work.
The traditional software solution for decreasing the CPU load from compression is to cache the compressed result and serve that instead, as long as the file doesn't change. However, for modern websites, at least a portion of the content is dynamic and must be compressed on the fly if it is to be compressed at all. Thus there is a clear need to cleanly separate dynamic content from static content, to minimize the size of dynamic content as much as possible, and to be cautious when enabling mod_deflate in situations where the dynamic content is large. In a future report, we will look at how mod_cache can mitigate the performance impact of enabling compression.
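For the static portion of a site, the "compress once, cache the result, serve it while the file is unchanged" approach mentioned above can be illustrated with a short sketch. This is an illustration of the idea only, built on assumed details (an in-memory cache keyed by path and modification time); in a real deployment mod_cache, a reverse proxy, or a CDN would play this role.

```python
# Minimal sketch of "compress once, cache, and re-serve while unchanged".
# The (path, mtime) cache key and in-memory dict are illustrative assumptions;
# mod_cache or a caching proxy would provide this in a real deployment.
import gzip
import os

_cache = {}  # (path, mtime) -> gzip-compressed bytes

def gzipped(path):
    key = (path, os.path.getmtime(path))   # invalidate when the file changes
    if key not in _cache:
        with open(path, "rb") as f:
            _cache[key] = gzip.compress(f.read(), compresslevel=6)
    return _cache[key]
```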
Appendix A: Methodology
Software Versions and Settings
- Zlib version 1.2.3-3 (default compression setting of 6)
- Apache 2.2.3-11.el5_1.centos.3
- pre-fork MPM
- StartServers 5
- ServerLimit 1000
- MaxClients 1000
- MaxRequestsPerChild 10000
- KeepAlive on
- MaxKeepAliveRequests 0 (unlimited)
- KeepAliveTimeout 30
- AddOutputFilterByType DEFLATE text/xml text/css text/plain text/html application/x-javascript (toggled for the test by commenting it out)
Mod_cache was disabled throughout the test. Otherwise, the configuration was the CentOS default Apache configuration. The entire configuration is available in Appendix B.
The load testing software was Web Performance Load Tester version 3.5.6556, with the Server Agent of the same version installed on the test web server.
Hardware and OS
The target Apache web server was running on a Dell PowerEdge SC1420 with a 2.8GHz Xeon, family 15, model 4 – Pentium 4 architecture (Gallatin) with hyperthreading on, 1MB of L2 cache, 1GB of RAM, and an 800 MHz system bus. The target server was running CentOS 5.2, default server installation, updated to the current packages in the CentOS yum repositories as of 23 October 2008. The kernel was the default 2.6.18-92.1.13.el5 Linux kernel provided by the operating system.
Three of the load-generating engines were each running on a Dell Poweredge 300 with 2 x 800MHz Pentium III processors and 1GB of RAM. The fourth load-generating engine was running on a Pentium 4 2.4GHz with 2.25GB of RAM. Each engine was running the Web Performance Load Engine 3.5 Boot Disk version 3.5.6556.
The engines and server were networked via a Dell 2324 PowerConnect switch. The server was connected to a 1Gb port and the load engines were each connected to 100Mb ports.
The Load Tester GUI was running on Windows XP SP3, on a Dell Dimension DIM3000, Pentium 4 2.8GHz with 1GB of RAM. This machine was connected to a 100Mb port on a second switch. This had no impact on the results, since its bandwidth requirements are small, but is mentioned here for completeness.
Test cases
The test cases chosen were five HTML files composed primarily of text, cut using the "dd" tool to 10KB, 25KB, 50KB, 75KB, and 100KB from the 336KB source file (an HTML version of Cory Doctorow's Down and Out in the Magic Kingdom). The load engines did not uncompress or parse this file in any way. There is a delay of one second between test cases. A book was chosen because the content was relatively homogenous across the entire file without being repetitive, as repetition of large amounts of data makes compression significantly easier.
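For reference, the file preparation described above can be reproduced with a few lines of Python, equivalent in effect to truncating with dd; the source file name used here is an assumption.

```python
# Equivalent in effect to the "dd" truncation described above: slice the
# source HTML into fixed-size test files. The source file name is assumed.
SIZES_KB = [10, 25, 50, 75, 100]

with open("down_and_out_in_the_magic_kingdom.html", "rb") as src:
    book = src.read()

for kb in SIZES_KB:
    with open(f"test_{kb}kb.html", "wb") as out:
        out.write(book[: kb * 1024])  # truncate to exactly kb kilobytes
```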
These test cases are intended to simulate the compression of dynamically generated web pages that are customized on an individual basis, such that caching has little to no effect. They also represent static files that are repeatedly compressed without being cached, such as JavaScript or CSS files used on a homepage or across multiple pages, or a single static page that experiences a surge in web traffic.
We used static HTML files to avoid any CPU usage related to dynamic generation of the file, so that the impact of the compression could be clearly seen and not confused with other web server activity; a dynamically generated HTML page would add CPU usage to the load based on how difficult it was to generate. In short, these test cases represent the best-case bandwidth scenario, with 100% highly compressible content. They also represent a worst-case CPU scenario, with all content being compressed on the fly and no caching available to prevent repetitive compression. It is likely that a real-world website will have a lower CPU utilization due to some requested files not being run through the compression. This would also, however, reduce the bandwidth gains, since those files are often images or other static files that are already compressed and thus would use the same amount of bandwidth in either scenario.
The test data repository file is available for examination - the demo version of Load Tester can view the test cases, load configurations, detailed raw metrics, and the test reports. Test reports are also available in HTML; see Appendix C.
Load Configuration
- Five test cases as described above.
- Each test case is run independently, with the target web server being rebooted between each test.
- For each test case, there are two tests: a baseline without compression, and a test with compression enabled.
- Every VU (virtual user) runs the same test case, with a 1-second delay in between.
- Each VU is simulating a 5Mbps connection (e.g. cable/DSL connection).
- Each test is 8 minutes long, starting with 50 users and adding 50 users every two minutes (each new batch starts at random intervals within that minute), up to a maximum load of 200 users.
- 5 second sample period.
- The test parameters were determined through a number of preliminary tests that gauged the performance each load engine was capable of and ensured that the consumed bandwidth was not high enough to impact the test.
Test Procedure
Each test run followed these steps:
- Turn compression in the Apache configuration on or off, depending on the test
- Restart the server
- Start the server monitoring agent
- Run the load test
Appendix B: Apache Configuration
Appendix C: References
http://developer.yahoo.net/blog/archives/2007/07/high_performanc_3.html
Related Articles
http://www.w3.org/2008/06/gzip-mobile/results.php
http://www.linuxjournal.com/article/6802
Full Test Reports
File size | Uncompressed ("clean") | Compressed
10KB | clean | compressed
25KB | clean | compressed
50KB | clean | compressed
75KB | clean | compressed
100KB | clean | compressed
Feedback & Comments
Comments about this report may be posted on the company blog.
Version History
v1.0 - 1st public release (4 Dec 08)
v1.1 - email cleanup (23 Jan 09)