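HTTP has three headers that control browser caching: Cache-Control, Expires and Last-Modified.
Static pages additionally carry an ETag.
I. The first case: static pages served by Apache
Static pages that Apache sends to the client normally include Last-Modified and ETag. Both values are derived from the static file's metadata, namely its modification time and inode.
Below is an excerpt of the headers Apache returned to the client:
---------
Last-Modified: Fri, 26 Jan 2007 01:53:34 GMT
ETag: "3f9f640-318-cb9f8380"
---------
Search engines favour static files because these two validators let them tell whether a file has been updated.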
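To illustrate how such a validator can be produced and checked, here is a minimal sketch. It assumes the three hexadecimal fields in the ETag above are inode, size and modification time; the actual fields depend on the server's configuration, and the function names buildEtag and etagMatches are hypothetical.

```php
<?php
/**
 * Build an ETag in the "inode-size-mtime" hexadecimal style.
 * Assumption: the field choice and order are illustrative only.
 */
function buildEtag(int $inode, int $size, int $mtime): string
{
    return sprintf('"%x-%x-%x"', $inode, $size, $mtime);
}

/**
 * Return true when the client's If-None-Match value lists the current ETag,
 * meaning the cached copy is still valid and a 304 could be sent.
 * Simplification: ETags containing commas are not handled.
 */
function etagMatches(string $ifNoneMatch, string $currentEtag): bool
{
    if (trim($ifNoneMatch) === '*') {
        return true;
    }
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        // Weak validators carry a "W/" prefix; strip it before comparing.
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === $currentEtag) {
            return true;
        }
    }
    return false;
}
```

On a real file the three inputs could come from stat(); a crawler or browser would then send the value back in If-None-Match on its next visit.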
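II. Dynamic pages such as PHP
PHP output is generated on the fly, so its last-modified date cannot be inferred from the timestamp of the PHP script itself. By default, therefore, PHP sends no cache-control headers to the client. To make good use of caching you have to understand the caching mechanism; used sensibly, it reduces round trips between browser and server, cuts bandwidth, and lightens the load on the server.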
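III. What the cache-control headers mean
First, here is what these headers mean, as I understand them after testing.
Cache-Control: specifies the caching rules that requests and responses must follow. Setting Cache-Control in a request or in a response does not change how caching is handled for the other message. The request directives are no-cache, no-store, max-age, max-stale, min-fresh and only-if-cached; the response directives are public, private, no-cache, no-store, no-transform, must-revalidate, proxy-revalidate and max-age. The directives have the following meanings:
public: the response may be stored by any cache.
private: all or part of the response is intended for a single user and must not be stored by a shared cache. This lets the server mark parts of a response as meant for one user only and not valid for other users' requests.
no-cache: a stored response must not be used to satisfy a request without first being revalidated with the origin server. It does not by itself forbid storing the response.
no-store: guards against the inadvertent release of sensitive information. Sent in a request, it forbids caches from storing either the request or the response to it.
max-age: the client is willing to accept a response whose age is no greater than the specified time, in seconds.
min-fresh: the client is willing to accept a response that will remain fresh for at least the specified number of seconds.
max-stale: the client is willing to accept a response that has passed its expiration time. If max-stale is given a value, the client accepts a response that has been expired for no more than that many seconds.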
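The directive list above can be made concrete with a small parser. This is a sketch only: the function name parseCacheControl is hypothetical, and the pattern accepts a simplified token syntax rather than the full HTTP grammar.

```php
<?php
/**
 * Parse a Cache-Control header value into [directive => value] pairs.
 * Directives without a value map to true; names are lower-cased and
 * values are kept as strings.
 */
function parseCacheControl(string $headerValue): array
{
    $directives = [];
    // A token, optionally followed by =token or ="quoted string".
    $pattern = '/([A-Za-z0-9_-]+)(?:\s*=\s*(?:"([^"]*)"|([^,\s"]+)))?/';
    preg_match_all(
        $pattern,
        $headerValue,
        $matches,
        PREG_SET_ORDER | PREG_UNMATCHED_AS_NULL
    );
    foreach ($matches as $m) {
        // $m[3] is an unquoted value, $m[2] a quoted one; neither means a bare flag.
        $directives[strtolower($m[1])] = $m[3] ?? $m[2] ?? true;
    }
    return $directives;
}
```

For example, parseCacheControl('Max-Age=60, no-transform') yields ['max-age' => '60', 'no-transform' => true].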
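Usage in PHP:
Call header() before any output is sent. (If you use ob_start(), header() can be called anywhere in the script, as long as the buffer has not yet been flushed.)
header('Cache-Control: max-age=8');
max-age=8 sets a maximum lifetime of 8 seconds; after that the browser must go back to the server. The period is counted from the moment the user fetched the page, whereas Expires is an absolute time.
Expires: the absolute time at which the cached copy expires. Once that moment has passed, the browser no longer trusts its cache and requests a fresh copy from the server.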
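The difference between the relative max-age and the absolute Expires can be shown by expressing one lifetime both ways. A minimal sketch; httpDate and lifetimeHeaders are hypothetical helper names, and the current time is passed in so the result is reproducible.

```php
<?php
/** Format a Unix timestamp as an HTTP date such as "Mon, 29 Jan 2007 08:56:01 GMT". */
function httpDate(int $timestamp): string
{
    return gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

/**
 * Return two header lines that express the same lifetime:
 * a relative Cache-Control max-age and the equivalent absolute Expires.
 */
function lifetimeHeaders(int $now, int $maxAgeSeconds): array
{
    return [
        'Cache-Control: max-age=' . $maxAgeSeconds,
        'Expires: ' . httpDate($now + $maxAgeSeconds),
    ];
}
```

In a page these lines would be passed to header() one by one, with time() as $now.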
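Last-Modified: the time the document was last modified. It is useful in two ways:
1 Static files: the client sends the timestamp stored in its cache (in the If-Modified-Since request header), and Apache compares it with the file. If the file has not changed, Apache returns only headers with status code 304, which is very few bytes. (Newer versions also compare the ETag to decide whether the file has changed.)
2 Dynamic PHP files: the client sends the timestamp to compare and the modification time is checked on the server. In my tests, when the times matched only 1024 bytes were returned; I do not know why it is 1024. Even if the page your PHP script generates is very large, only 1024 bytes come back, so this saves bandwidth. The browser then displays the page from its cache, based on the modification time the server sent. Note that PHP does not answer with 304 on its own; the script has to compare If-Modified-Since and send the 304 status itself.
Note: Cache-Control and Expires still work without a Last-Modified header, but then every request that reaches the server returns the full document rather than 1024 bytes.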
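The Last-Modified comparison described above could be done in a PHP script roughly as follows. This is a sketch under assumptions: isNotModified and sendWithValidation are hypothetical names, and $renderBody stands for whatever produces the page.

```php
<?php
/**
 * Decide whether a 304 Not Modified can be sent.
 * $ifModifiedSince is the raw request header value (null if absent),
 * $lastModified the page's last-modified Unix timestamp.
 */
function isNotModified(?string $ifModifiedSince, int $lastModified): bool
{
    if ($ifModifiedSince === null) {
        return false;
    }
    $clientTime = strtotime($ifModifiedSince);
    // An unparseable date is treated as "no usable cached copy".
    return $clientTime !== false && $clientTime >= $lastModified;
}

/** Send the validator and, when the client's copy is current, an empty 304. */
function sendWithValidation(int $lastModified, callable $renderBody): void
{
    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $lastModified) . ' GMT');
    if (isNotModified($_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null, $lastModified)) {
        http_response_code(304); // headers only, no body
        return;
    }
    echo $renderBody();
}
```

Keeping the comparison in a pure function makes it easy to check without a web server; only sendWithValidation touches the response.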
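IV. How?
Static pages need no special attention. If you want finer control over the caching of static pages, Apache has several modules for that; they are not discussed here.
PHP pages:
There are two kinds.
1 Pages that rarely change, such as published news articles. Typically such a page is edited a few times right after publication and then hardly ever again. A suitable policy: on first publication send Last-Modified and set max-age to one day; after each edit update Last-Modified and adjust max-age according to the number of edits so far. This is somewhat tedious, since the edit count has to be recorded. Alternatively, estimate when the next edit is likely and use Expires to make the page expire at about that time.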
PHP code:
//header('Cache-Control: max-age=86400'); // cache for one day
header('Expires: Mon, 29 Jan 2007 08:56:01 GMT'); // explicit expiry time
header('Last-Modified: '.gmdate('D, d M Y H:i:s',$time).' GMT'); // Greenwich Mean Time; $time is the timestamp at which the article was added
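2 Pages that change often, such as a BBS or forum. These pages are updated quickly; here the main purpose of caching is to stop users from refreshing the thread list over and over and loading the database. Updates must still show up promptly, yet the cache must actually be used.
This is usually handled with Cache-Control, tuning max-age to the forum's posting rate.
header('Cache-Control: max-age=60'); // cache for one minute
header('Last-Modified: '.gmdate('D, d M Y H:i:s',$time).' GMT'); // Greenwich Mean Time; $time is the timestamp of the thread's last update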
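One way to tune max-age to the posting rate is sketched below. The policy itself is an assumption made for illustration (average gap between recent posts, clamped to bounds), as are the names maxAgeFromPosts and forumHeaders and the default bounds of 10 and 300 seconds.

```php
<?php
/**
 * Pick a max-age in seconds from recent post timestamps: the average gap
 * between posts, clamped to [$min, $max]. The bounds are assumptions.
 */
function maxAgeFromPosts(array $postTimestamps, int $min = 10, int $max = 300): int
{
    $count = count($postTimestamps);
    if ($count < 2) {
        return $max; // too little activity to measure: cache as long as allowed
    }
    $span = max($postTimestamps) - min($postTimestamps);
    $averageGap = intdiv($span, $count - 1);
    return max($min, min($max, $averageGap));
}

/**
 * Build the two header lines for a frequently changing page.
 * Expects a non-empty list of integer Unix timestamps.
 */
function forumHeaders(array $postTimestamps): array
{
    $lastUpdate = max($postTimestamps);
    return [
        'Cache-Control: max-age=' . maxAgeFromPosts($postTimestamps),
        'Last-Modified: ' . gmdate('D, d M Y H:i:s', $lastUpdate) . ' GMT',
    ];
}
```

A busy board thus gets a short lifetime and a quiet one a longer lifetime, without hard-coding either.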
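V. Extras
1 Refresh, go, and forced refresh
Browsers have Refresh and Go buttons, and some support Ctrl+F5 to force a reload. How do they differ?
Go: following a link counts as "go". It uses the cache fully: if the cached copy has a Last-Modified and has not expired, the browser does not contact the server at all, and a packet capture shows 0 bytes sent. If the cached copy has expired, the browser behaves as for an F5 refresh.
Refresh (F5): this also depends on whether the cached copy has a Last-Modified. If it does, the request ends in a 304 (or the 1024-byte response for PHP); if there is no last-modified time, the browser fetches the page from the server and receives the full document.
Forced refresh: ignores the cache completely and fetches the latest document from the server. The request carries this header:
Cache-Control: no-cache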
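On the server side, a forced refresh can be recognised from that request header. A minimal sketch, assuming an array shaped like $_SERVER; isForcedReload is a hypothetical name, and the Pragma check covers HTTP/1.0-style clients.

```php
<?php
/**
 * Return true when the request headers indicate a forced reload.
 * $server is expected to look like $_SERVER (HTTP_* keys).
 */
function isForcedReload(array $server): bool
{
    $cacheControl = strtolower($server['HTTP_CACHE_CONTROL'] ?? '');
    $pragma = strtolower(trim($server['HTTP_PRAGMA'] ?? ''));
    // Split on commas so that "no-cache" is matched as a whole directive.
    $directives = array_map('trim', explode(',', $cacheControl));
    return in_array('no-cache', $directives, true) || $pragma === 'no-cache';
}
```

A script could call isForcedReload($_SERVER) to decide whether to bypass its own server-side cache as well.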
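2 Debugging tools
A good tool for watching the traffic between browser and server is HttpWatch Pro; the current version is 4.1, which supports IE7.
There are also proxy-based capture tools that can be used for analysis, such as http debugging, which I have not used. TCP packet sniffers exist too, for example the Network Monitor bundled with Windows 2000, but it is not HTTP-specific and is harder to use.
VI. Notice
The author reserves all rights. This article may be freely read and reposted, provided the author (Ash) and the source (www.cosrc.com) are credited; commercial use is not permitted.
Below is the original English text from the HTTP specification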
Cache-Control
The general-header field "Cache-Control" is used to specify
directives that MUST be obeyed by all caches along the request/
response chain. The directives specify behavior intended to prevent
caches from adversely interfering with the request or response.
Cache directives are unidirectional in that the presence of a
directive in a request does not imply that the same directive is to
be given in the response.
Note that HTTP/1.0 caches might not implement Cache-Control and
might only implement Pragma: no-cache (see Section 3.4).
Cache directives MUST be passed through by a proxy or gateway
application, regardless of their significance to that application,
since the directives might be applicable to all recipients along the
request/response chain. It is not possible to target a directive to
a specific cache.
Fielding, et al. Expires September 10, 2009 [Page 17]
Internet-Draft HTTP/1.1, Part 6 March 2009
Cache-Control = "Cache-Control" ":" OWS Cache-Control-v
Cache-Control-v = 1#cache-directive
cache-directive = cache-request-directive
/ cache-response-directive
cache-extension = token [ "=" ( token / quoted-string ) ]
3.2.1. Request Cache-Control Directives
cache-request-directive =
"no-cache"
/ "no-store"
/ "max-age" "=" delta-seconds
/ "max-stale" [ "=" delta-seconds ]
/ "min-fresh" "=" delta-seconds
/ "no-transform"
/ "only-if-cached"
/ cache-extension
no-cache
The no-cache request directive indicates that a stored response
MUST NOT be used to satisfy the request without successful
validation on the origin server.
no-store
The no-store request directive indicates that a cache MUST NOT
store any part of either this request or any response to it. This
directive applies to both non-shared and shared caches. "MUST NOT
store" in this context means that the cache MUST NOT intentionally
store the information in non-volatile storage, and MUST make a
best-effort attempt to remove the information from volatile
storage as promptly as possible after forwarding it.
This directive is NOT a reliable or sufficient mechanism for
ensuring privacy. In particular, malicious or compromised caches
might not recognize or obey this directive, and communications
networks may be vulnerable to eavesdropping.
max-age
The max-age request directive indicates that the client is willing
to accept a response whose age is no greater than the specified
time in seconds. Unless max-stale directive is also included, the
client is not willing to accept a stale response.
max-stale
The max-stale request directive indicates that the client is
willing to accept a response that has exceeded its expiration
time. If max-stale is assigned a value, then the client is
willing to accept a response that has exceeded its expiration time
by no more than the specified number of seconds. If no value is
assigned to max-stale, then the client is willing to accept a
stale response of any age.
min-fresh
The min-fresh request directive indicates that the client is
willing to accept a response whose freshness lifetime is no less
than its current age plus the specified time in seconds. That is,
the client wants a response that will still be fresh for at least
the specified number of seconds.
no-transform
The no-transform request directive indicates that an intermediate
cache or proxy MUST NOT change the Content-Encoding, Content-Range
or Content-Type request headers, nor the request entity-body.
only-if-cached
The only-if-cached request directive indicates that the client
only wishes to return a stored response. If it receives this
directive, a cache SHOULD either respond using a stored response
that is consistent with the other constraints of the request, or
respond with a 504 (Gateway Timeout) status. If a group of caches
is being operated as a unified system with good internal
connectivity, such a request MAY be forwarded within that group of
caches.
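The request directives above can be read as a decision procedure for a cache. The following PHP sketch is an interpretation, not part of the specification: the function name storedResponseAcceptable is hypothetical, directives are assumed to be already parsed into [name => value] with true for valueless ones, and only-if-cached and no-store are left out.

```php
<?php
/**
 * Decide whether a stored response may be used without revalidation,
 * given its current age and freshness lifetime (both in seconds) and
 * the request's Cache-Control directives as [name => value].
 * max-stale => true means "any staleness is acceptable".
 */
function storedResponseAcceptable(int $age, int $lifetime, array $request): bool
{
    if (isset($request['no-cache'])) {
        return false; // must validate with the origin server first
    }
    if (isset($request['max-age']) && $age > (int) $request['max-age']) {
        return false; // older than the client is willing to accept
    }
    if (isset($request['min-fresh'])) {
        // Must remain fresh for at least min-fresh more seconds.
        return $lifetime >= $age + (int) $request['min-fresh'];
    }
    if ($age <= $lifetime) {
        return true; // still fresh
    }
    if (!isset($request['max-stale'])) {
        return false; // stale, and the client did not opt in
    }
    $maxStale = $request['max-stale'];
    return $maxStale === true || ($age - $lifetime) <= (int) $maxStale;
}
```

For instance, a response aged 90 seconds with a 60-second lifetime is rejected by default but accepted when the request carries max-stale=30.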
3.2.2. Response Cache-Control Directives
cache-response-directive =
"public"
/ "private" [ "=" DQUOTE 1#field-name DQUOTE ]
/ "no-cache" [ "=" DQUOTE 1#field-name DQUOTE ]
/ "no-store"
/ "no-transform"
/ "must-revalidate"
/ "proxy-revalidate"
/ "max-age" "=" delta-seconds
/ "s-maxage" "=" delta-seconds
/ cache-extension
public
The public response directive indicates that the response MAY be
cached, even if it would normally be non-cacheable or cacheable
only within a non-shared cache. (See also Authorization, Section
3.1 of [Part7], for additional details.)
private
The private response directive indicates that the response message
is intended for a single user and MUST NOT be stored by a shared
cache. A private (non-shared) cache MAY store the response.
If the private response directive specifies one or more field-
names, this requirement is limited to the field-values associated
with the listed response headers. That is, the specified
field-name(s) MUST NOT be stored by a shared cache, whereas the
remainder of the response message MAY be.
Note: This usage of the word private only controls where the
response may be stored, and cannot ensure the privacy of the
message content.
no-cache
The no-cache response directive indicates that the response MUST
NOT be used to satisfy a subsequent request without successful
validation on the origin server. This allows an origin server to
prevent caching even by caches that have been configured to return
stale responses.
If the no-cache response directive specifies one or more
field-names, this requirement is limited to the field-values associated
with the listed response headers. That is, the specified
field-name(s) MUST NOT be sent in the response to a subsequent request
without successful validation on the origin server. This allows
an origin server to prevent the re-use of certain header fields in
a response, while still allowing caching of the rest of the
response.
Note: Most HTTP/1.0 caches will not recognize or obey this
directive.
no-store
The no-store response directive indicates that a cache MUST NOT
store any part of either the immediate request or response. This
directive applies to both non-shared and shared caches. "MUST NOT
store" in this context means that the cache MUST NOT intentionally
store the information in non-volatile storage, and MUST make a
best-effort attempt to remove the information from volatile
storage as promptly as possible after forwarding it.
This directive is NOT a reliable or sufficient mechanism for
ensuring privacy. In particular, malicious or compromised caches
might not recognize or obey this directive, and communications
networks may be vulnerable to eavesdropping.
must-revalidate
The must-revalidate response directive indicates that once it has
become stale, the response MUST NOT be used to satisfy subsequent
requests without successful validation on the origin server.
The must-revalidate directive is necessary to support reliable
operation for certain protocol features. In all circumstances an
HTTP/1.1 cache MUST obey the must-revalidate directive; in
particular, if the cache cannot reach the origin server for any
reason, it MUST generate a 504 (Gateway Timeout) response.
Servers SHOULD send the must-revalidate directive if and only if
failure to validate a request on the entity could result in
incorrect operation, such as a silently unexecuted financial
transaction.
proxy-revalidate
The proxy-revalidate response directive has the same meaning as
the must-revalidate response directive, except that it does not
apply to non-shared caches.
max-age
The max-age response directive indicates that response is to be
considered stale after its age is greater than the specified
number of seconds.
s-maxage
The s-maxage response directive indicates that, in shared caches,
the maximum age specified by this directive overrides the maximum
age specified by either the max-age directive or the Expires
header. The s-maxage directive also implies the semantics of the
proxy-revalidate response directive.
no-transform
The no-transform response directive indicates that an intermediate
cache or proxy MUST NOT change the Content-Encoding, Content-Range
or Content-Type response headers, nor the response entity-body.
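The precedence among s-maxage, max-age and Expires described above can be sketched in PHP as follows. The function names are hypothetical, directives are assumed to be already parsed into [name => value], and sharedCacheMayStore is deliberately conservative: it refuses any private response, including one that names only specific fields.

```php
<?php
/**
 * Freshness lifetime in seconds, or null when the response carries no
 * explicit expiration information.
 * $response: Cache-Control response directives as [name => value].
 * $expires / $date: the Expires and Date header values as Unix timestamps.
 */
function freshnessLifetime(array $response, ?int $expires, int $date, bool $sharedCache): ?int
{
    if ($sharedCache && isset($response['s-maxage'])) {
        return (int) $response['s-maxage']; // overrides max-age and Expires
    }
    if (isset($response['max-age'])) {
        return (int) $response['max-age']; // overrides Expires
    }
    if ($expires !== null) {
        return max(0, $expires - $date);
    }
    return null;
}

/** A shared cache must not store a response marked no-store or private. */
function sharedCacheMayStore(array $response): bool
{
    return !isset($response['no-store']) && !isset($response['private']);
}
```

A browser cache would call freshnessLifetime with $sharedCache set to false, so s-maxage has no effect there.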
3.2.3. Cache Control Extensions
The Cache-Control header field can be extended through the use of one
or more cache-extension tokens, each with an optional value.
Informational extensions (those that do not require a change in cache
behavior) can be added without changing the semantics of other
directives. Behavioral extensions are designed to work by acting as
modifiers to the existing base of cache directives. Both the new
directive and the standard directive are supplied, such that
applications that do not understand the new directive will default to
the behavior specified by the standard directive, and those that
understand the new directive will recognize it as modifying the
requirements associated with the standard directive. In this way,
extensions to the cache-control directives can be made without
requiring changes to the base protocol.
This extension mechanism depends on an HTTP cache obeying all of the
cache-control directives defined for its native HTTP-version, obeying
certain extensions, and ignoring all directives that it does not
understand.
For example, consider a hypothetical new response directive called
"community" that acts as a modifier to the private directive. We
define this new directive to mean that, in addition to any non-shared
cache, any cache that is shared only by members of the community
named within its value may cache the response. An origin server
wishing to allow the UCI community to use an otherwise private
response in their shared cache(s) could do so by including
Cache-Control: private, community="UCI"
A cache seeing this header field will act correctly even if the cache
does not understand the community cache-extension, since it will also
see and understand the private directive and thus default to the safe
behavior.
Unrecognized cache directives MUST be ignored; it is assumed that any
cache directive likely to be unrecognized by an HTTP/1.1 cache will
be combined with standard directives (or the response's default
cacheability) such that the cache behavior will remain minimally
correct even if the cache does not understand the extension(s).
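The rule that unrecognized directives are ignored can be illustrated with the "community" example. In this PHP sketch the name splitDirectives and the constant are hypothetical, and the input is assumed to be already parsed into [name => value].

```php
<?php
/** Response directives listed in section 3.2.2. */
const KNOWN_RESPONSE_DIRECTIVES = [
    'public', 'private', 'no-cache', 'no-store', 'no-transform',
    'must-revalidate', 'proxy-revalidate', 'max-age', 's-maxage',
];

/**
 * Split parsed directives into those this cache understands and the
 * extensions it ignores.
 */
function splitDirectives(array $directives): array
{
    $known = array_intersect_key($directives, array_flip(KNOWN_RESPONSE_DIRECTIVES));
    $ignored = array_diff_key($directives, $known);
    return ['known' => $known, 'ignored' => $ignored];
}
```

Given ['private' => true, 'community' => 'UCI'], a cache that does not implement the extension still sees private among the known directives and behaves safely.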
3.3. Expires
The entity-header field "Expires" gives the date/time after which the
response is considered stale. See Section 2.3 for further discussion
of the freshness model.
The presence of an Expires field does not imply that the original
resource will change or cease to exist at, before, or after that
time.
The field-value is an absolute date and time as defined by HTTP-date
in Section 3.2.1 of [Part1]; it MUST be sent in rfc1123-date format.
Expires = "Expires" ":" OWS Expires-v
Expires-v = HTTP-date
For example
Expires: Thu, 01 Dec 1994 16:00:00 GMT
Note: if a response includes a Cache-Control field with the max-
age directive (see Section 3.2.2), that directive overrides the
Expires field. Likewise, the s-maxage directive overrides Expires
in shared caches.
HTTP/1.1 servers SHOULD NOT send Expires dates more than one year in
the future.
HTTP/1.1 clients and caches MUST treat other invalid date formats,
especially including the value "0", as in the past (i.e., "already
expired").
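The handling of invalid Expires values can be sketched in PHP as below. The function name expiresTimestamp is hypothetical, and the sketch accepts only the rfc1123-date form shown in the example; anything else is treated as already expired.

```php
<?php
/**
 * Convert an Expires header value to a Unix timestamp.
 * Anything that is not a valid rfc1123-date (including "0") maps to
 * $now - 1, i.e. it is treated as being in the past.
 */
function expiresTimestamp(string $value, int $now): int
{
    $parsed = DateTimeImmutable::createFromFormat(
        'D, d M Y H:i:s \G\M\T',
        trim($value),
        new DateTimeZone('UTC')
    );
    return $parsed === false ? $now - 1 : $parsed->getTimestamp();
}
```

Passing the current time in as $now keeps the function deterministic; a cache would supply time().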
http://www.ietf.org/internet-drafts/draft-ietf-httpbis-p6-cache-06.txt