File Caching: Query string vs Last-Modified?


Question


I was fooling around with ways of caching my website's assets and noticed most websites similar to mine use query strings to override caching (e.g.: /css/style.css?v=124942823)

Afterwards, I noticed that whenever I saved my style.css file, the last-modified headers were "updated", making the query string unnecessary.

So I wonder:

  • Why do so many websites use the "query string" method, instead of just letting the last-modified header do its work?
  • Should I unset the Last-modified header and just work with query strings? (Is there any particular advantage to this?)

Solution

TL;DR

Why do so many websites use the "query string" method, instead of just letting the last-modified header do its work?

Changing the query string changes the url, ensuring content is "fresh".

Should I unset the Last-modified header and just work with query strings?

No. Though that's almost the right answer.


There are three basic caching strategies used on the web:

  • No caching, or caching disabled
  • Using validation/conditional requests
  • Caching forever

To illustrate all three, consider the following scenario:

A user accesses a website for the first time, loads ten pages and leaves. Each page loads the same css file. For each of the above caching strategies how many requests would be made?

No caching: 10 requests

In this scenario nothing else influences the result: 10 requests for the css file result in it being sent to the client (browser) 10 times.

Advantages

  • Content always fresh
  • No effort/management required

Disadvantages

  • Least efficient, content always transferred

Validation requests: 10 requests

If Last-Modified or ETag headers are used, there will also be 10 requests. However, 9 of them will consist only of headers, with no body transferred. Clients use conditional requests to avoid re-downloading something they already have. Take for example the css file for this site (Stack Overflow).

The very first time the file is requested, the following happens:

$ curl -i http://cdn.sstatic.net/stackoverflow/all.css
HTTP/1.1 200 OK
Server: cloudflare-nginx
Date: Mon, 12 May 2014 07:38:31 GMT
Content-Type: text/css
Connection: keep-alive
Set-Cookie: __cfduid=d3fa9eddf76d614f83603a42f3e552f961399880311549; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.sstatic.net; HttpOnly
Cache-Control: public, max-age=604800
Last-Modified: Wed, 30 Apr 2014 22:09:37 GMT
ETag: "8026e7dfc064cf1:0"
Vary: Accept-Encoding
CF-Cache-Status: HIT
Expires: Mon, 19 May 2014 07:38:31 GMT
CF-RAY: 1294f50b2d6b08de-CDG
.avatar-change:hover{backgro.....Some KB of content

A subsequent request for the same url would look like this:

$ curl -i -H "If-Modified-Since:Wed, 30 Apr 2014 22:09:37 GMT" http://cdn.sstatic.net/stackoverflow/all.css
HTTP/1.1 304 Not Modified
Server: cloudflare-nginx
Date: Mon, 12 May 2014 07:40:11 GMT
Content-Type: text/css
Connection: keep-alive
Set-Cookie: __cfduid=d0cc5afd385060dd8ba26265f0ebf40f81399880411024; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.sstatic.net; HttpOnly
Cache-Control: public, max-age=604800
Last-Modified: Wed, 30 Apr 2014 22:09:37 GMT
ETag: "8026e7dfc064cf1:0"
Vary: Accept-Encoding
CF-Cache-Status: HIT
Expires: Mon, 19 May 2014 07:40:11 GMT
CF-RAY: 1294f778e75d04a3-CDG

Note there is no body, and the response is a 304 Not Modified. This is telling the client that the content it already has (in local cache) for that url is still fresh.
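The exchange above can be sketched from the server's side. This is a minimal illustration only, not how any particular server implements it: the function name `respond` and its arguments are hypothetical, and it handles only `If-Modified-Since` (real servers also compare `ETag` against `If-None-Match`).

```python
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime


def respond(last_modified: datetime, request_headers: dict, body: bytes):
    """Return (status, headers, body) for a GET, honouring If-Modified-Since.

    `last_modified` must be timezone-aware. Hypothetical helper for illustration.
    """
    # HTTP dates have one-second resolution, so drop microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": formatdate(last_modified.timestamp(), usegmt=True)}

    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            client_copy = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            client_copy = None  # unparseable date: ignore it and send the body
        if client_copy is not None:
            if client_copy.tzinfo is None:
                client_copy = client_copy.replace(tzinfo=timezone.utc)
            if last_modified <= client_copy:
                # The client's copy is current: headers only, no body.
                return 304, headers, b""

    # First request, or the file changed since the client's copy.
    return 200, headers, body
```

Calling `respond` with no `If-Modified-Since` header yields a 200 with the body; repeating the call with the returned `Last-Modified` value yields a 304 with an empty body.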

That's not to say this is the optimal scenario. Tools such as the network tab of Chrome developer tools let you see exactly how long a request takes and where that time is spent.

Because the response has no body, the response time will be much lower, as there is less data to transfer. But there is still a response, and there is still all of the overhead of connecting to the remote server.

Advantages

  • Content always fresh
  • Only one "Full" request sent
  • Nine responses are much slimmer, containing only headers
  • More efficient

Disadvantages

  • Still issues the maximum number of requests
  • Still incurs DNS lookups
  • Still needs to establish a connection to the remote server
  • Doesn't work offline
  • May require server configuration

Caching forever: 1 request

If there are no ETags, no Last-Modified header, and only an Expires header set far in the future, only the very first access to a url will result in any communication with the remote server. This is a well-known best practice for better frontend performance. In this case, for subsequent requests a client will read the content from its own cache and not communicate with the remote server at all.

This has clear performance advantages, which matter most on mobile devices, where latency can be significant (to put it mildly).
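As a rough sketch of what "far in the future" might look like in practice, the following builds the response headers for a cached-forever asset. The function name, the one-year lifetime, and the addition of a `Cache-Control: max-age` header alongside `Expires` are assumptions made for illustration; the answer itself only mentions a far-future Expires header.

```python
from datetime import datetime, timedelta, timezone
from email.utils import formatdate

ONE_YEAR = timedelta(days=365)  # assumed lifetime; pick what suits your site


def cache_forever_headers(now: datetime, lifetime: timedelta = ONE_YEAR) -> dict:
    """Headers telling clients to reuse the response without revalidating.

    `now` must be timezone-aware. No ETag or Last-Modified is emitted, so the
    client has nothing to send a conditional request with.
    """
    expires = now + lifetime
    return {
        "Cache-Control": f"public, max-age={int(lifetime.total_seconds())}",
        "Expires": formatdate(expires.timestamp(), usegmt=True),
    }
```

For example, `cache_forever_headers(datetime.now(timezone.utc))` returns a dictionary that a server or framework could merge into the response for a static asset.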

Advantages

  • Most efficient, content only transferred once

Disadvantages

  • The url must change to prevent existing visitors loading stale cached versions
  • Most effort to setup/manage
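The request counts for the ten-page scenario can be reproduced with a toy model of a browser cache. It is a simplification that assumes the css file never changes during the visit; the strategy names are labels invented for this sketch.

```python
def simulate(strategy: str, page_views: int = 10):
    """Return (requests_sent, bodies_transferred) for one css file.

    strategy is one of "no-cache", "validate", "forever" (hypothetical labels).
    """
    if strategy not in ("no-cache", "validate", "forever"):
        raise ValueError(f"unknown strategy: {strategy}")

    cached = False
    requests = bodies = 0
    for _ in range(page_views):
        if cached and strategy == "forever":
            continue  # served from the local cache, no network traffic at all
        requests += 1
        if cached and strategy == "validate":
            continue  # 304 Not Modified: headers only, no body
        bodies += 1  # 200 OK with the full body
        cached = strategy != "no-cache"
    return requests, bodies


for name in ("no-cache", "validate", "forever"):
    print(name, simulate(name))
```

The three strategies differ in two separate costs: how many round trips are made, and how many times the body crosses the network.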

Don't use query strings for cache busting

Sites use a query argument to circumvent a client's cache. When the content changes (or a new version of the site is published) the query argument is modified, and because the url has changed, a new version of that file is requested. This is less work and more convenient than renaming the file every time it changes, but it is not without its problems.

Using query strings prevents proxy caching. In the quote below, the author demonstrates that a request going browser <-> proxy cache server <-> website does not use the proxy cache:

Loading mylogo.gif?v=1.2 twice (clearing the cache in between) results in these headers:

>> GET http://stevesouders.com/mylogo.gif?v=1.2 HTTP/1.1
<< HTTP/1.0 200 OK
<< Date: Sat, 23 Aug 2008 00:19:34 GMT
<< Expires: Tue, 21 Aug 2018 00:19:34 GMT
<< X-Cache: MISS from someserver.com
<< X-Cache-Lookup: MISS from someserver.com

>> GET http://stevesouders.com/mylogo.gif?v=1.2 HTTP/1.1
<< HTTP/1.0 200 OK
<< Date: Sat, 23 Aug 2008 00:19:47 GMT
<< Expires: Tue, 21 Aug 2018 00:19:47 GMT
<< X-Cache: MISS from someserver.com
<< X-Cache-Lookup: MISS from someserver.com

Here it’s clear the second response was not served by the proxy: the caching response headers say MISS, the Date and Expires values change, and tailing the stevesouders.com access log shows two hits.

This shouldn't be taken lightly: when accessing a website physically located on the other side of the world, response times can be very slow. Getting an answer from a proxy server located along the route can mean the difference between a website being usable or not. Without the proxy, cached-forever resources make the first load of a url slow, and with validation requests the whole site will be sluggish.
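The behaviour in the quoted headers can be mimicked with a toy proxy. The policy modelled here (never store a response whose url contains a query string) is an assumption chosen to reproduce the MISS/MISS result; the class and its `fetch` callable are hypothetical and do not describe any specific proxy product.

```python
from urllib.parse import urlsplit


class ToyProxy:
    """In-memory stand-in for a caching proxy that skips query-string urls."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable(url) -> bytes, stands in for the origin
        self._store = {}
        self.origin_hits = 0

    def get(self, url: str):
        """Return (body, "HIT" or "MISS")."""
        cacheable = urlsplit(url).query == ""
        if cacheable and url in self._store:
            return self._store[url], "HIT"
        # Anything not in the store (or not cacheable) goes to the origin.
        self.origin_hits += 1
        body = self._fetch(url)
        if cacheable:
            self._store[url] = body
        return body, "MISS"
```

With this policy, requesting `mylogo.gif?v=1.2` twice reaches the origin twice, while requesting a file whose version is part of the file name reaches the origin only once.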

Instead, version the assets

The "best" solution is to version the files themselves, so that whenever the content changes, so does the url. Normally that would be automated as part of the build process.
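A build step of that kind might derive the version from the file's content, so the url changes exactly when the content does. The sketch below is one possible approach, not a prescribed one: the function name is hypothetical, and using a CRC32 checksum as the version number is an assumption made to keep the version purely numeric.

```python
import zlib
from pathlib import PurePosixPath


def versioned_url(url_path: str, content: bytes) -> str:
    """Insert a content-derived numeric version before the file extension.

    e.g. /css/style.css -> /css/style.<number>.css
    """
    version = zlib.crc32(content)  # unsigned 32-bit integer in Python 3
    path = PurePosixPath(url_path)
    return str(path.with_name(f"{path.stem}.{version}{path.suffix}"))
```

The build would then write (or alias) the file under the returned name and emit that name in the generated HTML, so unchanged files keep their url and stay cached.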

However, a reasonable compromise is to implement a rewrite rule such as:

# ------------------------------------------------------------------------------
# | Filename-based cache busting                                               |
# ------------------------------------------------------------------------------

# If you're not using a build process to manage your filename version revving,
# you might want to consider enabling the following directives to route all
# requests such as `/css/style.12345.css` to `/css/style.css`.

# To understand why this is important and a better idea than `*.css?v231`, read:
# http://stevesouders.com/blog/2008/08/23/revving-filenames-dont-use-querystring

<IfModule mod_rewrite.c>
   RewriteCond %{REQUEST_FILENAME} !-f
   RewriteRule ^(.+)\.(\d+)\.(js|css|png|jpe?g|gif)$ $1.$3 [L]
</IfModule>

In this way a request for foo.123.css is processed by the server as foo.css - this has all the advantages of using a query parameter for cache busting, but without the problem of disabling proxy caching.
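To see what the rule's pattern does to a path, the same regular expression can be tried outside Apache. This only mirrors the `RewriteRule` substitution; it does not model the `RewriteCond %{REQUEST_FILENAME} !-f` check, which applies the rewrite only when no file with the requested name exists. The function name is hypothetical.

```python
import re

# Same pattern as the RewriteRule: name, numeric version, known extension.
_VERSIONED = re.compile(r"^(.+)\.(\d+)\.(js|css|png|jpe?g|gif)$")


def strip_version(path: str) -> str:
    """Map /css/style.12345.css to /css/style.css; leave other paths alone."""
    return _VERSIONED.sub(r"\1.\3", path)
```

Paths without a numeric version segment, such as `/css/style.css`, do not match the pattern and are returned unchanged.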
