[wp-trac] [WordPress Trac] #29201: File versioning should not use query strings, but rename the filename to allow caching
WordPress Trac
noreply at wordpress.org
Fri Jun 24 18:15:00 UTC 2016
#29201: File versioning should not use query strings, but rename the filename to
allow caching
---------------------------+--------------------------
Reporter: benoitchantre | Owner:
Type: enhancement | Status: closed
Priority: normal | Milestone:
Component: Script Loader | Version: 3.9.1
Severity: normal | Resolution: wontfix
Keywords: | Focuses: performance
---------------------------+--------------------------
Comment (by drzraf):
> I'm not sure what problem you're trying to solve.
- remove a workaround
- simplify my software stack
- remove corner cases/bugs
Query strings appended to static files may cause (non-exhaustive list):
- the MIME type to be wrong
- the resource to be uncacheable
- too many cached copies of the same resource
- inconsistencies in log and stats files (same file, multiple URLs)
- ...
> > Refreshing downstream proxies is as easy as issuing a "Cache-Control"
or "Pragma", and the webserver as well as the client:
> > If a resource changes, just trigger
> > {{{
> > wp_remote_get($post_url, [ 'blocking' => false, 'headers' => array
('Cache-Control' => 'no-cache') ]);
> > }}}
> > * fire a GET with `Cache-Control: no-cache`, or `Pragma: no-cache` (or
both if you want) to said resource and forget (you can be sure the proxy
will kindly refresh its cache)
>
> How does this help? There's no way to know what proxies there are
between some arbitrary visitor and your site. In any case, browsers will
still have the old version cached.
First, please note that changing a resource's location is '''not''' cache
invalidation.
But you're right, and I was wrong in the above post: this will '''not'''
refresh '''downstream''' proxies.
It will refresh '''server-side''' proxies, i.e. any proxy between the
public address and the internal webserver IP (where most reverse proxies
sit).
Although for clarity it could have been written:
`Cache-Control: max-age=0, must-revalidate`
Indeed, you're mostly right here: it does not force a refresh of
downstream proxies or user-agent caches.
Only the user side (= the HTML webpage) can do that. We may want to split
the two discussions (downstream proxies / reverse proxies) to avoid
confusion.
About refreshing the user-agent cache, there are alternatives (worth
noting that we are working around a buggy, server-imposed `Expires`
header):
For example [https://developer.mozilla.org/fr/docs/Web/API/Location/reload
window.location.reload(true)]
which is likely equivalent to sending an XHR with `Cache-Control:
max-age=0` for every enqueued asset.
As JavaScript inlined in the HTML, it will refresh the UA cache and
intermediary proxies.
This is better than a query string '''but''' it raises a very interesting
question:
''When is it right to trigger the cache-refresh routine (= how does the
WordPress application, when asked to generate HTML, know whether the
user agent holds an old version of a static file or not)?''
> To restate what's been mentioned above, a pretty common setup is for the
webserver to issue long expiry times (let's say 1 year).
That's the moot point, and it needs to be pinned down before we can
really move forward:
- how common is it?
- does it represent the majority of cache-enabled webservers?
- which OSes, distributions and hosting services are known to ship such a
setup?
It must be added that we '''do have''' control over assets (including
caching options) if we want it: it's just a matter of RewriteRules and
[https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.3
Cache-Control headers]:
> If a response includes both an Expires header and a max-age directive,
the max-age directive overrides the Expires header.
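To make the precedence rule quoted above concrete, here is a rough sketch
in plain PHP of how a cache could derive a freshness lifetime from
response headers. It is not WordPress code: `freshness_lifetime()` is a
made-up name, and it ignores heuristics, `s-maxage`, `no-cache` and the
like.

```php
<?php
// Hypothetical helper (not part of WordPress): how many seconds a cache may
// treat a response as fresh, applying the RFC 2616 section 14.9.3 rule that
// max-age overrides Expires.
function freshness_lifetime(array $headers): ?int
{
    // Header names are case-insensitive, so normalize the keys first.
    $headers = array_change_key_case($headers, CASE_LOWER);

    // A max-age directive wins whenever it is present.
    if (isset($headers['cache-control'])
        && preg_match('/(?:^|[\s,])max-age=(\d+)/i', $headers['cache-control'], $m)) {
        return (int) $m[1];
    }

    // Otherwise fall back to Expires minus Date.
    if (isset($headers['expires'], $headers['date'])) {
        $expires = strtotime($headers['expires']);
        $date    = strtotime($headers['date']);
        if ($expires !== false && $date !== false) {
            return max(0, $expires - $date);
        }
    }

    return null; // No explicit freshness information.
}
```

So an application that sends its own `Cache-Control: max-age=...` (for
example via a RewriteRule or a header directive) should not be bound by a
one-year `Expires` that the webserver adds on its own.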
If buggy webservers are not that common, or if they do not represent the
majority of cache-enabled webservers, it would be fair to assume the
workaround (and the performance cost it implies) should fall on them
rather than on well-behaved proxies. (As a bonus, it gives an incentive to
implement caching correctly.)
> There's no way for the server to clear unknown proxy caches.
That's right, *but* they are cached because WordPress asked for it in the
first place, and WordPress decides this using
[https://tools.ietf.org/html/rfc2616#section-14.9 headers].
Of course, in some setups the webserver may try to bypass the application,
but with RewriteRules it's easy to regain full control.
> There's no way for the server to clear the browser cache. Instead, the
server tells the browser to request a new resource (by altering this query
string parameter).
That's rather the point of caching, and an argument for not playing with
`Cache-*` or `Expires` HTTP headers.
> Plugins can, of course, change the core behavior. For example, I know
some sites remove the `?ver=` parameter and replace it with a query string
parameter that corresponds to the `mtime` of the file. This mtime method
is more robust but is hard to implement correctly on sites served by many
webservers. It also may be non-performant on many hosts. Core's `?ver=`
method is good compromise that works pretty well most places.
Are `Expires +1 year` setups so frequent that this compromise needs to be
in WP core?
Wouldn't it be easy for a plugin to introduce one of the various
workarounds for this specific problem?
I bet `?ver=` is unnecessary for 100% of non-cached WP instances, and for
80% of cached WP instances because their (Apache?) webserver is
configured correctly.
> > What/Who may configure *by mistake* WP to set a too-large expire time?
>
> I don't think this long cache expiry choice is a mistake.
See [https://tools.ietf.org/html/rfc2616#section-14.21 this]:
> To mark a response as "never expires," an origin server sends an Expires
date approximately one year from the time the response is sent. HTTP/1.1
servers SHOULD NOT send Expires dates more than one year in the future.
I'm pretty sure +1y asset caching is mostly a mistake, but that depends on
how the "Uniform Resource Locator" definition and the related RFCs are
interpreted.
The RFC 2616 wording does not imply widespread use of such a caching
policy.
Paraphrased, it tells the UA: ''Assume that the webpage will point you to
the newer resource.''
(People who cache the WP front page for a few minutes or hours break the
assumption that the HTML page is the (only) way to refresh the assets.)
It's all about visitor patterns and how a website's assets transition
between versions (and also about whether the HTML output itself is cached
or not).
The query-string method is a way to keep the webpage and its assets in
sync in a cache-enabled context, and thus avoid this kind of question:
- do we accept old CSS for a NEW page?
- do we accept new CSS for an OLD page?
On the one hand it implies that `jquery.js?ver=1.2.3` will be universally
OK, but on the other hand the non-suffixed `jquery.js` will be
inconsistent (depending on my place in the network I would be given a
different resource).
The logical implication of a +1y `Expires` for WP would be to put the
version explicitly inside the filename, e.g. `jquery-1.2.3.js`, rather
than using a query string.
But please leave that to "long-expires" webservers (or to those, like
[https://developer.yahoo.com/performance/rules.html#expires Yahoo!]'s,
that are ready to deal with the side effects it induces).
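
For illustration only, here is a minimal sketch in plain PHP of what moving
the version from the query string into the filename could look like. The
function name `version_in_filename()` is hypothetical and not a WordPress
API. It drops any user/password part of the URL, and it assumes the files
are really renamed on disk or that a rewrite rule maps the versioned name
back to the real file.

```php
<?php
// Hypothetical helper (not a WordPress API): turn "jquery.js?ver=1.2.3"
// into "jquery-1.2.3.js". URLs without a usable "ver" are returned as-is.
function version_in_filename(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['path']) || empty($parts['query'])) {
        return $url;
    }

    parse_str($parts['query'], $args);
    if (!isset($args['ver']) || !is_string($args['ver'])) {
        return $url;
    }

    // Keep only characters that are safe inside a file name.
    $ver = preg_replace('/[^A-Za-z0-9._-]/', '', $args['ver']);
    if ($ver === '') {
        return $url;
    }

    // Insert "-<ver>" just before the last extension of the path.
    $path = preg_replace('/\.([A-Za-z0-9]+)$/', '-' . $ver . '.$1', $parts['path'], 1, $count);
    if ($count === 0) {
        return $url; // No extension to anchor on.
    }

    // Rebuild the URL, keeping any other query arguments.
    unset($args['ver']);
    $out = '';
    if (isset($parts['scheme'])) {
        $out .= $parts['scheme'] . ':';
    }
    if (isset($parts['host'])) {
        $out .= '//' . $parts['host'];
        if (isset($parts['port'])) {
            $out .= ':' . $parts['port'];
        }
    }
    $out .= $path;
    if ($args) {
        $out .= '?' . http_build_query($args);
    }
    if (isset($parts['fragment'])) {
        $out .= '#' . $parts['fragment'];
    }
    return $out;
}
```

Each release then gets its own URL, so a long `Expires` never serves a
stale file, and no query string is involved.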
--
Ticket URL: <https://core.trac.wordpress.org/ticket/29201#comment:13>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform