[wp-trac] [WordPress Trac] #57627: The Cache-Control header for logged-in pages should include `private`
WordPress Trac
noreply at wordpress.org
Fri Feb 3 16:49:29 UTC 2023
#57627: The Cache-Control header for logged-in pages should include `private`
--------------------------+-----------------------------
Reporter: markdoliner | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: General | Version:
Severity: normal | Keywords:
Focuses: |
--------------------------+-----------------------------
I believe WordPress returns the following Cache-Control header for pages
that are rendered for logged-in users:
{{{
Cache-Control: no-cache, must-revalidate, max-age=0
}}}
I think the relevant code is
[https://build.trac.wordpress.org/browser/tags/6.1.1/wp-includes/class-wp.php#L424 here]
and
[https://build.trac.wordpress.org/browser/tags/6.1.1/wp-includes/functions.php#L1485 here].
For pages for logged-in users I believe this header should be modified to
include the `private` directive to indicate that the response should not
be cached by intermediary shared cache servers.
The change should not be made everywhere `nocache_headers()` is used--only
for responses that vary based on the logged-in user. And maybe also for
users who have recently left a comment (#16612 is related), though it
seems like this is hard for the server to know reliably. You could key off
the presence of one of the `comment_author_*` cookies but those aren't
always set.
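As a rough illustration of the proposal (a Python sketch, not WordPress code), the following appends `private` to the existing header value only when the request looks personalized. The helper names are hypothetical, and the cookie prefixes are an assumption about how a server might guess at personalization; as noted above, the comment cookies aren't always set, and a cookie's presence doesn't prove a valid login.

```python
from typing import Mapping

# The header value quoted in this ticket for logged-in pages.
NOCACHE_VALUE = "no-cache, must-revalidate, max-age=0"

# Assumed cookie-name prefixes; real names carry a site-specific suffix.
PERSONALIZING_COOKIE_PREFIXES = ("wordpress_logged_in_", "comment_author_")


def looks_personalized(cookies: Mapping[str, str]) -> bool:
    """Guess whether the response varies by user, based on request cookies."""
    return any(name.startswith(PERSONALIZING_COOKIE_PREFIXES) for name in cookies)


def with_private(cache_control: str, personalized: bool) -> str:
    """Append `private` to a Cache-Control value for personalized responses."""
    directives = [d.strip() for d in cache_control.split(",") if d.strip()]
    names = {d.split("=", 1)[0].lower() for d in directives}
    if personalized and "private" not in names:
        directives.append("private")
    return ", ".join(directives)
```

Responses that use `nocache_headers()` for other reasons would pass `personalized=False` and keep the current value unchanged.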
==== The Meanings of `no-cache` and `private`
You might think that `no-cache` would be sufficient to accomplish this,
but it's not. It's a bit confusing but `no-cache` means "this response may
be stored in a cache but it must be revalidated before it is used." And so
I believe that shared cache servers are allowed to cache pages rendered
for logged-in users.
I've found [https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching
MDN's caching guide] to be helpful while trying to understand the meaning
of the various directives. The Private Caches section says, "If a response
contains personalized content and you want to store the response only in
the private cache, you must specify a `private` directive." It's
reiterated in the "Do Not Share With Others" section under "Don't Cache."
And MDN's
[https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control Cache-Control header reference]
contains a similar statement.
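To make the distinction concrete, here is a deliberately simplified Python sketch of how a shared cache might read these directives. It only looks at `private`, `no-store`, and `no-cache`, and ignores everything else a real cache considers (status codes, `Authorization`, `Vary`, qualified forms such as `private="field"`, and so on). The function names are hypothetical.

```python
def directive_names(cache_control: str) -> set:
    """Return the lowercase directive names from a Cache-Control value.

    Naive parsing: splits on commas and drops any "=value" argument.
    """
    return {
        part.split("=", 1)[0].strip().lower()
        for part in cache_control.split(",")
        if part.strip()
    }


def shared_cache_may_store(cache_control: str) -> bool:
    """A shared cache must not store responses marked private or no-store."""
    return not ({"private", "no-store"} & directive_names(cache_control))


def needs_revalidation_before_reuse(cache_control: str) -> bool:
    """no-cache allows storing, but requires revalidation before reuse."""
    return "no-cache" in directive_names(cache_control)
```

Under this reading, the current header value may be stored by a shared cache as long as it is revalidated before reuse, while the same value with `private` added may not be stored by a shared cache at all.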
==== What's the Harm?
And of course the risk isn't just that the page is ''cached'' on a shared
server, but that it's served to a user other than the logged-in user.
Thankfully I think the risk is minimal for a few reasons:
1. `no-cache` means the cache will attempt to revalidate the page before
using it. I believe revalidation is not possible by default because
WordPress does not set the ETag or Last-Modified header for these
responses. This isn't a guarantee, though: someone could configure their
web server or a caching reverse proxy server to set the headers and return
HTTP 304 if appropriate. Or a plugin could do these things. The WP Super
Cache plugin even has options for "304 Browser caching" and "Enable
caching for all visitors" (even logged-in visitors). I couldn't get
it to serve a logged-in page to a non-logged-in user, so it looks like it's
clever enough to use different cached data based on the user's cookie (I
see that `Cookie` is added to the Vary header), which is great.
2. When used as caching reverse proxies Nginx and Varnish appear to not
cache responses if the Cache-Control header includes `no-cache`, so they
won't cache pages for logged-in users. For Nginx I think it's
[https://github.com/nginx/nginx/blob/dad65f3e449f215469943628f2b1f12a118fcf7e/src/http/ngx_http_upstream.c#L4811
this logic]. For Varnish I think it's
[https://github.com/varnishcache/varnish-cache/blob/582ded6a2d6ae1a4467b1eb500f2725b42888016/bin/varnishd/builtin.vcl#L212-L240
this logic]. I think they're ''allowed'' to cache these responses and it
seems possible that they will in the future, but they don't currently. And
as a counter example I believe Squid ''is'' willing to cache these
responses
([https://wiki.squid-cache.org/SquidFaq/InnerWorkings#how-come-some-objects-do-not-get-cached
this FAQ] is related but not super clear).
3. I suspect shared cache servers are uncommon (though I've made no
attempt to find data about it).
4. The number of HTTPS sites has increased greatly over time and shared
cache servers can't cache objects served over HTTPS (unless they decrypt
and re-encrypt the data, which is mostly only possible on company-managed
computers where the company is able to add its own signing certificate
to the browser trust store).
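The revalidation point in item 1 can be sketched as follows. This is an illustrative Python model, not code from any real cache: a cache can only send a conditional request if the stored response carried a validator (ETag or Last-Modified), and only a conditional request can be answered with HTTP 304. The function names are hypothetical.

```python
from typing import Dict, Mapping


def conditional_headers(stored_headers: Mapping[str, str]) -> Dict[str, str]:
    """Build conditional request headers from a stored response's validators."""
    lowered = {name.lower(): value for name, value in stored_headers.items()}
    conditional = {}
    if "etag" in lowered:
        conditional["If-None-Match"] = lowered["etag"]
    if "last-modified" in lowered:
        conditional["If-Modified-Since"] = lowered["last-modified"]
    return conditional


def can_revalidate(stored_headers: Mapping[str, str]) -> bool:
    """Without a validator there is nothing to revalidate against."""
    return bool(conditional_headers(stored_headers))
```

With only the Cache-Control header from this ticket, `can_revalidate` is false, so the cache has to fetch a full response from the origin. If a web server, proxy, or plugin adds an ETag, revalidation (and a 304 that lets the stored copy be reused) becomes possible.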
==== So Why Should We Change It?
While I think it's rare that the lack of `private` will cause harm,
WordPress is widely used and there are many ways to configure cache-
related headers. I'd guess there is a non-zero chance that this problem
has surfaced at some point in time and so I feel that it's worth changing.
The risk from adding the directive feels low to me.
I'll caveat this ticket by saying that I'm not intimately familiar with
caching behavior. I've just been looking at it a lot over the last few
days. It's entirely possible that I'm wrong about all of this.
==== Related Tickets
- #16612 proposes using `nocache_headers()` for requests with comment
cookies. That seems appropriate to me, and also using `private`.
- #21938 proposes adding `no-store` to the `nocache_headers()` list. This
is a separate consideration from the issue I'm raising above. I don't know
whether it's a good proposal. There's a lot to think about there.
- #22258, #23021, and #40444 dealt with removing Last-Modified from
`nocache_headers()`.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/57627>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform