[wp-trac] [WordPress Trac] #37189: In wp_term_query on cache ids
WordPress Trac
noreply at wordpress.org
Tue Jun 20 17:50:06 UTC 2017
#37189: In wp_term_query on cache ids
--------------------------+--------------------------
Reporter: spacedmonkey | Owner:
Type: enhancement | Status: new
Priority: normal | Milestone: 4.9
Component: Taxonomy | Version: 4.6
Severity: normal | Resolution:
Keywords: has-patch | Focuses: performance
--------------------------+--------------------------
Changes (by boonebgorges):
* keywords: has-patch needs-unit-tests => has-patch
* milestone: Future Release => 4.9
Comment:
Thanks for continuing to work on this, @spacedmonkey.
Your patch generated a couple of PHPUnit failures for me (outside of the
taxonomy component). As I started digging into why, it became clear to me
that our efforts would be better spent doing the right thing and simply
splitting the query.
Specifically: The fact that you switched to `SELECT t.*, t.**` for most
queries got me concerned that we'd cause memory issues for people using
the `fields` parameter for bulk operations. Yet, because your patch worked
within the existing parameters of term-query caching, there was no easy
way to work around this problem - no way to actually utilize the single-
term cache during `get_terms()` queries. The possibility of additional
overhead in the main `SELECT` query only seems worth the trade-off if we
also offer a more efficient caching layer for single terms. The way we
handle this elsewhere is by doing a `SELECT {id}` query and then filling
the objects from the cache.
[attachment:37189.3.patch] is a first pass at making this happen. It
allows us to simplify the internals of `WP_Term_Query::query()`, but
requires some other fixes. Some notes:
* Your technique for building keys is a bit redundant. Once we've built
the `request`, we have a key that's guaranteed to be unique. So let's use
it, and do away with the `default_args` parsing.
* I've written the patch in such a way that `fields=ids` queries generally
do not require the second query to populate the term objects. The
exception is in the case of hierarchical queries, where the tree must be
descended after the initial query takes place. So this is a fairly big
performance win for most `fields=ids` queries.
* The patch uncovered a pretty deep bug (actually, two related bugs) in
`_split_shared_term()` that was agonizing to figure out. Briefly, the way
that taxonomy hierarchy caches were being regenerated
(`{$taxonomy}_children`) was not consistent: certain child terms were not
properly triggering taxonomy cache clears, and the calls to
`clean_term_cache()` in the context of a `foreach()` loop was causing race
conditions in cases where `get_terms()` descends the hierarchy tree. The
fixes in `_split_shared_term()` should go in, no matter what happens with
the rest of the patch.
* As part of these fixes, I broke the taxonomy-cache busting functionality
into a new `clean_taxonomy_cache()` function.
* On an installation without a persistent cache (or with an empty cache),
term queries now generate two queries instead of one. This requires a
reworking of certain unit tests that count `num_queries`. I've refactored
most of them so that they're not counting the queries specifically, but
instead check for data invalidation (which is the important part anyway).
A few tests - those that rely on examining a `last_query` that no longer
points to the right thing - had to be removed.
I'd welcome a careful review from @spacedmonkey, @flixos90 or anyone else
with some experience in how term caching has historically differed from
caching for posts, comments, etc.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/37189#comment:6>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform
More information about the wp-trac
mailing list