[wp-trac] [WordPress Trac] #34239: Split queries in `get_terms()` and `wp_get_object_terms()`
WordPress Trac
noreply at wordpress.org
Sat Oct 10 04:00:58 UTC 2015
#34239: Split queries in `get_terms()` and `wp_get_object_terms()`
----------------------------+----------------------------
Reporter: boonebgorges | Owner:
Type: task (blessed) | Status: new
Priority: normal | Milestone: Future Release
Component: Taxonomy | Version:
Severity: normal | Keywords: needs-patch
Focuses: performance |
----------------------------+----------------------------
The cache strategy in `get_terms()` looks like this:
1. Query for full rows from the term/term_taxonomy tables.
2. Update the cache for the individual terms found.
3. Store the results of the query in a single cache key.
`wp_get_object_terms()` does 1 and 2, but doesn't cache its query results
at all.
This is a wasteful strategy.
* In the case of `get_terms()`, term objects can be stored dozens of times
in the cache. Each combination of arguments passed to `get_terms()`
results in a separate cache key, and each of these slots is home to full
term items. We don't have problems with invalidation because we're using
the sledgehammer of a 'last_changed' incrementor, but we are certainly
taking up more space in persistent cache storage than we need to.
* `$wpdb->get_results( "SELECT * FROM ..." )` is significantly more
expensive than `$wpdb->get_col( "SELECT term_id FROM ..." )`, both in
terms of the memory used to store results and the index strategy that
MySQL can use. Selecting full results was probably necessary when terms
could be shared between taxonomies - `term_id` wouldn't have been a unique
identifier - but we shouldn't have that problem anymore.
Let's split the query in both of these functions, so that we do the
following:
* The primary SQL query will fetch term_ids only. `$wpdb->get_col( "SELECT
term_id FROM..." )`
* We'll cache the results of that query, which will be an array of
integers, instead of an array of `stdClass` objects with a bunch of data
in them, using the existing 'last_changed' technique. (For now, we can
limit ourselves to caching only `get_terms()` queries. It'd be great to do
`wp_get_object_terms()` too, but this will probably take some more
research.)
* Do a `SELECT * FROM ... WHERE term_id IN (...)` query to get data
corresponding to uncached terms, then prime the cache for those terms.
* Fill the term objects from the cache before returning from the function.
Both `get_terms()` and `wp_get_object_terms()` support a 'fields'
parameter, which allows the selection of only a subset of fields, as an
alternative to 'all' (or 'all_with_object_id'). At some point down the
line, it'd be great to convert as many of these as possible to use the
persistent cache described above. But for now, it probably makes sense to
start with the 'all' queries only.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/34239>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform
More information about the wp-trac
mailing list