[wp-trac] [WordPress Trac] #59516: Improve cache key generation in query classes
WordPress Trac
noreply at wordpress.org
Tue Oct 24 12:53:50 UTC 2023
#59516: Improve cache key generation in query classes
--------------------------------------+-----------------------------
Reporter: spacedmonkey | Owner: thekt12
Type: defect (bug) | Status: assigned
Priority: normal | Milestone: Future Release
Component: Query | Version:
Severity: normal | Resolution:
Keywords: has-patch has-unit-tests | Focuses: performance
--------------------------------------+-----------------------------
Comment (by thekt12):
= Findings:
== WP_Query
Inside WP_Query::get_posts(), cache key is generated using
`generate_cache_key( $q, $new_request );`; where `$q` is query args and
`$new_request` is the sql that is formed, (check code
[https://github.com/WordPress/wordpress-
develop/blob/781953641607c4d5b0743a6924af0e820fd54871/src/wp-includes
/class-wp-query.php#L3175 here])
{{{#!php
$cache_key = $this->generate_cache_key( $q, $new_request );
}}}
Setting of cache for the $cache_key happens at 3 (
[https://github.com/WordPress/wordpress-
develop/blob/781953641607c4d5b0743a6924af0e820fd54871/src/wp-includes
/class-wp-query.php#L3234 Case1], [https://github.com/WordPress/wordpress-
develop/blob/781953641607c4d5b0743a6924af0e820fd54871/src/wp-includes
/class-wp-query.php#L3267 Case2], [https://github.com/WordPress/wordpress-
develop/blob/781953641607c4d5b0743a6924af0e820fd54871/src/wp-includes
/class-wp-query.php#L3342 Case3] ) places based on different conditions,
all of which consumes the same $cache_key generated above and there is no
modification in between.
For Case1 and Case2, there is a slight modification to the output obtained
from the query based on the value set in `$q['fields']` . In current
scenario `$q['fields']=='ID'` and `$q['fields']=='ids'` can have slightly
different cache values as former doesn’t go through the following
condition which has a `array_map( 'intval', $this->posts );` (''this seems
to be a minor bug'')
In both **Case1** and **Case2** there is no filter in between key
generation and cache setting which can modify the output base on any other
values from **args/$q**.
For Case3 however there is a possibility that the logic may enter the
split_the_query
condition ( if there are less than 500 posts and we split the query to
get ids of those 500 posts and fetch them manually and prime them). The
split_the_query condition contains a filter 'posts_request_ids' that could
modify the request post key generation, this is a bug that is addressed in
([https://core.trac.wordpress.org/ticket/59661 Ticket#59661])
[https://github.com/WordPress/wordpress-develop/pull/5513/files#diff-
1d050668621cbc3028e027d15e68438f4d879dd6e80ba98f9c9f69b9fb5469d0R3181
PR#5513]. This PR introduces a new arg variable
`posts_request_ids_filters` which also affects what is cached.
To add to it we can
What can be removed?
After Case3, the variable $cache_key is not in use anywhere indicating
that the only args that must be used for cache key generation is
$q['fields'] and $q['posts_request_ids_filters'](introduced in
[https://github.com/WordPress/wordpress-develop/pull/5513/files#diff-
1d050668621cbc3028e027d15e68438f4d879dd6e80ba98f9c9f69b9fb5469d0R3181
PR#5513]).
We can thereby remove all the query forming (except fields) and after
cache decision making (except 'posts_request_ids_filters') ( which is
pretty much everything mentioned here )
With respect to Third Parties, $args can only be used to cache the same
values to different keys (which has no use IMO) or to skip most of the
logic inside get_posts() like incase of VIP Enterprise Search (link),
where a new `es` is used to decide if the query must be sent to elastic
search or not.
== WP_Term_Query
For term_queries, `$args['fields']`, `$args['object_ids']`,
`$args['pad_counts']`, `$args['hide_empty']` are the only fields that can
affect caching. All other variables can be avoided from cache key
generation. Like WP_Query there is no filter that third parties can use to
modify what is being cached after cache key generation.
== WP_Network_Query
The key is entirely based on args, however none of the args affect the
outcome post query is fired. So we could modify the code to be more like
WP_Query with a separate function for generate_key based on the request
and args or just the args. ([https://github.com/WordPress/wordpress-
develop/blob/781953641607c4d5b0743a6924af0e820fd54871/src/wp-includes
/class-wp-network-query.php#L249 code])
== WP_Site_Query
Same as WP_Network_Query. The code needs to be modified to use final query
and not args. ([https://github.com/WordPress/wordpress-
develop/blob/781953641607c4d5b0743a6924af0e820fd54871/src/wp-includes
/class-wp-site-query.php#L357 code])
== WP_Comment_Query
Same like WP_Network_Query and WP_Site_Query
([https://github.com/WordPress/wordpress-
develop/blob/781953641607c4d5b0743a6924af0e820fd54871/src/wp-includes
/class-wp-comment-query.php#L451 code])
= Proposal
**Step 1: Standardize code**
Standardize WP_Network_Query, WP_Site_Query, WP_Comment_Query to use
generate_cache_key function the same way WP_Query does.
Note: There is one performance downgrade here. The way caching is
implemented in all 3 class above it skips the request generation part
entirely if cache is present. However, the upside it the query will be
cached based on the final request making it better at handling similar
queries generated via different args setup.
For e.g Caching in WP_Network_Query skips [https://github.com/WordPress
/wordpress-develop/blob/781953641607c4d5b0743a6924af0e820fd54871/src/wp-
includes/class-wp-network-query.php#L256 $network_ids =
$this->get_network_ids();] which is logic intensive.
**Step 2: Use only those key from**
We should only pass the args (filter it before) to `generate_cache_key()`
for cache key generation that effects the way cache is done (Detailed in
the Findings section above).
For filtering have a array of filed that needs to be considered in an
array and only scrap out those args from the main args.
For e.g `$key_to_consider = array( ‘fields’ )` for WP_Query, we can
provide a filter which will allow third parties to add more keys that
should be considered, extra care must be given not to unset key from the
original `$args`.
**Step 3: Normalize and save back to main $args or (to the clone)**
Basic normalization must be done before query execution and saved to
original args->
Some examples->
‘ids’, ‘ID’, [‘ID’] from $args[ ‘fields’ ] must be converted to ‘ids’.
‘post_name’ and other fields that contains that contain single array needs
to be converted to string.
**Step 4: Move generate_cache_key to utility function as it will be more
like a repeated piece of code.**
--
Ticket URL: <https://core.trac.wordpress.org/ticket/59516#comment:10>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform
More information about the wp-trac
mailing list