[wp-trac] [WordPress Trac] #24093: WP_Meta_Query is inefficient when referencing the same keys in "OR" query
WordPress Trac
noreply at wordpress.org
Thu Oct 16 01:38:59 UTC 2014
#24093: WP_Meta_Query is inefficient when referencing the same keys in "OR" query
----------------------------------+---------------------------
Reporter: sc0ttkclark | Owner: boonebgorges
Type: defect (bug) | Status: accepted
Priority: normal | Milestone: 4.1
Component: Query | Version: 3.5.1
Severity: normal | Resolution:
Keywords: meta-query has-patch | Focuses: performance
----------------------------------+---------------------------
Changes (by boonebgorges):
* keywords: meta-query needs-unit-tests needs-patch => meta-query has-
patch
* milestone: Future Release => 4.1
Comment:
[attachment:24093.5.patch] is an attempt at removing most of the
unnecessary joins from WP_Meta_Query. The basic logic is this:
* Positive operators (those that look for a match rather than non-matches
- such as `IN` and `LIKE`) that appear under the scope of an OR can share
a single table join. Negative operators (`!=`, `NOT IN`, etc) connected by
AND can share a table join as long as the clauses are looking at the same
keys. (I think there are more cases than just this where sharing a join is
possible, but I want to be conservative.)
- So: in each call to `get_sql_for_clause()`, the JOIN logic goes like
this. Does this clause have any siblings that are already doing a join and
are compatible with this clause (by the rules listed above)? If so, use
that alias. If not, create a JOIN clause, and store the alias for use by
later siblings.
So that's the first thing 24093.5.patch does. As I was writing this part
of the patch - which is largely a port of [29902] - I realized that most
of the rat's nest of `WP_Meta_Query::get_sql_for_clause()` was a result of
past developers attempting, in patchwork ways, to avoid unneccessary table
joins. (See, for example, the old 'key only' queries.) With the new
centralized logic for finding "compatible aliases", it became possible to
rewrite the SQL generation logic of `WP_Meta_Query` so that it actually
makes sense.
I've run some preliminary performance tests. I created a database with
about 25000 posts and about 50,000 pseudo-random rows in wp_postmeta. Here
are the queries I ran (never mind the nonsense words :) ):
{{{
$q1 = new WP_Query( array(
'meta_query' => array(
'relation' => 'OR',
array(
'key' => 'horse',
),
array(
'key' => 'nice',
'value' => 'boo',
),
array(
'key' => 'crhrcb',
'value' => array( 'horse', 'boo', 'nice'
),
'compare' => 'NOT IN',
),
array(
'key' => 'flim',
'value' => array( 'horse', 'boo', 'nice'
),
'compare' => 'IN',
),
),
) );
}}}
With 24093.5.patch, the query goes from 4 `INNER JOIN`s to 2 (NOT IN can't
share the join), which cuts execution time roughly in half.
It'd be great to have someone look over the logic above and the logic of
the patch to make sure I'm not missing something obvious. I'm fairly
confident that what I'm proposing here is solid, and it's definitely a big
performance win (even if we continue to explore additional avenues for
performance improvements in WP_Meta_Query).
--
Ticket URL: <https://core.trac.wordpress.org/ticket/24093#comment:34>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform
More information about the wp-trac
mailing list