[wp-trac] [WordPress Trac] #18105: Improve JOIN efficency of tax_query

WordPress Trac noreply at wordpress.org
Wed Oct 15 15:19:47 UTC 2014


#18105: Improve JOIN efficency of tax_query
-------------------------+---------------------------
 Reporter:  Otto42       |       Owner:  boonebgorges
     Type:  enhancement  |      Status:  accepted
 Priority:  normal       |   Milestone:  4.1
Component:  Query        |     Version:
 Severity:  normal       |  Resolution:
 Keywords:               |     Focuses:
-------------------------+---------------------------
Changes (by boonebgorges):

 * keywords:  needs-patch =>
 * owner:   => boonebgorges
 * status:  new => accepted
 * component:  Database => Query
 * milestone:  Awaiting Review => 4.1


Comment:

 Resurrecting this one. marcus.downing is correct that we can't do it in
 all places, but otto42 is right that we can make improvements. Here's a
 summary of the situation in WP_Tax_Query:

 * Operators 'NOT IN', 'AND', 'EXISTS', and 'NOT EXISTS' are transformed
 into subqueries, so the question of table joins is moot.
 * That leaves us with operator 'IN'. There are two cases:
     * IN clauses joined by OR. In these cases, only one JOIN is required
 (this is otto42's original situation)
     * IN clauses joined by AND. These require a separate JOIN. (FWIW, the
 situation is switched for NOT IN: AND can share a join, while OR cannot.
 But, as noted above, we're not using joins for NOT IN.)

 So: when IN clauses are joined by OR, we should avoid extraneous JOINs.
 There are two general strategies. One is to handle this early: `IN (3,4,5)
 OR IN (6,7,8)` is logically equivalent to `IN (3,4,5,6,7,8)`, so we could
 detect the params passed to `WP_Tax_Query` are not optimal, and then
 combine them. This ends up being harder than it sounds, because a tax
 query can include clauses with different 'fields', which can only be
 combined after transforming them all into `term_taxonomy_ids`. This is
 inefficent in some cases.

 The other strategy is to do the following when building the SQL for an
 individual clause:
 - Before creating a JOIN, check to see whether any sibling clauses have
 already created a compatible JOIN. ("Sibling" is a first-order clause
 under the scope of the same relation. "Compatible" means that it can share
 a join - both are IN, and they're connected by OR.) If so, no need to join
 again - use the sibling's table alias.
 - If no compatible sibling is found, then create a JOIN. Store the table
 alias, so that later compatible siblings will find it.

 A patch is incoming that does all of these things. It's abstracted just a
 little more than it has to be, because we're going to be able to do
 something very similar (and with greater effect!) in `WP_Meta_Query` -
 but, as zuzya notes, meta queries are more complicated in this regard. See
 #24093.

--
Ticket URL: <https://core.trac.wordpress.org/ticket/18105#comment:8>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform


More information about the wp-trac mailing list