[wp-trac] [WordPress Trac] #18105: Improve JOIN efficency of tax_query

WordPress Trac wp-trac at lists.automattic.com
Thu Jul 14 04:28:59 UTC 2011


#18105: Improve JOIN efficency of tax_query
-------------------------+-----------------------------
 Reporter:  Otto42       |      Owner:
     Type:  enhancement  |     Status:  new
 Priority:  normal       |  Milestone:  Awaiting Review
Component:  Database     |    Version:
 Severity:  normal       |   Keywords:  needs-patch
-------------------------+-----------------------------
 Use of multiple items in a tax_query creates an inner join for each entry.
 This seems unnecessary.

 Example query:

 {{{
 array(
   'tax_query' => array(
     'relation' => 'OR',
     array('taxonomy' => 'tax1', 'field' => 'slug', 'terms' => 'term1'),
     array('taxonomy' => 'tax2', 'field' => 'slug', 'terms' => 'term2'),
   )
 )
 }}}

 Resulting SQL:

 {{{
 SELECT SQL_CALC_FOUND_ROWS wp_posts.* FROM wp_posts
 INNER JOIN wp_term_relationships ON (wp_posts.ID =
 wp_term_relationships.object_id)
 INNER JOIN wp_term_relationships AS tt1 ON (wp_posts.ID = tt1.object_id)
 WHERE 1=1 AND
 ...
 AND (wp_term_relationships.term_taxonomy_id IN (XXX)
   OR tt1.term_taxonomy_id IN (YYY) )
 ...
 }}}

 Each new entry in the array creates another inner join on the
 term_relationships table, along with an associated select argument.

 However, the term_relationships table is defined with these main keys:

 {{{
 CREATE TABLE $wpdb->term_taxonomy (
  term_taxonomy_id bigint(20) unsigned NOT NULL auto_increment,
  term_id bigint(20) unsigned NOT NULL default 0,
  taxonomy varchar(32) NOT NULL default '',
 ...
  PRIMARY KEY  (term_taxonomy_id),
  UNIQUE KEY term_id_taxonomy (term_id,taxonomy),
 }}}


 This basically means that the term_taxonomy_id is a unique ID for the row,
 and it's pointing to a unique relationship between some term and some
 taxonomy. Essentially, term_relationships defines what terms are in what
 taxonomies.

 Therefore, the extra inner join and selection are entirely unnecessary.
 All references to a term in a taxonomy are unique. This isn't individual
 terms we're looking for in this query, but term+taxonomy.

 An equivalent SQL statement to the one given before would be this:

 {{{
 SELECT SQL_CALC_FOUND_ROWS wp_posts.* FROM wp_posts
 INNER JOIN wp_term_relationships ON (wp_posts.ID =
 wp_term_relationships.object_id)
 WHERE 1=1 AND
 ...
 AND (wp_term_relationships.term_taxonomy_id IN (XXX, YYY) )
 ...
 }}}

 This is equivalent because both XXX and YYY point to a specific
 term+taxonomy combination, not just to a term.

 For the similar relation = AND case (instead of relation = OR), this query
 would be equivalent:


 {{{
 SELECT SQL_CALC_FOUND_ROWS wp_posts.* FROM wp_posts
 INNER JOIN wp_term_relationships ON (wp_posts.ID =
 wp_term_relationships.object_id)
 WHERE 1=1 AND
 ...
 AND (wp_term_relationships.term_taxonomy_id IN (XXX)
 AND (wp_term_relationships.term_taxonomy_id IN (YYY) )
 ...
 }}}

 We get a second statement in the where clause, but note the lack of a
 second inner join and lack of referencing it. This greatly improves
 performance.

 Only one inner join is ever actually required for any number of items in
 the tax_query, thanks to the pre-lookups being performed to find the
 proper term_relationship entries.

 Similar enhancements might be available for the meta table, however that
 is more complex and may be more difficult to improve.

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/18105>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software


More information about the wp-trac mailing list