[wp-trac] [WordPress Trac] #10576: Use subqueries for taxonomy queries

WordPress Trac wp-trac at lists.automattic.com
Mon Aug 10 22:47:05 UTC 2009


#10576: Use subqueries for taxonomy queries
----------------------------+-----------------------------------------------
 Reporter:  ryan            |       Owner:  filosofo
     Type:  task (blessed)  |      Status:  new     
 Priority:  normal          |   Milestone:  2.9     
Component:  Taxonomy        |     Version:          
 Severity:  normal          |    Keywords:          
----------------------------+-----------------------------------------------

Comment(by Denis-de-Bernardy):

 actually, that particular one would be '''much''' faster with a join:

 {{{
 SELECT $wpdb->posts.ID FROM  $wpdb->posts
 JOIN $wpdb->term_relationships
 ON $wpdb->term_relationships.object_id = $wpdb->posts.ID
 JOIN $wpdb->term_taxonomy
 ON $wpdb->term_taxonomy.term_taxonomy_id =
 $wpdb->term_relationships.term_taxonomy_id
 AND $wpdb->term_taxonomy.term_id = '$cat_id'
 AND $wpdb->term_taxonomy.taxonomy = 'category'
 WHERE $wpdb->posts.post_type = 'post'
 AND $wpdb->posts.post_status = 'draft' "
 ORDER BY $wpdb->posts.post_date DESC LIMIT $limit OFFSET $offset
 }}}

 else, you end up with a temporary table, a merge join plan, and no index
 whatsoever if the mysql optimizer misses the potential optimization.

 the same kind of query holds for tags. for a combo of one tag and one
 category, using a big join will be fastest.

 the issue is when we query for multiple terms. one with multiple tags and
 categories would become something like the following and it no longer uses
 any kind of meaningful index, because it's a tough one to for optimizers
 to get right:

 {{{
 SELECT $wpdb->posts.ID FROM  $wpdb->posts
 JOIN $wpdb->term_relationships as cat_rel
 ON cat_rel.object_id = $wpdb->posts.ID
 JOIN $wpdb->term_taxonomy as cat_tax
 ON cat_tax.term_taxonomy_id = cat_rel.term_taxonomy_id
 AND cat_tax.term_id IN ( '$cat_ids' )  AND cat_rel.taxonomy = 'category'
 JOIN $wpdb->term_relationships as tag_rel
 ON tag_rel.object_id = $wpdb->posts.ID
 JOIN $wpdb->term_taxonomy as tag_tax
 ON tag_tax.term_taxonomy_id = tag_rel.term_taxonomy_id
 AND tag_tax.term_id IN ( '$tag_ids' )  AND tag_tax.taxonomy = 'tag'
 WHERE $wpdb->posts.post_type = 'post'
 AND $wpdb->posts.post_status = 'draft' "
 GROUP BY $wpdb->posts.ID
 ORDER BY $wpdb->posts.post_date DESC LIMIT $limit OFFSET $offset
 }}}

 where:

 {{{
 $cat_ids = join("', '", array_map('intval', $cat_ids));
 $tag_ids = join("', '", array_map('intval', $tag_ids));
 }}}

 the latter is a case where, intuitively, a subquery along the lines of
 your example would probably win based on the table stats because we then
 get rid of the group by and end up with a hash join plan. but it'll
 probably be the exception.

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/10576#comment:2>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software


More information about the wp-trac mailing list