[wp-trac] [WordPress Trac] #16706: Queries using "category__and" are slow on large databases

WordPress Trac wp-trac at lists.automattic.com
Mon Feb 28 22:20:52 UTC 2011


#16706: Queries using "category__and" are slow on large databases
--------------------------+-----------------------------
 Reporter:  tigertech     |      Owner:
     Type:  defect (bug)  |     Status:  new
 Priority:  normal        |  Milestone:  Awaiting Review
Component:  Database      |    Version:  3.1
 Severity:  normal        |   Keywords:
--------------------------+-----------------------------
 Summary: Using "category!__and" in query_posts() generates a dependent
 subquery in MySQL with extremely poor performance ("Using temporary, Using
 filesort") when one of the categories has a large number of posts. The
 result is so slow that WordPress appears to completely hang. Changing the
 query structure avoids the filesort and solves it.

 Details:

 We have a customer doing this as part of a theme in WordPress 3.1:

 {{{
 query_posts(array(
   "category__and" => array(1, 461),
   "posts_per_page" => 6
 ));
 }}}

 The database is fairly large. There are 45,610 posts in category 1, and
 167 posts in category 461. The resulting database query runs for so long
 that it effectively hangs (it doesn't finish within 30 minutes).

 The generated MySQL query looks like:

 {{{
 SELECT SQL_CALC_FOUND_ROWS  wp_posts.* FROM wp_posts  WHERE 1=1  AND (
 wp_posts.ID IN (
   SELECT object_id
   FROM wp_term_relationships
   WHERE term_taxonomy_id IN (1,461)
   GROUP BY object_id HAVING COUNT(object_id) = 2
 ) ) AND wp_posts.post_type = 'post' AND (wp_posts.post_status = 'publish')
 GROUP BY wp_posts.ID ORDER BY wp_posts.post_date DESC LIMIT 0, 6;
 }}}

 An "explain" on this query shows that it generates a dependent subquery
 that devolves to a filesort:

 {{{
 +----+--------------------+-----------------------+-------+------------------+------------------+---------+-------------+-------+-----------------------------------------------------------+
 | id | select_type        | table                 | type  | possible_keys
 | key              | key_len | ref         | rows  | Extra
 |
 +----+--------------------+-----------------------+-------+------------------+------------------+---------+-------------+-------+-----------------------------------------------------------+
 |  1 | PRIMARY            | wp_posts              | ref   |
 type_status_date | type_status_date | 44      | const,const |  8177 |
 Using where; Using index                                  |
 |  2 | DEPENDENT SUBQUERY | wp_term_relationships | range |
 term_taxonomy_id | term_taxonomy_id | 8       | NULL        | 25665 |
 Using where; Using index; Using temporary; Using filesort |
 +----+--------------------+-----------------------+-------+------------------+------------------+---------+-------------+-------+-----------------------------------------------------------+
 }}}

 I've tried adding indexes, optimizing, and so on to get the filesort to go
 away, but I can't get it to do so, at least not with MySQL 5.0.x.

 The filesort comes from the "GROUP BY ... HAVING". Changing the query to
 avoid those fixes this. For example, this appears to produce the same
 results:

 {{{
 SELECT SQL_CALC_FOUND_ROWS  wp_posts.* FROM wp_posts  WHERE 1=1  AND
 (
   SELECT COUNT(1)
   FROM wp_term_relationships
   WHERE term_taxonomy_id IN (1,461)
   AND object_id = wp_posts.ID
 ) = 2
 AND wp_posts.post_type = 'post' AND (wp_posts.post_status = 'publish')
 GROUP BY wp_posts.ID ORDER BY wp_posts.post_date DESC LIMIT 0, 6;
 }}}

 But it finishes in a fraction of a second and generates cleaner "explain"
 output with no filesort:

 {{{
 +----+--------------------+-----------------------+------+--------------------------+------------------+---------+----------------------+------+--------------------------+
 | id | select_type        | table                 | type | possible_keys
 | key              | key_len | ref                  | rows | Extra
 |
 +----+--------------------+-----------------------+------+--------------------------+------------------+---------+----------------------+------+--------------------------+
 |  1 | PRIMARY            | wp_posts              | ref  |
 type_status_date         | type_status_date | 44      | const,const
 | 8177 | Using where              |
 |  2 | DEPENDENT SUBQUERY | wp_term_relationships | ref  |
 PRIMARY,term_taxonomy_id | PRIMARY          | 8       |
 database.wp_posts.ID |    1 | Using where; Using index |
 +----+--------------------+-----------------------+------+--------------------------+------------------+---------+----------------------+------+--------------------------+
 }}}

 So my suggestion is that the "category!__and" query be changed to
 something like this that avoids filesorts. (I could provide a patch if
 people agree that this change is a reasonable approach.)

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/16706>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software


More information about the wp-trac mailing list