[wp-trac] Re: [WordPress Trac] #9167: query_posts('meta_key=foo') returns duplicate posts

WordPress Trac wp-trac at lists.automattic.com
Thu Feb 19 14:32:19 GMT 2009


#9167: query_posts('meta_key=foo') returns duplicate posts
--------------------------+-------------------------------------------------
 Reporter:  scribu        |       Owner:  anonymous
     Type:  defect (bug)  |      Status:  new      
 Priority:  normal        |   Milestone:  2.7.2    
Component:  General       |     Version:           
 Severity:  normal        |    Keywords:           
--------------------------+-------------------------------------------------

Comment(by filosofo):

 Replying to [comment:6 scribu]:
 > If you write ''SELECT DISTINCT(*)'', no, it won't; but if you write
 ''SELECT DISTINCT(wp_posts.ID), wp_posts.post_author, etc.'' it will work.
 The only problem is that you couldn't ''SELECT wp_posts.*'' anymore.

 That's not exactly correct.  It doesn't matter what wp_posts fields you
 select, because when the post ID is the same they're always going to be
 the same.  What matters---what causes the duplication---is that WP
 currently selects the wp_postmeta.meta_value field, which presumably is
 different for each meta_key that is the same.  So the current query
 results look like this (omitting some columns):

 {{{
 +-----+-------------------------+-------------------+
 | ID  | post_title              | meta_value        |
 +-----+-------------------------+-------------------+
 | 1   | A Post                  | meta value x      |
 | 1   | A Post                  | meta value y      |
 | 2   | Another Post            | meta value y      |
 +-----+-------------------------+-------------------+
 }}}

 Here each ''row'' is distinct, because post 1 has two entries for the same
 meta_key but different {{{meta_value}}}s.  To make the rows distinct
 without duplicating posts, you would have to not select the meta_value.
 Not selecting the meta_value probably isn't very helpful for most of the
 situations in which you would be querying by meta_key.

 > I think a subquery also has a performance penalty. Nothing you can do
 about that.

 It's a matter of relative harm.  To make DISTINCT work in the case we're
 talking about, MySQL has to determine whether an entire row is distinct,
 which with all the columns selected in a Loop query is going to be more
 work than just a simple sub-query.  In my admittedly unscientific tests it
 seems to be significantly faster.

 > By the way, I see that subqueries are already used in WP 2.7.1 (in
 query.php):

 Conditionally, if someone is using MySQL 4.1 or newer.  And how many
 aren't?  MySQL 4.1 was released in 2004.  Time to require it.

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/9167#comment:7>
WordPress Trac <http://trac.wordpress.org/>
WordPress blogging software


More information about the wp-trac mailing list