[wp-trac] Re: [WordPress Trac] #7415: "using filesort" in default install

WordPress Trac wp-trac at lists.automattic.com
Mon Jan 19 16:19:12 GMT 2009


#7415: "using filesort" in default install
------------------------------------------------------------+---------------
 Reporter:  dbuser123                                       |        Owner:  anonymous
     Type:  defect (bug)                                    |       Status:  new      
 Priority:  high                                            |    Milestone:  2.8      
Component:  Optimization                                    |      Version:           
 Severity:  major                                           |   Resolution:           
 Keywords:  database, optimization, slow queries, filesort  |  
------------------------------------------------------------+---------------
Comment (by Denis-de-Bernardy):

 correct me if I'm wrong, but... on the post meta ordering, the query plan
 would almost certainly be an index scan on ID followed by a quick sort on
 the tiny number of rows returned. it's no big deal to run a quick sort at
 that stage -- but I still agree that it was not necessary either.

 On this: "SELECT * FROM wp_posts WHERE (post_type = 'page' AND post_status
 = 'publish') ORDER BY menu_order, post_title ASC Has to be ordered
 manually again because there's no appropriate index." I believe your
 assumption is wrong.

 the true constraint here is post_type = page and post_status = publish,
 and the goal is to fetch them all, sorted. adding an "appropriate" index
 to (menu_order, post_title), as you seem suggest, would be completely
 pointless unless you limit the results to a tiny number of items. (then,
 and only then, would the optimizer identify that a nested loop over the
 index would be appropriate.)

 you do raise a point in that there is no appropriate index though. the
 wp_posts table has:

 KEY `type_status_date` (`post_type`,`post_status`,`post_date`,`ID`)

 but that one is probably so large, and the db stats are so, that the
 optimizer ignores it and prefers a seq scan for this particular query. it
 could, and probably should, be replaced by:

 KEY `status_type_date` (`post_status`,`post_type`,`post_date`)
 KEY `status_type_menu` (`post_status`,`post_type`,`menu_order`)

 that way, the optimizer would at least consider the first for posts, and
 the second for pages. this smaller index probably gets the job done too:

 KEY `type_status_menu` (`post_status`, `post_type`)

 I'm not 100% on the MySQL internals. But PostgreSQL would almost certainly
 prefer the smaller index, and do a top-N search in the resulting set where
 necessary, if the db stats show that using an index at all makes any
 sense.

 Speaking of db stats, it seems to me that using an index only makes sense
 when it lets you go straight at the a small subset of data. If you need to
 sort 9,900 out of 10,000 entries that match post_type = post and
 post_status = published, the optimizer will prefer a seq scan, load the
 entire mess in memory, and manually sort it no matter what. It'll only
 consider the index once you've added several thousands of post revisions,
 or if you're retrieving a small subset of rows using a LIMIT clause.

 As a complete aside, the optimization using the union introduces a unique
 and an extra sort operator in the query plan, and it probably isn't an
 option due to the complicated stuff in the WP_Query class. the same
 optimization can probably be achieved, however, by using IN ('publish',
 'private') rather than the OR statement.

-- 
Ticket URL: <http://trac.wordpress.org/ticket/7415#comment:9>
WordPress Trac <http://trac.wordpress.org/>
WordPress blogging software


More information about the wp-trac mailing list