[wp-trac] [WordPress Trac] #59106: Add secondary index wp_posts table to improve media queries performance

WordPress Trac noreply at wordpress.org
Tue Aug 15 11:49:34 UTC 2023


#59106: Add secondary index wp_posts table to improve media queries performance
-------------------------+-----------------------------
 Reporter:  ovidiul      |      Owner:  (none)
     Type:  enhancement  |     Status:  new
 Priority:  normal       |  Milestone:  Awaiting Review
Component:  Database     |    Version:  trunk
 Severity:  normal       |   Keywords:
  Focuses:               |
-------------------------+-----------------------------
 I've been recently been involved in migrating a large site database to
 MySQL 8.0.

 Upon testing the site, we've noticed that the MySQL 8 queries related to
 the Media library seem to take much longer than on MariaDB.

 Debugging the issue, it seems that MySQL 8, due to its Cost optimiser
 https://dev.mysql.com/doc/refman/8.0/en/cost-model.html, seems to prefer
 to drop the type_status_date index and to a full table scan to retrieve
 the results.

 Doing a FORCE INDEX(type_status_date) on the media query, like



 {{{
 SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
 FROM wp_posts
 FORCE INDEX(type_status_date)
 WHERE 1=1
 AND wp_posts.post_type = 'attachment'
 AND ((wp_posts.post_status = 'inherit'
 OR wp_posts.post_status = 'private'))
 ORDER BY wp_posts.post_date DESC
 LIMIT 0, 20
 }}}



 seems to improve the response time.

 However, a second alternative seems also to show better results,
 specifically targeting only the post_type, post_date and ID, something
 like

 {{{
 create index idx_type_date on wp_posts(post_type, post_date, ID);
 }}}

 Running the explain analyze on the queries with each index, we seem to get
 a better cost number when the new index is used, 142 vs  744, which would
 indicate a better performance at the MySQL level, this being confirmed in
 our tests on a large database

 {{{
 EXPLAIN ANALYZE SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
 FROM wp_posts
 FORCE INDEX(type_status_date)
 WHERE 1=1
 AND wp_posts.post_type = 'attachment'
 AND ((wp_posts.post_status = 'inherit'
 OR wp_posts.post_status = 'private'))
 ORDER BY wp_posts.post_date DESC
 LIMIT 0, 20

 -> Limit: 20 row(s) (no early end due to SQL_CALC_FOUND_ROWS)  (cost=744
 rows=20) (actual time=12..13.1 rows=20 loops=1)
     -> Sort: wp_posts.post_date DESC  (cost=744 rows=3620) (actual
 time=12..12.8 rows=7407 loops=1)
         -> Filter: ((wp_posts.post_type = 'attachment') and
 ((wp_posts.post_status = 'inherit') or (wp_posts.post_status =
 'private')))  (cost=744 rows=3620) (actual time=0.0505..8.37 rows=7407
 loops=1)
             -> Index range scan on wp_posts using type_status_date over
 (post_type = 'attachment' AND post_status = 'inherit') OR (post_type =
 'attachment' AND post_status = 'private')  (cost=744 rows=3620) (actual
 time=0.0459..5.63 rows=7407 loops=1)
 }}}

 {{{
 EXPLAIN ANALYZE SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
 FROM wp_posts
 FORCE INDEX(idx_type_date)
 WHERE 1=1
 AND wp_posts.post_type = 'attachment'
 AND ((wp_posts.post_status = 'inherit'
 OR wp_posts.post_status = 'private'))
 ORDER BY wp_posts.post_date DESC
 LIMIT 0, 20

 -> Limit: 20 row(s) (no early end due to SQL_CALC_FOUND_ROWS)  (cost=142
 rows=20) (actual time=0.214..15.3 rows=20 loops=1)
     -> Filter: ((wp_posts.post_status = 'inherit') or
 (wp_posts.post_status = 'private'))  (cost=142 rows=688) (actual
 time=0.211..15 rows=7407 loops=1)
         -> Index lookup on wp_posts using idx_type_date
 (post_type='attachment') (reverse)  (cost=142 rows=3619) (actual
 time=0.207..13.4 rows=7407 loops=1)
 }}}

 To replicate this, we've installed latest WordPress version and MySQL 8,
 and create the following post_status count structure:

 {{{
 mysql> select post_status, count(*) from wp_posts group by post_status;
 +-------------+----------+
 | post_status | count(*) |
 +-------------+----------+
 | inherit     |     7407 |
 | draft       |        1 |
 | publish     |       23 |
 | auto-draft  |        1 |
 +-------------+----------+
 4 rows in set (0.01 sec)
 }}}

 Can this be considered as a core improvement to the wp_posts table that
 would directly benefit the Media related queries?

 Basically, if MySQL 8 decides that the new index is faster, specially for
 Media search queries, it will simply pick that on up as opposed to using
 the original index or dropping it fully if it considers its much more
 efficient to do a full table scan.

 From our database, a search query for term `test` shows an improvement of
 0.5 seconds of this basic search media query:


 {{{
 MySQL []> SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts FORCE
 INDEX(type_status_date) WHERE 1=1 AND wp_posts.post_type = 'attachment'
 AND ((wp_posts.post_status = 'inherit' OR
 wp_posts.post_status = 'private')) and post_title like "%test%" ORDER BY
 wp_posts.post_date DESC LIMIT 0, 1;
 +---------+
 | ID      |
 +---------+
 | xxxxxxx |
 +---------+
 1 row in set, 1 warning (3.867 sec)

 MySQL []> SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts FORCE
 INDEX(idx_type_date) WHERE 1=1 AND wp_posts.post_type = 'attachment' AND
 ((wp_posts.post_status = 'inherit' OR wppp_posts.post_status = 'private'))
 and post_title like "%test%" ORDER BY wp_posts.post_date DESC LIMIT 0, 1;
 +---------+
 | ID      |
 +---------+
 | xxxxxxx |
 +---------+
 1 row in set, 1 warning (3.244 sec)

 }}}

 Thanks for considering this.

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/59106>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform


More information about the wp-trac mailing list