[wp-trac] [WordPress Trac] #14722: use MATCH/AGAINST instead of LIKE when querying for matching terms posts

WordPress Trac wp-trac at lists.automattic.com
Sat Aug 28 20:13:55 UTC 2010


#14722: use MATCH/AGAINST instead of LIKE when querying for matching terms posts
--------------------------+-------------------------------------------------
 Reporter:  pgentoo       |       Owner:                 
     Type:  defect (bug)  |      Status:  new            
 Priority:  normal        |   Milestone:  Awaiting Review
Component:  General       |     Version:  3.0.1          
 Severity:  normal        |    Keywords:                 
--------------------------+-------------------------------------------------
 In query.php, in &get_posts(), the generated sql is suboptimal.  I
 maintain several large wordpress installations (just shy of 200k posts
 each) and query times search for posts to display are taking around 6
 seconds (on a dedicated mysql VM).  I looked into this and the majority of
 the time shown in the profiler is from reading records from disk, due to a
 full table scan.  I see that the _posts table already has a multicolumn
 FULLTEXT index on the post_title and post_content columns.

 My recommendation is to update the PHP to generate MATCH() ... AGAINST()
 queries instead of LIKE queries, so that the existing FULLTEXT index can
 be leveraged.

 I've pulled some test queries from my mysqldumpslow log, and made the
 suggested changes, which drops query times from >6s to ~.2s.

 Here is an example modified query:

 SELECT SQL_CALC_FOUND_ROWS  wp_rlu4532_posts.* FROM wp_rlu4532_posts
 WHERE 1=1  AND (((wp_rlu4532_posts.post_title LIKE '%york%') OR
 (wp_rlu4532_posts.post_content LIKE '%york%')) AND
 ((wp_rlu4532_posts.post_title LIKE '%peace%') OR
 (wp_rlu4532_posts.post_content LIKE '%peace%')) AND
 ((wp_rlu4532_posts.post_title LIKE '%festival%') OR
 (wp_rlu4532_posts.post_content LIKE '%festival%')) AND
 ((wp_rlu4532_posts.post_title LIKE '%returns%') OR
 (wp_rlu4532_posts.post_content LIKE '%returns%')) AND
 ((wp_rlu4532_posts.post_title LIKE '%to%') OR
 (wp_rlu4532_posts.post_content LIKE '%to%')) AND
 ((wp_rlu4532_posts.post_title LIKE '%rowntree%') OR
 (wp_rlu4532_posts.post_content LIKE '%rowntree%')) AND
 ((wp_rlu4532_posts.post_title LIKE '%park%') OR
 (wp_rlu4532_posts.post_content LIKE '%park%')) OR
 (wp_rlu4532_posts.post_title LIKE '%york peace festival returns to
 rowntree park%') OR (wp_rlu4532_posts.post_content LIKE '%york peace
 festival returns to rowntree park%'))  AND (wp_rlu4532_posts.post_password
 = '')  AND wp_rlu4532_posts.post_type = 'post' AND
 (wp_rlu4532_posts.post_status = 'publish')  ORDER BY
 wp_rlu4532_posts.post_date DESC LIMIT 0, 5;


  SELECT SQL_CALC_FOUND_ROWS  wp_rlu4532_posts.* FROM wp_rlu4532_posts
 WHERE
 MATCH (post_title, post_content) AGAINST ('york')
 AND
 MATCH (post_title, post_content) AGAINST ('peace')
 AND
 MATCH (post_title, post_content) AGAINST ('festival')
 AND
 MATCH (post_title, post_content) AGAINST ('returns')
 AND
 MATCH (post_title, post_content) AGAINST ('york')
 AND
 MATCH (post_title, post_content) AGAINST ('to')
 AND
 MATCH (post_title, post_content) AGAINST ('rowntree')
 AND
 MATCH (post_title, post_content) AGAINST ('park')
 AND
 MATCH (post_title, post_content) AGAINST ('york peace festival returns to
 rowntree park')
 AND (wp_rlu4532_posts.post_password = '')  AND wp_rlu4532_posts.post_type
 = 'post' AND (wp_rlu4532_posts.post_status = 'publish')  ORDER BY
 wp_rlu4532_posts.post_date DESC LIMIT 0, 5;

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/14722>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software


More information about the wp-trac mailing list