[wp-trac] [WordPress Trac] #21688: Add sanity checks and improve performance when searching for posts

WordPress Trac wp-trac at lists.automattic.com
Sun Aug 26 22:05:44 UTC 2012


#21688: Add sanity checks and improve performance when searching for posts
-------------------------+------------------------------
 Reporter:  azaozz       |       Owner:
     Type:  enhancement  |      Status:  new
 Priority:  normal       |   Milestone:  Awaiting Review
Component:  Query        |     Version:
 Severity:  normal       |  Resolution:
 Keywords:               |
-------------------------+------------------------------

Comment (by azaozz):

 In 21688-4.patch, changed `strlen($term) < 3` to use a string index instead.
 `$string{2}` returns the third byte of `$string` (offsets start at 0)
 regardless of strlen(), mb_strlen() or mbstring.func_overload. This is
 combined with checking for an empty term: `empty( $term{2} )`.
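
 A minimal sketch of that check, assuming UTF-8 input. The function name
 `is_short_term()` is hypothetical and not taken from the patch. It uses
 square brackets because the curly-brace offset syntax was removed in PHP
 8.0; both forms read the same byte.

```php
<?php
// Hypothetical helper, not the patch itself: returns true when $term should
// be skipped because it has fewer than 3 bytes.
function is_short_term( $term ) {
    // $term[2] is the third byte (offsets start at 0). empty() returns true
    // when that offset does not exist, so it covers '', 1-byte and 2-byte terms.
    // Caveat: empty() is also true when the third byte is the character "0",
    // so a term such as "ab0" is treated as short as well.
    return empty( $term[2] );
}
```

 For instance, `is_short_term( 'ab' )` is true and `is_short_term( 'abc' )`
 is false.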

 The purpose is to exclude all terms that are one or two characters long.
 It doesn't make sense to use them. For example, `LIKE '%ab%'` would match
 all posts in many languages. The matches would be irrelevant and the
 search would be slower.

 The above code is the fastest and simplest way to do this. However, it's
 not very precise because it counts bytes, not characters. A single 2-byte
 UTF-8 character like `è` or `ä` is still excluded like any one-letter
 term, but the check misses some like `ọ` (which is 3 bytes) and lets them
 through. On the other hand, that is useful for not removing terms like
 `東京` (Tokyo), which would be removed if we used mb_strlen().
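
 To make the byte versus character difference concrete, here is a rough
 sketch that reports both counts for the examples above. The helper name
 `term_lengths()` is made up for illustration. It assumes UTF-8 strings and
 that strlen() is not overloaded by mbstring.func_overload, and it counts
 characters with a PCRE `/u` pattern instead of mb_strlen() so it does not
 depend on the mbstring extension.

```php
<?php
// Hypothetical helper: byte count and character count of a UTF-8 string.
function term_lengths( $term ) {
    return array(
        'bytes' => strlen( $term ),                       // raw bytes
        'chars' => preg_match_all( '/./us', $term, $m ),  // UTF-8 code points
    );
}

// Terms are written as byte escapes so the result does not depend on the
// encoding of this file.
$terms = array(
    'ab',                        // two ASCII letters
    "\xC3\xA8",                  // è, one 2-byte character
    "\xE1\xBB\x8D",              // ọ, one 3-byte character
    "\xE6\x9D\xB1\xE4\xBA\xAC",  // 東京, two 3-byte characters
);

foreach ( $terms as $term ) {
    $len  = term_lengths( $term );
    $kept = $len['bytes'] >= 3 ? 'kept' : 'dropped';
    printf( "%s: %d bytes, %d chars, %s\n", $term, $len['bytes'], $len['chars'], $kept );
}
```

 With a byte-based cutoff of 3, `ab` and `è` are dropped while `ọ` and `東京`
 are kept. A character-based cutoff of 3 would drop all four.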

 I think this is an acceptable compromise. We may allow some shorter UTF-8
 terms that are not essential, but we won't discard any terms that are
 needed.

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/21688#comment:12>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software