[wp-trac] [WordPress Trac] #21688: Add sanity checks and improve performance when searching for posts
WordPress Trac
wp-trac at lists.automattic.com
Sun Aug 26 22:05:44 UTC 2012
#21688: Add sanity checks and improve performance when searching for posts
-------------------------+------------------------------
Reporter: azaozz | Owner:
Type: enhancement | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Query | Version:
Severity: normal | Resolution:
Keywords: |
-------------------------+------------------------------
Comment (by azaozz):
In 21688-4.patch, changed `strlen($term) < 3` to use a string index instead.
`$string{2}` returns the third byte of `$string` (offsets start at 0)
regardless of strlen(), mb_strlen() or mbstring.func_overload. This is
combined with the check for an empty term: `empty( $term{2} )`.
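A minimal sketch of the check described above, not the code from the patch itself. The helper name `term_is_too_short()` and the sample terms are made up for illustration. Square brackets are used for the offset because the curly-brace form was deprecated in PHP 7.4 and removed in PHP 8.0.

```php
<?php
// Hypothetical helper, not taken from 21688-4.patch.
// Returns true when $term has fewer than three bytes. Because empty()
// also treats the string "0" as empty, a term whose third byte is "0"
// (for example "ab0") is reported as too short as well.
function term_is_too_short( $term ) {
    // $term[2] is the third byte; offsets start at 0.
    return empty( $term[2] );
}

$terms = array( 'a', 'ab', 'abc', 'wordpress' );

// Keep only the terms that pass the third-byte check.
$kept = array_values( array_filter( $terms, function ( $t ) {
    return ! term_is_too_short( $t );
} ) );
// $kept is array( 'abc', 'wordpress' )
```

Reading an offset past the end of the string inside empty() does not raise a notice, which is why a single expression can cover both the empty term and the short term.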
The purpose is to exclude all terms that are one or two characters long.
It doesn't make sense to search for them. For example, `LIKE '%ab%'` would
match all posts in many languages. The matches would be irrelevant and the
search would be slower.
The above code is the fastest and simplest way to do this. However, it's
not very precise because it counts bytes, not characters. A single
two-byte UTF-8 character like `è` or `ä` is still treated as one letter and
excluded, but the check misses some like `ọ` (which is 3 bytes). On the
other hand, that is useful for not removing terms like `東京` (Tokyo), which
would be removed if we used mb_strlen().
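The trade-off above can be checked with a small comparison. This is only a sketch: the helper name `compare_length_checks()` is hypothetical, the input is assumed to be UTF-8, and the mbstring extension is assumed to be loaded. The sample strings are written as byte escapes so the result does not depend on the file's encoding.

```php
<?php
// Hypothetical comparison helper; assumes UTF-8 input and ext/mbstring.
function compare_length_checks( $term ) {
    $chars = mb_strlen( $term, 'UTF-8' );
    return array(
        'bytes'      => strlen( $term ),     // byte count
        'chars'      => $chars,              // character count
        'byte_check' => ! empty( $term[2] ), // kept by the third-byte check?
        'mb_check'   => $chars >= 3,         // kept by a 3-character minimum?
    );
}

$e_grave = "\xC3\xA8";                 // è, 2 bytes
$o_dot   = "\xE1\xBB\x8D";             // ọ, 3 bytes
$tokyo   = "\xE6\x9D\xB1\xE4\xBA\xAC"; // 東京, 6 bytes

$results = array(
    'e_grave' => compare_length_checks( $e_grave ),
    'o_dot'   => compare_length_checks( $o_dot ),
    'tokyo'   => compare_length_checks( $tokyo ),
);
// e_grave: dropped by both checks.
// o_dot:   kept by the byte check, dropped by the character check.
// tokyo:   kept by the byte check, dropped by the character check.
```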
I think this is an acceptable compromise. We may allow some shorter UTF-8
terms that are not essential, but we won't discard any terms that are needed.
--
Ticket URL: <http://core.trac.wordpress.org/ticket/21688#comment:12>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software