[wp-hackers] Search in 3.0

Andy Skelton skeltoac at gmail.com
Tue Feb 16 00:40:29 UTC 2010


Since mentoring Justin Shreve on the Search API project last summer, I
have wanted to make it easier for plugins to replace the core search
functionality. At first I thought I should roll the results of the
project into core. While trying to do that, I found a simpler
solution. Review the changeset first and then I'll explain its power.
http://core.trac.wordpress.org/changeset/13037

Before this change, you had to do crazy things to override core
search. With the new filter you can do it sanely and get stuff like
paging and privacy checks for free.

These filter usage examples are in the order in which I discovered them:
$search = "AND ID IN (1, 2, 3)"; // known IDs
$search = "AND ID IN (SELECT ID FROM...)"; // subquery
$search = ""; // tricky!

The first example uses a known set of IDs that match the search
query. These might be found by querying an external search API. If the
set of IDs is complete, paging will work as expected, because core
applies its usual LIMIT to the matching posts. If you can't supply a
complete set of matching IDs, paging the results will be a lot more
work.
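
A minimal sketch of the first example, assuming the new filter is
posts_search and that it hands the callback the search clause plus the
query object. The function names are made up for illustration, and
my_ids_external_search() is a placeholder for whatever external API
you query; none of this is taken from my plugin.

```php
<?php
// Build the WHERE fragment from a list of post IDs.
// The "AND ID IN (...)" shape follows the first example above.
function my_ids_to_where(array $ids) {
    // Cast to int so nothing unsafe reaches the SQL string.
    $ids = array_map('intval', $ids);
    // Drop anything that is not a positive ID, then de-duplicate.
    $ids = array_filter($ids, function ($id) {
        return $id > 0;
    });
    $ids = array_values(array_unique($ids));

    if (empty($ids)) {
        // No matches: a condition that is never true gives an empty result.
        return ' AND 1=0 ';
    }
    return ' AND ID IN (' . implode(', ', $ids) . ') ';
}

// Placeholder (hypothetical): replace with a real lookup that returns
// the complete set of matching post IDs.
function my_ids_external_search($terms) {
    return array();
}

// Filter callback. Core leaves $search empty when there is no search
// term, so an empty clause is passed through untouched.
function my_ids_posts_search($search, $query = null) {
    if ('' === $search || !is_object($query)) {
        return $search;
    }
    return my_ids_to_where(my_ids_external_search($query->get('s')));
}

// Only hook in when WordPress is loaded.
if (function_exists('add_filter')) {
    add_filter('posts_search', 'my_ids_posts_search', 10, 2);
}
```

Because only the search clause is replaced, the rest of the query
(post type, status, LIMIT) is still built by core.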

The second example uses a subquery to find the IDs. Subqueries rock.
You can use them to get the set of IDs from other tables in your
database. But this example has an important limitation: the order of
rows returned by a subquery inside an IN clause has no effect on the
order of the final result set.
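
Here is a rough sketch of the subquery variant. The table
"my_search_index" and its columns "post_id" and "search_text" are
hypothetical, invented only to have something to select from, and I
am again assuming the filter is posts_search with the query object as
the second argument.

```php
<?php
// Build "AND ID IN (SELECT ...)" against a hypothetical index table.
// $wpdb is passed in so the builder can be exercised on its own.
function my_subq_where($wpdb, $term) {
    // Hypothetical table: {prefix}my_search_index (post_id, search_text).
    $table = $wpdb->prefix . 'my_search_index';

    // prepare() quotes and escapes the search term.
    $subquery = $wpdb->prepare(
        "SELECT post_id FROM $table WHERE MATCH (search_text) AGAINST (%s)",
        $term
    );

    // An ORDER BY inside this subquery would not change the order of
    // the outer result set; sorting needs another approach.
    return " AND ID IN ($subquery) ";
}

// Filter callback: swap core's search clause for the subquery.
function my_subq_posts_search($search, $query = null) {
    global $wpdb;
    if ('' === $search || !is_object($query)) {
        return $search;
    }
    return my_subq_where($wpdb, $query->get('s'));
}

// Only hook in when WordPress is loaded.
if (function_exists('add_filter')) {
    add_filter('posts_search', 'my_subq_posts_search', 10, 2);
}
```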

The third example is what I'm using right now on skeltoac.com. My
fulltext subquery is in a RIGHT JOIN clause (filter posts_join). This
way I can use the MATCH score to sort the results (filter
posts_orderby). The query ends up looking like this (simplified):
SELECT SQL_CALC_FOUND_ROWS
  wp_posts.* FROM wp_posts
  RIGHT JOIN (
    SELECT
      post_id,
      MATCH... AS score
    FROM wp_fulltext
    WHERE MATCH...
  ) AS ft ON (ID=ft.post_id)
WHERE 1=1...
ORDER BY score DESC
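
To show how the three filters fit together, here is a stripped-down
sketch. It is not the live code linked below. The column name
"search_text" is an assumption (the simplified query above leaves the
MATCH columns out), the function names are made up, and I am assuming
each filter receives the query object as its second argument.

```php
<?php
// True when the filter is running for a search query.
function my_ft_is_search($query) {
    return is_object($query) && !empty($query->is_search);
}

// Build the RIGHT JOIN on the fulltext subquery.
// "search_text" is an assumed FULLTEXT-indexed column.
function my_ft_join_clause($wpdb, $term) {
    // wp_fulltext with the default table prefix.
    $table = $wpdb->prefix . 'fulltext';

    $sub = $wpdb->prepare(
        "SELECT post_id, MATCH (search_text) AGAINST (%s) AS score"
        . " FROM $table WHERE MATCH (search_text) AGAINST (%s)",
        $term,
        $term
    );
    return " RIGHT JOIN ($sub) AS ft ON (ID = ft.post_id) ";
}

// posts_search: drop core's LIKE clause entirely (the "tricky" part).
// The join below restricts the rows instead.
function my_ft_posts_search($search, $query = null) {
    return my_ft_is_search($query) ? '' : $search;
}

// posts_join: append the fulltext subquery.
function my_ft_posts_join($join, $query = null) {
    global $wpdb;
    if (!my_ft_is_search($query)) {
        return $join;
    }
    return $join . my_ft_join_clause($wpdb, $query->get('s'));
}

// posts_orderby: sort by the MATCH score from the joined subquery.
function my_ft_posts_orderby($orderby, $query = null) {
    return my_ft_is_search($query) ? 'score DESC' : $orderby;
}

// Only hook in when WordPress is loaded.
if (function_exists('add_filter')) {
    add_filter('posts_search', 'my_ft_posts_search', 10, 2);
    add_filter('posts_join', 'my_ft_posts_join', 10, 2);
    add_filter('posts_orderby', 'my_ft_posts_orderby', 10, 2);
}
```

Non-search queries pass through all three callbacks unchanged, so the
rest of the site is unaffected.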

Here is the live code (about 100 lines):
http://skeltoac.com/fulltext.php.txt

The list of features/benefits:
* Index covers all searchable post types and statuses (stati)
* Index includes title+content+taxonomies
* Fulltext search allows sorting results by relevance
* Works with existing query patterns
* Paging, privacy, etc., all handled by core

And the list of known issues:
* Searchable text (title+content+taxonomies) might have room for improvement
* Missing a bunch of actions to trigger re-indexing a post
* Should not be indexing the whole blog on the activation hook
* Should not assume the indexing can be done within memory limits
* Chronological paging links (Older, Newer) are inappropriate for
relevance-ordered results

I just wrote this plugin today, borrowing a few key lines from
Justin's work. If you are interested in search, please review my code
and think about how it works. If you think of any minor core changes
that would make search easier to plug in, please make them known ASAP.
Little things like filters aren't strictly subject to feature freeze
and can add tremendous value to 3.0.

Cheers,
Andy Skelton

