[wp-hackers] Development for 2.x : Improved Search

Scott Johnson fuzzygroup at gmail.com
Sun Feb 5 09:28:11 GMT 2006


[Thanks to Andy for previewing this before I sent it to the list as a whole]

Now I'm a search geek and I find that WordPress' search functionality just
doesn't cut it.  I think this is fixable in one of 2 ways.  Actually there
are likely Big N ways to fix it, but there are 2 I'd be interested in taking
on.

Note: I know that at least one high profile WP user (Om Malik of
Business 2.0 / Gigaom.com <http://gigaom.com/>) really wants search
"fixed".  And if you search my blog, fuzzyblog.com, for the term "money" and
compare the results with Google or Blogdigger, it's clearly not doing enough.

a) Simple: Add MySQL full text indexing to WordPress and modify the search
hooks to use it.  Moving to FT indices on MyISAM tables actually gives quite
good search out of the gate.  It scales to something like 1.3 million posts
w/ relatively linear response times.  It's public that the 1st version of
Feedster did MySQL full text until we hit this point.  It's not perfect and
there are some character set issues, but it's a lot better than what seems
to be in place now.

Difficulty: not huge.  Willing to do in full myself.
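
To make (a) concrete, here is a rough sketch of the two pieces of SQL
involved: a one-time FULLTEXT index and a relevance-ranked query.  It is
written as Python that only builds the statements and never touches a
database.  The table and column names (wp_posts, post_title, post_content,
post_status) assume a default WordPress schema, and the index name
post_search and both function names are made up for illustration.  This is
not how WP's search hooks work today.

```python
# Sketch only: builds SQL strings, does not connect to MySQL.
# Assumes a default WordPress schema with the wp_ table prefix.

FULLTEXT_COLUMNS = ("post_title", "post_content")


def create_index_sql(table="wp_posts", index_name="post_search"):
    """One-time DDL adding a FULLTEXT index (index name is hypothetical)."""
    cols = ", ".join(FULLTEXT_COLUMNS)
    return f"ALTER TABLE {table} ADD FULLTEXT INDEX {index_name} ({cols})"


def search_sql(term, table="wp_posts", limit=10):
    """Return (sql, params) for a relevance-ranked full text search.

    %s is the placeholder style used by the common MySQL drivers, so the
    search term is passed as a parameter and never pasted into the SQL.
    """
    cols = ", ".join(FULLTEXT_COLUMNS)
    sql = (
        f"SELECT ID, post_title, MATCH ({cols}) AGAINST (%s) AS score "
        f"FROM {table} "
        f"WHERE MATCH ({cols}) AGAINST (%s) AND post_status = 'publish' "
        f"ORDER BY score DESC LIMIT %s"
    )
    return sql, (term, term, limit)
```

The MATCH expression appears twice on purpose: once in the WHERE clause to
filter and once in the select list to expose the score for sorting.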

b) If the desire for N database support means that WP doesn't want to do
this, then the next approach is to duplicate the SQL based search approach
of MnogoSearch, which uses SQL tables for the core word list and indices.
It's an interesting approach and would take a bunch of work.  I could
certainly help w/ that, but it's a long term not short term project and
would take more than just me.

Difficulty: tedious and a fair bit of code.
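
For a feel of what (b) means, here is a toy version of the idea: a word
list table plus a postings table, both plain SQL, so nothing depends on one
database's full text feature.  It is only loosely in the spirit of
MnogoSearch and is not its actual schema.  The table names (words,
postings), the tokenizer, and the hit-count scoring are all assumptions,
and it uses in-memory SQLite purely so the sketch runs on its own.

```python
import re
import sqlite3


def open_index():
    """Create the two hypothetical index tables in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT UNIQUE);
        CREATE TABLE postings (
            word_id INTEGER, post_id INTEGER, hits INTEGER,
            PRIMARY KEY (word_id, post_id)
        );
        """
    )
    return db


def tokenize(text):
    """Naive tokenizer: lowercase runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_post(db, post_id, text):
    """Record how many times each word occurs in one post."""
    counts = {}
    for word in tokenize(text):
        counts[word] = counts.get(word, 0) + 1
    for word, hits in counts.items():
        db.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
        (word_id,) = db.execute(
            "SELECT id FROM words WHERE word = ?", (word,)
        ).fetchone()
        db.execute(
            "INSERT OR REPLACE INTO postings VALUES (?, ?, ?)",
            (word_id, post_id, hits),
        )


def search(db, query):
    """Return post ids matching any query word, most total hits first."""
    terms = tokenize(query)
    if not terms:
        return []
    marks = ", ".join("?" for _ in terms)
    rows = db.execute(
        f"SELECT p.post_id, SUM(p.hits) AS score "
        f"FROM postings p JOIN words w ON w.id = p.word_id "
        f"WHERE w.word IN ({marks}) "
        f"GROUP BY p.post_id ORDER BY score DESC, p.post_id",
        terms,
    )
    return [post_id for post_id, _score in rows]
```

A real version would need stop words, stemming, character set handling, and
re-indexing on edit, which is where the tedium comes from.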

c) I don't know what hosted WP does for tables, but the big limitation of
MyISAM is scalability, which is generally when you move to InnoDB.  Now
InnoDB, of course, doesn't have full text indices, which raises another set
of issues.  There are also implications for all of this depending on whether
you're using 1 table per user's posts or 1 table for ALL users' posts.  Any
insights would be appreciated (I'm a search guy who's learning to hack WP
but knows damn well he doesn't have all the answers).

d) UI and posts versus pages.  My suggestion is to generate a composite
results page showing something like this imho:

Matching blog posts for FOO:

Sorted by date | [Sort by Relevance]  <== links

1. Blah Bar
2. Blah Foo
3. Blah Gah
4. Blah Etc


Matching pages for FOO:

(same structure)
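
The grouping and sorting behind a page like that could be sketched roughly
as below.  The hit fields (kind, title, date, score) and both function
names are hypothetical.  Nothing here is existing WP code, and the sort
links are left out of the rendering to keep it short.

```python
def composite_results(hits, sort="date"):
    """Split hits into posts and pages, each sorted the same way.

    Each hit is a dict with hypothetical keys: kind ("post" or "page"),
    title, date (ISO "YYYY-MM-DD" string, so it sorts correctly as text),
    and score (higher means more relevant).
    """
    if sort not in ("date", "relevance"):
        raise ValueError("sort must be 'date' or 'relevance'")
    key = "date" if sort == "date" else "score"
    groups = {"post": [], "page": []}
    for hit in hits:
        groups[hit["kind"]].append(hit)
    for items in groups.values():
        # Newest first, or most relevant first.
        items.sort(key=lambda h: h[key], reverse=True)
    return groups


def render(term, groups):
    """Render both groups as plain text, posts first and then pages."""
    lines = []
    for kind, heading in (("post", "blog posts"), ("page", "pages")):
        lines.append(f"Matching {heading} for {term}:")
        for n, hit in enumerate(groups[kind], start=1):
            lines.append(f"{n}. {hit['title']}")
        lines.append("")
    return "\n".join(lines).rstrip()
```

Keeping the sort choice as a single parameter means the "Sorted by date" and
"Sort by Relevance" links only differ in one query argument.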

Thoughts?  *ducks*

