[wp-meta] [Making WordPress.org] #174: Link to generally related functions/classes
Making WordPress.org
noreply at wordpress.org
Fri Dec 12 20:39:29 UTC 2014
#174: Link to generally related functions/classes
---------------------------+-------------------------------------
Reporter: samuelsidler | Owner: Rarst
Type: enhancement | Status: assigned
Priority: normal | Component: developer.wordpress.org
Resolution: | Keywords:
---------------------------+-------------------------------------
Comment (by keesiemeijer):
+1 for stemming words.
Here's a proof of concept that uses the Porter stemming algorithm and a few
other rules to get related words from the post title.
http://tartarus.org/~martin/PorterStemmer/
Similar words such as "queried", "queries" and "query" all get the same
stem, "queri", when passed through the algorithm.
The algorithm is a cheap way of getting stems from words without a
database lookup.
It doesn't always produce real words, but it increases the similarity
between titles when queried.
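
To illustrate the stemming idea, here is a minimal sketch in Python. It is
not the full Porter algorithm: it only covers simplified versions of steps
1a to 1c (plural endings, "ed"/"ing", and the final "y"), and it skips the
"measure" conditions the real algorithm uses. The function name
partial_porter_stem is made up for this example.

```python
def _is_consonant(word, i):
    """Porter's definition: 'y' is a vowel only when it follows a consonant."""
    ch = word[i]
    if ch in "aeiou":
        return False
    if ch == "y":
        return i == 0 or not _is_consonant(word, i - 1)
    return True


def _has_vowel(stem):
    return any(not _is_consonant(stem, i) for i in range(len(stem)))


def partial_porter_stem(word):
    """Hypothetical helper: simplified Porter steps 1a-1c only."""
    word = word.lower()

    # Step 1a: plural endings.
    if word.endswith("sses"):
        word = word[:-2]          # caresses -> caress
    elif word.endswith("ies"):
        word = word[:-2]          # queries -> queri
    elif word.endswith("ss"):
        pass                      # caress -> caress
    elif word.endswith("s"):
        word = word[:-1]          # hooks -> hook

    # Step 1b (simplified): strip "ed" / "ing" if the rest contains a vowel.
    # Words ending in "eed" are left alone here; the real algorithm applies
    # a measure-based rule to them instead.
    if not word.endswith("eed"):
        for suffix in ("ed", "ing"):
            if word.endswith(suffix) and _has_vowel(word[: -len(suffix)]):
                word = word[: -len(suffix)]
                break

    # Step 1c: a final "y" becomes "i" if the rest contains a vowel.
    if word.endswith("y") and _has_vowel(word[:-1]):
        word = word[:-1] + "i"

    return word


for w in ("queried", "queries", "query"):
    print(w, "->", partial_porter_stem(w))   # each prints "... -> queri"
```

Anything beyond a demo should use a complete Porter implementation such as
the ones linked from the page above.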
Proof-of-concept rules for getting related posts:
* Get the related words by splitting the title at the underscores.
* Remove stop words such as 'the', 'as', 'by', etc.
* Don't allow 'wp' on its own as a related word. Instead, join it to the
second word of the title with a dash (e.g. the related words for wp_head
are 'wp-head' and 'head').
* 'wp' is allowed as a related word if it's the only word in a
function/hook/class/method name.
* Allow the stop words 'is' and 'get' only if they're the first word of
the post title.
* Add word stems and the file name as related words.
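
The rules above can be sketched roughly like this in Python. This is not
the code from the gist, just an illustration under a few assumptions: the
stop word list is a made-up short one, 'wp' is joined to whatever word
follows it wherever it appears in the name, stems are only added for
words without a dash, and the stemmer is passed in as a function
(identity by default). The names related_words, STOP_WORDS and
LEADING_ALLOWED are hypothetical.

```python
# Hypothetical stop word list; the real one would be much longer.
STOP_WORDS = {"the", "as", "by", "of", "to", "in", "is", "get"}
# Stop words that are kept when they are the first word of the title.
LEADING_ALLOWED = {"is", "get"}


def related_words(title, file_name=None, stem=lambda word: word):
    """Return related words for a function/hook/class/method name."""
    parts = [p for p in title.lower().split("_") if p]
    words = []

    for index, part in enumerate(parts):
        if part == "wp":
            if len(parts) == 1:
                # 'wp' is allowed if it is the only word in the name.
                words.append("wp")
            elif index + 1 < len(parts):
                # Otherwise join it to the next word with a dash.
                words.append("wp-" + parts[index + 1])
            continue
        if part in STOP_WORDS and not (index == 0 and part in LEADING_ALLOWED):
            continue
        words.append(part)

    # Add stems (assumption: only for plain words) and the file name.
    stems = [stem(w) for w in words if "-" not in w]
    extra = [file_name] if file_name else []

    # dict.fromkeys removes duplicates while keeping the original order.
    return list(dict.fromkeys(words + stems + extra))


print(related_words("wp_head"))   # ['wp-head', 'head']
print(related_words("get_the_title", file_name="post-template.php"))
```

Passing a real stemmer as the stem argument adds the stemmed forms next to
the original words, so titles that share a stem end up sharing a term.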
Here are the related words found with these rules:
Functions
https://rawgit.com/keesiemeijer/b8ba0b01006d6d859919/raw/poc-related-words-functions.html
Classes
https://rawgit.com/keesiemeijer/0727a611ee3d171a5ea0/raw/poc-related-words-classes.html
Methods
https://rawgit.com/keesiemeijer/49d3d62068be351e7adb/raw/poc-related-words-methods.html
Hooks
https://rawgit.com/keesiemeijer/7403f273deeb546389d8/raw/poc-related-words-hooks.html
These results were created with this gist in the archive.php file of the
wporg-developer theme.
https://gist.github.com/keesiemeijer/41b6c8576a2f2ac684ce
I've created a custom taxonomy 'wp-parser-related-words' for all relevant
post types in my local install and added the related words to the posts as
terms.
In total, 3,595 terms were created. That is with all external libraries
parsed; it should be fewer for the developer reference.
--
Ticket URL: <https://meta.trac.wordpress.org/ticket/174#comment:7>
Making WordPress.org <https://meta.trac.wordpress.org/>