[wp-meta] [Making WordPress.org] #174: Link to generally related functions/classes

Making WordPress.org noreply at wordpress.org
Mon Jun 26 14:46:01 UTC 2017


#174: Link to generally related functions/classes
---------------------------+-----------------------
 Reporter:  samuelsidler   |       Owner:
     Type:  task           |      Status:  assigned
 Priority:  high           |   Milestone:
Component:  Developer Hub  |  Resolution:
 Keywords:  has-patch      |
---------------------------+-----------------------

Comment (by pbiron):

 I started work a while back on trying to identify "related" references,
 though I've had to put it aside recently to work on other things.

 The general idea I was exploring is based on the realization that most
 (though not all) function/method/class/hook names are of the form
 `[Verb] [Noun]`, e.g., `(add|get|update|delete)_post_meta()`, where `add`,
 `get`, `update`, and `delete` are `Verbs` and `post_meta` is a `Noun`.

 So, on import:

 1. do phrase-level parsing of function/method/class/hook names (stripping
 stopwords, but with only limited stemming)
 1. do "part of speech" (POS) tagging of the phrases (see
 [[http://phpir.com/part-of-speech-tagging|Part Of Speech Tagging]])
 1. then, the "related" references are those with the same `Noun` but a
 different `Verb`
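
 As a rough illustration of step 3, here is a minimal Python sketch. It
 assumes a tiny hand-written verb list in place of real POS tagging, and it
 treats the first underscore-separated token as the `Verb`; both are
 simplifications, and `split_verb_noun()` and `related_references()` are
 hypothetical names, not part of the plugin or of `phpdoc-parser`.

```python
from collections import defaultdict

# Hypothetical, hand-built verb list; a real lexicon would come from POS tagging.
VERBS = {"add", "get", "update", "delete"}


def split_verb_noun(name):
    """Split 'get_post_meta' into ('get', 'post_meta').

    Returns None when the name does not start with a known verb.
    """
    verb, sep, noun = name.partition("_")
    if sep and noun and verb in VERBS:
        return verb, noun
    return None


def related_references(names):
    """Map each parsed name to the other names that share its noun."""
    by_noun = defaultdict(list)
    for name in names:
        parsed = split_verb_noun(name)
        if parsed:
            by_noun[parsed[1]].append(name)
    # Within one noun group, every other name necessarily has a different verb.
    return {
        name: [other for other in group if other != name]
        for group in by_noun.values()
        for name in group
    }


names = [
    "add_post_meta", "get_post_meta", "update_post_meta",
    "delete_post_meta", "get_terms", "wp_insert_post",
]
related = related_references(names)
print(related["get_post_meta"])
```

 Names that do not fit the `[Verb] [Noun]` form (here `wp_insert_post`) are
 simply left out, which is where the lower recall comes from.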

 Using this technique will, I hope, produce "related" references with a
 much higher degree of
 [[https://en.wikipedia.org/wiki/Precision_and_recall#Precision|Precision]]
 than stemming alone, although the recall would undoubtedly be lower.
 Personally, I would find getting 602 references "related" to `get_terms()`
 less than useful.
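
 To make the precision/recall trade-off concrete, here is a small Python
 sketch. The result sets and the "relevant" set are invented purely for
 illustration; they are not output from either approach.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgement of what is truly related to get_post_meta().
relevant = {"add_post_meta", "update_post_meta", "delete_post_meta", "get_metadata"}

# Hypothetical results: same-noun matching vs. a broader stem-based match.
noun_match = ["add_post_meta", "update_post_meta", "delete_post_meta"]
stem_match = noun_match + [
    "get_metadata", "get_post", "get_posts", "get_post_type", "get_terms",
]

print(precision_recall(noun_match, relevant))  # (1.0, 0.75)
print(precision_recall(stem_match, relevant))  # (0.5, 1.0)
```

 In this toy case the narrower match misses one relevant function but
 returns nothing irrelevant, while the broader match finds everything at
 the cost of half its results being noise.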

 Granted, the method I was working on requires **A LOT** of up-front work
 building/refining the POS lexicon.  But once that work is done, the
 indexing process is relatively quick (and doesn't require human input).

 I built a mostly functional plugin that provides a UI for assigning a POS
 to each phrase generated in step 1.  The plugin's intended use is:

 1. do an import from the sources (i.e., run `phpdoc-parser`), which
 generates potential phrases for step 1 above
 1. assign POS for each phrase (the plugin provides a UI that makes this
 pretty easy)
 1. iterate the process, refining the POS lexicon on each iteration
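
 The iteration above could be driven by a worklist of whatever the lexicon
 does not yet cover. A minimal Python sketch, assuming the lexicon is a
 plain dict and simplifying "phrases" to single underscore-separated tokens;
 `STOPWORDS` and `untagged_tokens()` are hypothetical, not the plugin's
 actual code.

```python
from collections import Counter

# Hypothetical stopword list for illustration only.
STOPWORDS = {"wp", "the", "of"}


def untagged_tokens(names, lexicon):
    """Return (token, count) pairs for tokens with no POS yet, most frequent first."""
    counts = Counter(
        token
        for name in names
        for token in name.lower().split("_")
        if token and token not in STOPWORDS and token not in lexicon
    )
    return counts.most_common()


# Hypothetical partial lexicon after an earlier tagging pass.
lexicon = {"get": "Verb", "add": "Verb", "post": "Noun"}
names = ["get_post_meta", "add_post_meta", "wp_insert_post", "get_terms"]

todo = untagged_tokens(names, lexicon)
print(todo)  # [('meta', 2), ('insert', 1), ('terms', 1)]
```

 Tagging the most frequent tokens first means each pass covers as many
 names as possible before the next import.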

 I'll try to find the time to get the plugin to the point where I can
 release it and get others involved in refining the POS lexicon.

--
Ticket URL: <https://meta.trac.wordpress.org/ticket/174#comment:30>
Making WordPress.org <https://meta.trac.wordpress.org/>