[wp-meta] [Making WordPress.org] #4450: Does WordPress.org Plugin Repo Elasticsearch function_score penalize plugins with fewer than one million installs?

Making WordPress.org noreply at wordpress.org
Fri May 10 03:22:30 UTC 2019


#4450: Does WordPress.org Plugin Repo Elasticsearch function_score penalize
plugins with fewer than one million installs?
------------------------------+---------------------
 Reporter:  jadonn            |       Owner:  (none)
     Type:  defect            |      Status:  new
 Priority:  normal            |   Milestone:
Component:  Plugin Directory  |  Resolution:
 Keywords:                    |
------------------------------+---------------------
Description changed by dd32:

Old description:

> I was recently looking over the
> [https://meta.trac.wordpress.org/browser/sites/trunk/wordpress.org/public_html
> /wp-content/plugins/plugin-directory/libs/site-search/jetpack-
> search.php#L1001 source code for the Plugin Repo's Elasticsearch
> function_score query]. If I understand correctly, it seems like the query
> penalizes plugins with less than one million active installs, but the
> comments in the code suggest this should be otherwise. The filter clause
> in the Elasticsearch query applies the exponential decay scoring function
> to plugins with less-than-or-equal to 1000000 active installs. The
> exponential decay scoring function with a plugin with 500000 (five
> hundred thousand) active installs should look like this when plugging in
> all the values in accordance with
> [https://www.elastic.co/guide/en/elasticsearch/reference/current/query-
> dsl-function-score-query.html#exp-decay Elasticsearch's example]
> :
>
> custom score = e^((ln(decay)/scale) * max(0, |actual_value - origin| -
> offset))
>
> decay = 0.75
> scale = 900000
> actual_value = 500000
> origin = 1000000
> offset = 0
>
> e^((ln(0.75)/900000) * max(0, |500000 - 1000000| - 0))^ = 0.8522943134
>
> For Google Sheets:
> EXP(LN(0.75)/900000 * MAX(0, ABS(500000 - 1000000) - 0)) = 0.8522943134
>
> The resulting score is multiplied, along with other calculated factors,
> with the document relevance score Elasticsearch returns based on how well
> the search input matches the plugin text content. If my understanding of
> the exponential decay function is correct and if my math is correct, it
> appears that the resulting relevance document score for the plugin is
> going to be reduced to 85% of what it should otherwise be. This
> multiplier is not calculated or applied to plugins with more than 1000000
> active installs.
>
> If I have misunderstood this query scoring, I would be grateful to have
> my understanding and my math corrected.

New description:

 I was recently looking over the
 [https://meta.trac.wordpress.org/browser/sites/trunk/wordpress.org/public_html/wp-content/plugins/plugin-directory/libs/site-search/jetpack-search.php#L1001 source code for the Plugin Repo's Elasticsearch function_score query].
 If I understand correctly, the query penalizes plugins with fewer than one
 million active installs, but the comments in the code suggest this is not
 the intended behavior. The filter clause in the Elasticsearch query applies
 the exponential decay scoring function to plugins with 1,000,000 or fewer
 active installs. For a plugin with 500,000 active installs, plugging all the
 values into the formula from
 [https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#exp-decay Elasticsearch's example]
 gives the following:

 {{{
 custom score = e^((ln(decay)/scale) * max(0, |actual_value - origin| - offset))

 decay = 0.75
 scale = 900000
 actual_value = 500000
 origin = 1000000
 offset = 0

 e^((ln(0.75)/900000) * max(0, |500000 - 1000000| - 0)) = 0.8522943134
 }}}

 The same calculation as a Google Sheets formula:
 {{{EXP(LN(0.75)/900000 * MAX(0, ABS(500000 - 1000000) - 0)) = 0.8522943134}}}
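
 To make the arithmetic easy to re-check, here is a minimal Python sketch of
 the same exponential decay formula. The function name `exp_decay` and its
 parameter names are my own, chosen to mirror the variables listed above;
 this is not the code that runs in the Plugin Directory or in Elasticsearch.

```python
import math


def exp_decay(value, origin, scale, decay, offset=0.0):
    """Exponential decay multiplier, following the formula quoted in this ticket.

    Hypothetical helper for illustration only.
    """
    lam = math.log(decay) / scale                      # ln(decay) / scale, negative for decay < 1
    distance = max(0.0, abs(value - origin) - offset)  # distance from origin beyond the offset
    return math.exp(lam * distance)


# Values taken from the ticket: a plugin with 500,000 active installs.
multiplier = exp_decay(value=500_000, origin=1_000_000, scale=900_000, decay=0.75)
print(f"{multiplier:.10f}")
```

 Running this should reproduce the value computed by hand, to within
 rounding.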

 This value is multiplied, together with the other calculated factors, into
 the document relevance score that Elasticsearch returns based on how well
 the search input matches the plugin's text content. If my understanding of
 the exponential decay function and my math are both correct, the plugin's
 relevance score is reduced to about 85% of what it would otherwise be. The
 multiplier is neither calculated nor applied for plugins with more than
 1,000,000 active installs.
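
 The effect across a range of install counts can be sketched in Python as
 follows. This assumes my reading above is right: the decay applies only when
 active installs are at or below 1,000,000, and a plugin that does not match
 the filter keeps its score unchanged (a multiplier of 1.0). The helper name
 `install_multiplier` is hypothetical, and the other scoring factors in the
 query are ignored.

```python
import math

ORIGIN = 1_000_000  # install count at which the decay function returns 1.0
SCALE = 900_000     # distance from ORIGIN at which the multiplier equals DECAY
DECAY = 0.75


def install_multiplier(active_installs):
    """Hypothetical helper: the multiplier as described in this ticket."""
    if active_installs > ORIGIN:
        # Assumption: the filter does not match, so the score is left unchanged.
        return 1.0
    distance = abs(active_installs - ORIGIN)
    return math.exp(math.log(DECAY) / SCALE * distance)


for installs in (100_000, 500_000, 1_000_000, 5_000_000):
    print(f"{installs:>9,} installs -> x{install_multiplier(installs):.4f}")
```

 Under these assumptions, the multiplier falls from 1.0 at 1,000,000 installs
 to 0.75 at 100,000 installs, so every plugin below the origin is scored
 lower than one at or above it.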

 If I have misunderstood this query scoring, I would be grateful to have my
 understanding and my math corrected.

-- 
Ticket URL: <https://meta.trac.wordpress.org/ticket/4450#comment:3>
Making WordPress.org <https://meta.trac.wordpress.org/>