[wp-meta] [Making WordPress.org] #4450: Does WordPress.org Plugin Repo Elasticsearch function_score penalize plugins with fewer than one million installs?
Making WordPress.org
noreply at wordpress.org
Thu May 9 18:51:08 UTC 2019
#4450: Does WordPress.org Plugin Repo Elasticsearch function_score penalize
plugins with fewer than one million installs?
------------------------------+--------------------
Reporter: jadonn | Owner: (none)
Type: defect | Status: new
Priority: normal | Milestone:
Component: Plugin Directory | Keywords:
------------------------------+--------------------
I was recently looking over the
[https://meta.trac.wordpress.org/browser/sites/trunk/wordpress.org/public_html/wp-content/plugins/plugin-directory/libs/site-search/jetpack-search.php#L1001 source code for the Plugin Repo's Elasticsearch function_score query].
If I understand correctly, the query penalizes plugins with fewer than one
million active installs, but the comments in the code suggest the intent is
otherwise. The filter clause in the Elasticsearch query applies the
exponential decay scoring function to plugins with 1000000 or fewer active
installs. For a plugin with 500000 (five hundred thousand) active installs,
the exponential decay scoring function should look like this after plugging
in all the values in accordance with
[https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#exp-decay Elasticsearch's example]:
custom score = e^((ln(decay)/scale) * max(0, |actual_value - origin| - offset))
decay = 0.75
scale = 900000
actual_value = 500000
origin = 1000000
offset = 0
e^((ln(0.75)/900000) * max(0, |500000 - 1000000| - 0)) = 0.8522943134
For Google Sheets:
EXP(LN(0.75)/900000 * MAX(0, ABS(500000 - 1000000) - 0)) = 0.8522943134
The resulting score is multiplied, along with other calculated factors,
by the document relevance score Elasticsearch returns based on how well
the search input matches the plugin text content. If my understanding of
the exponential decay function is correct and if my math is correct, the
resulting document relevance score for a plugin with 500000 active installs
is reduced to about 85% of what it would otherwise be. This multiplier
is not calculated or applied to plugins with more than 1000000 active
installs.
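To make the arithmetic above easier to check, here is a minimal Python
sketch of the same calculation. It is not the Plugin Directory's code: the
function names exp_decay and install_multiplier are hypothetical, and the
"less than or equal to 1000000" filter is modeled only as I have described
it in this ticket.

```python
import math


def exp_decay(value, origin, scale, decay, offset=0.0):
    """Exponential decay, following the formula quoted in this ticket."""
    lam = math.log(decay) / scale
    return math.exp(lam * max(0.0, abs(value - origin) - offset))


def install_multiplier(active_installs, threshold=1_000_000):
    """Hypothetical model of the filter: decay applies only at or below
    the threshold; plugins above it are left unmultiplied (factor 1.0)."""
    if active_installs <= threshold:
        return exp_decay(active_installs, origin=1_000_000,
                         scale=900_000, decay=0.75)
    return 1.0


for installs in (100_000, 500_000, 1_000_000, 2_000_000):
    print(installs, round(install_multiplier(installs), 4))
```

Under these assumptions a plugin with 100000 active installs sits exactly
one scale (900000) away from the origin, so its multiplier equals the decay
value of 0.75, while plugins at or above 1000000 installs keep a multiplier
of 1.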
If I have misunderstood this query scoring, I would be grateful to have my
understanding and my math corrected.
--
Ticket URL: <https://meta.trac.wordpress.org/ticket/4450>
Making WordPress.org <https://meta.trac.wordpress.org/>