[wp-trac] [WordPress Trac] #56294: WordPress search finds block name in comment
WordPress Trac
noreply at wordpress.org
Thu Jun 27 21:11:12 UTC 2024
#56294: WordPress search finds block name in comment
--------------------------------------+--------------------------
Reporter: zodiac1978 | Owner: (none)
Type: enhancement | Status: closed
Priority: normal | Milestone:
Component: Database | Version: 5.0
Severity: normal | Resolution: maybelater
Keywords: needs-patch dev-feedback | Focuses: performance
--------------------------------------+--------------------------
Comment (by dmsnell):
> I just wasn't sure how much the impact is in real world examples.
It would be good to benchmark, but I have a strong suspicion it's squarely
in the //untenable// category. Frankly, the full-table scans involved
purely in checking whether a literal byte sequence is in a post are heavy
enough.
One of the perks of creating a new search index is the collapse of the
data space required to search. Search performance should scale
with the diversity of content on a site, not with the number of posts or
the lengths of those posts.
For instance, one of the most basic things to consider is storing a lookup
from search words to posts containing those search words. If we search for
"apple" we shouldn't have to scan 1,000 posts looking for `apple` and
running into all of the related problems reported in this ticket.
Instead, we could have a lookup table containing every "word" (and I'm
overlooking a lot of nuance when I say "word") in every post, and for each
word, it lists the posts containing that word. We look up "apple" very
rapidly and that may return a much smaller set of posts which we can then
scan to perform the second and more refined search pass.
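The lookup described above can be sketched roughly as follows. This is only an illustration, not a proposed Core implementation: the sample posts, the deliberately naive tokenizer, and the names `visible_text`, `build_index`, and `search` are all assumptions made for the example. Note that block names inside comment delimiters never reach the index, which sidesteps the problem this ticket reports.

```python
import re
from collections import defaultdict

# Hypothetical sample data: post ID -> stored post content with block comments.
POSTS = {
    1: "<!-- wp:paragraph --><p>An apple a day.</p><!-- /wp:paragraph -->",
    2: "<!-- wp:paragraph --><p>Paragraph styles explained.</p><!-- /wp:paragraph -->",
    3: "<!-- wp:heading --><h2>Apple pie</h2><!-- /wp:heading -->",
}


def visible_text(content):
    """Strip HTML comments (block delimiters) and tags, then normalize whitespace."""
    text = re.sub(r"<!--.*?-->", " ", content, flags=re.DOTALL)
    text = re.sub(r"<[^>]+>", " ", text)
    return " ".join(text.split()).lower()


def words(text):
    """Naive notion of a "word": runs of ASCII letters and digits."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(posts):
    """Map every word to the set of post IDs whose visible text contains it."""
    index = defaultdict(set)
    for post_id, content in posts.items():
        for word in words(visible_text(content)):
            index[word].add(post_id)
    return index


def search(index, posts, query):
    """First pass: intersect the word lookups. Second pass: scan only the candidates."""
    terms = words(query)
    if not terms:
        return []
    candidates = set.intersection(*(index.get(term, set()) for term in terms))
    phrase = " ".join(query.lower().split())
    return sorted(pid for pid in candidates if phrase in visible_text(posts[pid]))
```

With this data, looking up `apple` touches only the index entry and yields two candidate posts; only those candidates are scanned in the refined second pass.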
> Do you think this would be possible for WordPress core to add such a new
> table column for searching? My thought was, that this would be plugin
> territory.
If I were to start today I would create a separate feature plugin and
explore how to do it in a Core way. It's likely that it will take a lot of
work and refinement based on the feedback people provide. For instance, is
it appropriate simply to build a better default search algorithm, or would
it be more valuable to create a pluggable search indexing system that
someone could plug Elasticsearch into, or their custom search backend?
Maybe WordPress can develop the internal mechanisms needed for issuing
content to the indexer and updating indexed objects while the actual
indexing and retrieval become secondary.
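One way to picture that split is sketched below, under the assumption that Core owns the "content changed" notifications and a swappable backend owns indexing and retrieval. Every name here (`SearchBackend`, `InMemoryBackend`, `SearchDispatcher`, and their methods) is hypothetical and does not correspond to any existing WordPress API; the in-memory backend merely stands in for something like an Elasticsearch adapter.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SearchBackend(ABC):
    """Hypothetical contract a search plugin would implement."""

    @abstractmethod
    def index_document(self, doc_id: int, content: str) -> None:
        """Add or replace the indexed copy of a document."""

    @abstractmethod
    def remove_document(self, doc_id: int) -> None:
        """Forget a document that no longer exists."""

    @abstractmethod
    def query(self, terms: str) -> List[int]:
        """Return matching document IDs."""


class InMemoryBackend(SearchBackend):
    """Trivial default backend: case-insensitive substring match."""

    def __init__(self) -> None:
        self._docs: Dict[int, str] = {}

    def index_document(self, doc_id: int, content: str) -> None:
        self._docs[doc_id] = content.lower()

    def remove_document(self, doc_id: int) -> None:
        self._docs.pop(doc_id, None)

    def query(self, terms: str) -> List[int]:
        needle = terms.lower()
        return sorted(i for i, text in self._docs.items() if needle in text)


class SearchDispatcher:
    """Core-side piece: knows when content changes, not how it is indexed."""

    def __init__(self, backend: SearchBackend) -> None:
        self.backend = backend

    def on_post_saved(self, post_id: int, content: str) -> None:
        self.backend.index_document(post_id, content)

    def on_post_deleted(self, post_id: int) -> None:
        self.backend.remove_document(post_id)

    def search(self, terms: str) -> List[int]:
        return self.backend.query(terms)
```

Swapping the search engine then means passing a different `SearchBackend` to the dispatcher; the code that issues content to the indexer stays the same.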
Based on a number of chats I've had with @zieladam about synchronizing
multiple WordPress instances, I think the "vector clock" state-tracking
table we've explored could be a bit fortuitous for search indexing. Search
indexes are a system based on the need to keep external data in sync with
the reference data.
https://core.trac.wordpress.org/ticket/60375#comment:24
[[Image(https://core.trac.wordpress.org/raw-attachment/ticket/60375/Sync%20Protocol%20Flow.drawio.png)]]
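To illustrate why state tracking helps keep an external index in sync, here is a deliberately simplified sketch. It is not the design discussed in #60375: it swaps in a single monotonically increasing sequence number for a real vector clock, and the names `ChangeLog`, `IndexReplica`, `record`, `changes_since`, and `sync` are all assumptions made for the example.

```python
class ChangeLog:
    """Reference side: remembers the latest change to each object."""

    def __init__(self):
        self._seq = 0
        self._latest = {}  # object_id -> (sequence number, content or None)

    def record(self, object_id, content):
        """Record a save (content is a string) or a deletion (content is None)."""
        self._seq += 1
        self._latest[object_id] = (self._seq, content)

    def changes_since(self, seq):
        """Return (seq, object_id, content) tuples newer than seq, oldest first."""
        changed = [
            (s, object_id, content)
            for object_id, (s, content) in self._latest.items()
            if s > seq
        ]
        return sorted(changed, key=lambda change: change[0])


class IndexReplica:
    """External side: a search index that catches up from its last cursor."""

    def __init__(self):
        self.cursor = 0
        self.documents = {}

    def sync(self, log):
        """Apply every change the replica has not seen yet; return how many."""
        applied = 0
        for seq, object_id, content in log.changes_since(self.cursor):
            if content is None:
                self.documents.pop(object_id, None)
            else:
                self.documents[object_id] = content
            self.cursor = seq
            applied += 1
        return applied
```

The replica never rescans the reference data; it only asks for what changed after its own cursor, so an indexer that was offline for a while can catch up incrementally.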
> I dismissed this idea, because I never thought WordPress core would do
> it.
Who is Core if not those of us using it and reporting issues and striving
to make WordPress grow stronger?
--
Ticket URL: <https://core.trac.wordpress.org/ticket/56294#comment:22>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform