[wp-trac] [WordPress Trac] #56294: WordPress search finds block name in comment

WordPress Trac noreply at wordpress.org
Thu Jun 27 21:11:12 UTC 2024


#56294: WordPress search finds block name in comment
--------------------------------------+--------------------------
 Reporter:  zodiac1978                |       Owner:  (none)
     Type:  enhancement               |      Status:  closed
 Priority:  normal                    |   Milestone:
Component:  Database                  |     Version:  5.0
 Severity:  normal                    |  Resolution:  maybelater
 Keywords:  needs-patch dev-feedback  |     Focuses:  performance
--------------------------------------+--------------------------

Comment (by dmsnell):

 > I just wasn't sure how much the impact is in real world examples.

 It would be good to benchmark, but I have a strong suspicion it's squarely
 in the //untenable// category. Frankly, the full-table scans involved
 purely in checking whether a literal byte sequence is in a post are heavy
 enough.

 One of the perks of creating a new search index is that it collapses the
 data space we have to search. Search performance should scale with the
 diversity of content on a site, not with the number of posts or the
 lengths of those posts.

 For instance, one of the most basic things to consider is storing a lookup
 from search words to posts containing those search words. If we search for
 "apple" we shouldn't have to scan 1,000 posts looking for `apple` and
 running into all of the related problems reported in this ticket.

 Instead, we could have a lookup table containing every "word" (and I'm
 overlooking a lot of nuance when I say "word") in every post, and for each
 word, it lists the posts containing that word. We look up "apple" very
 rapidly and that may return a much smaller set of posts which we can then
 scan to perform the second and more refined search pass.
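
 As a rough illustration of that lookup table, here is a minimal sketch in
 Python. It is not a proposal for how Core would implement it: the names
 `tokenize`, `build_index`, and `search` are made up for this example, the
 "word" splitting is deliberately crude, and the second pass is just a
 case-insensitive phrase check standing in for whatever refined matching
 we would really want.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Crude stand-in for word segmentation: lowercase ASCII letters/digits only."""
    return WORD_RE.findall(text.lower())


def build_index(posts):
    """Map each word to the set of post IDs whose text contains that word.

    `posts` is a dict of post ID -> post text (a hypothetical shape).
    """
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in tokenize(text):
            index[word].add(post_id)
    return index


def search(index, posts, query):
    """Two passes: word lookup first, then a refined scan of the candidates."""
    words = tokenize(query)
    if not words:
        return []
    # First pass: only posts containing every query word survive.
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    # Second pass: scan just the candidates for the exact phrase.
    needle = query.lower()
    return sorted(pid for pid in candidates if needle in posts[pid].lower())


posts = {
    1: "Apple pie recipe",
    2: "Banana bread",
    3: "Pineapple and apple juice",
}
index = build_index(posts)
print(search(index, posts, "apple"))        # [1, 3]
print(search(index, posts, "apple juice"))  # [3]
print(search(index, posts, "juice apple"))  # []
```

 Post 2 is never scanned at all, which is the point: the expensive pass
 only touches the posts the lookup returns.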

 > Do you think this would be possible for WordPress core to add such a new
 > table column for searching? My thought was, that this would be plugin
 > territory.

 If I were to start today I would create a separate feature plugin and
 explore how to do it in a Core way. It's likely that it will take a lot of
 work and refinement based on the feedback people provide. For instance, is
 it appropriate simply to build a better default search algorithm, or would
 it be more valuable to create a pluggable search indexing system that
 someone could plug Elasticsearch into, or their custom search backend?

 Maybe WordPress can develop the internal mechanisms needed for issuing
 content to the indexer and updating indexed objects while the actual
 indexing and retrieval becomes secondary.
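
 To make that split concrete, here is a hedged sketch in Python of what a
 pluggable arrangement might look like. Everything in it is hypothetical:
 `SearchBackend`, `InMemoryBackend`, and `IndexDispatcher` are names
 invented for this example, not existing WordPress APIs. The dispatcher
 plays the role of the Core-side mechanism that issues content and
 updates, and the backend is the part someone could swap out.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SearchBackend(ABC):
    """Hypothetical contract a search backend would have to satisfy."""

    @abstractmethod
    def upsert(self, post_id: int, content: str) -> None:
        """Index a new post or replace the indexed copy of an existing one."""

    @abstractmethod
    def delete(self, post_id: int) -> None:
        """Remove a post from the index."""

    @abstractmethod
    def query(self, terms: str) -> List[int]:
        """Return the IDs of matching posts."""


class InMemoryBackend(SearchBackend):
    """Toy default backend: case-insensitive substring match over stored text."""

    def __init__(self) -> None:
        self._docs: Dict[int, str] = {}

    def upsert(self, post_id: int, content: str) -> None:
        self._docs[post_id] = content.lower()

    def delete(self, post_id: int) -> None:
        self._docs.pop(post_id, None)

    def query(self, terms: str) -> List[int]:
        needle = terms.lower()
        return sorted(pid for pid, text in self._docs.items() if needle in text)


class IndexDispatcher:
    """Core-side piece: forwards content changes to whichever backend is plugged in."""

    def __init__(self, backend: SearchBackend) -> None:
        self.backend = backend

    def on_post_saved(self, post_id: int, content: str) -> None:
        self.backend.upsert(post_id, content)

    def on_post_deleted(self, post_id: int) -> None:
        self.backend.delete(post_id)

    def search(self, terms: str) -> List[int]:
        return self.backend.query(terms)


dispatcher = IndexDispatcher(InMemoryBackend())
dispatcher.on_post_saved(1, "Apple pie recipe")
dispatcher.on_post_saved(2, "Banana bread")
dispatcher.on_post_deleted(2)
print(dispatcher.search("apple"))  # [1]
```

 Swapping in a different backend would then mean writing another
 `SearchBackend` subclass, while the save and delete notifications stay
 the same.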

 Based on a number of chats I've had with @zieladam about synchronizing
 multiple WordPress instances, I think the "vector clock" state-tracking
 table we've explored could turn out to be a good fit for search indexing.
 A search index is, at its core, external data that has to be kept in sync
 with the reference data.

 https://core.trac.wordpress.org/ticket/60375#comment:24

 [[Image(https://core.trac.wordpress.org/raw-attachment/ticket/60375/Sync%20Protocol%20Flow.drawio.png)]]
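
 To show the general shape of "keep an external index in sync with the
 reference data", here is a deliberately simplified Python sketch. It is
 not the protocol from #60375 and not a real vector clock: it assumes a
 single writer and uses one increasing sequence number per change. The
 names `ChangeLog` and `IndexSync` are invented for this example.

```python
class ChangeLog:
    """Reference side: remembers the latest change to each post, stamped with
    a sequence number that only ever increases (single-writer assumption)."""

    def __init__(self):
        self._seq = 0
        self._latest = {}  # post_id -> (seq, content); content None means deleted

    def record(self, post_id, content):
        self._seq += 1
        self._latest[post_id] = (self._seq, content)

    def changes_since(self, seq):
        """Changes newer than `seq`, oldest first, as (seq, post_id, content)."""
        return sorted(
            (s, pid, content)
            for pid, (s, content) in self._latest.items()
            if s > seq
        )


class IndexSync:
    """Index side: keeps a cursor so each pull only sees what changed since."""

    def __init__(self):
        self.cursor = 0
        self.indexed = {}  # stand-in for the real search index

    def pull(self, log):
        applied = 0
        for seq, post_id, content in log.changes_since(self.cursor):
            if content is None:
                self.indexed.pop(post_id, None)
            else:
                self.indexed[post_id] = content
            self.cursor = seq
            applied += 1
        return applied


log = ChangeLog()
sync = IndexSync()
log.record(1, "Apple pie recipe")
log.record(2, "Banana bread")
print(sync.pull(log))   # 2
log.record(1, None)     # post 1 deleted
print(sync.pull(log))   # 1
print(sync.indexed)     # {2: 'Banana bread'}
```

 The indexer never rescans everything. It only asks what changed after its
 cursor, and that same question is what a state-tracking table would let
 several independent consumers ask.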

 > I dismissed this idea, because I never thought WordPress core would do
 > it.

 Who is Core if not those of us using it and reporting issues and striving
 to make WordPress grow stronger?

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/56294#comment:22>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

