[wp-trac] [WordPress Trac] #60122: HTML API: avoid processing incomplete tokens

WordPress Trac noreply at wordpress.org
Wed Dec 20 12:46:58 UTC 2023


#60122: HTML API: avoid processing incomplete tokens
-------------------------+--------------------------------------
 Reporter:  dmsnell      |      Owner:  (none)
     Type:  enhancement  |     Status:  new
 Priority:  normal       |  Milestone:  Awaiting Review
Component:  HTML API     |    Version:  trunk
 Severity:  normal       |   Keywords:  has-patch has-unit-tests
  Focuses:               |
-------------------------+--------------------------------------
 Currently the Tag Processor assumes that its input is a complete HTML
 document. As a result, it treats any content remaining after the last tag
 match as plaintext and skips over it. This is fine for the Tag Processor:
 if the remaining content isn't a valid tag, there's nothing for next_tag()
 to match.

 However, in order to support a number of feature expansions it is
 important to recognize that the remaining content may involve partial
 syntax elements, such as incomplete tags, attributes, or comments.

 In this patch we're adding a mode inside the Tag Processor which is set
 when we start parsing HTML syntax but the document ends before the token
 does. This will provide the ability to:

 - extend the input document
 - avoid misinterpreting syntax as text
 - guess if we have a complete document, know if we have an incomplete
   document
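
 As a rough illustration of the idea (not the patch itself), the behavior
 described above can be sketched with a toy scanner. Everything here is an
 assumption made for the example: the ToyTagScanner class, its incomplete
 flag, and its extend() method are hypothetical names, the sketch is in
 Python rather than PHP, and the regular expression only approximates real
 HTML tokenization (it ignores comments and script contents while matching).

```python
import re

# Toy pattern for a complete opening tag such as <p> or <img src="a.png">.
# Real HTML tokenization is far more involved; this is only an illustration.
COMPLETE_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^<>]*>")


class ToyTagScanner:
    """Hypothetical stand-in for a tag scanner; not the WordPress API."""

    def __init__(self, html: str) -> None:
        self.html = html
        self.pos = 0             # where the next scan starts
        self.incomplete = False  # set when the input ends inside a token

    def next_tag(self):
        """Return the next complete opening tag name, or None."""
        match = COMPLETE_TAG.search(self.html, self.pos)
        if match:
            self.pos = match.end()
            self.incomplete = False
            return match.group(1).lower()
        # No complete tag is left. Rather than treating the remainder as
        # plain text, record whether it stops in the middle of a token.
        self.incomplete = self._ends_inside_token(self.html[self.pos:])
        return None

    def extend(self, more: str) -> None:
        """Append more input; scanning resumes from the unconsumed position."""
        self.html += more
        self.incomplete = False

    @staticmethod
    def _ends_inside_token(rest: str) -> bool:
        # A comment opener with no closer after it.
        comment_at = rest.rfind("<!--")
        if comment_at != -1 and "-->" not in rest[comment_at:]:
            return True
        # A final "<" that begins tag-like syntax and is never closed.
        lt = rest.rfind("<")
        if lt == -1 or ">" in rest[lt:]:
            return False
        following = rest[lt + 1:lt + 2]
        return following == "" or following.isalpha() or following in "/!?"


scanner = ToyTagScanner('<p>Hello <img src="a.png" al')
first = scanner.next_tag()       # the complete <p> tag
second = scanner.next_tag()      # None: the <img ...> token is cut off
was_incomplete = scanner.incomplete
scanner.extend('t="x">')         # supply the rest of the token
third = scanner.next_tag()       # the now-complete <img> tag
```

 In this sketch a trailing "1 < 2" leaves the flag unset, since "<"
 followed by a space cannot start a tag, while a trailing "<img src" or an
 unclosed "<!--" sets it so the caller can wait for more input.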

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/60122>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

