[wp-trac] [WordPress Trac] #61348: HTML API: Report real and virtual nodes in the HTML Processor.

WordPress Trac noreply at wordpress.org
Sun Jun 2 22:01:03 UTC 2024


#61348: HTML API: Report real and virtual nodes in the HTML Processor.
-------------------------+--------------------------------------
 Reporter:  dmsnell      |      Owner:  (none)
     Type:  enhancement  |     Status:  new
 Priority:  normal       |  Milestone:  6.6
Component:  HTML API     |    Version:  trunk
 Severity:  normal       |   Keywords:  has-patch has-unit-tests
  Focuses:               |
-------------------------+--------------------------------------
 HTML is a kind of short-hand for a DOM structure. This means that there
 are many cases in HTML where an element's opening tag or closing tag is
 missing (or both). This is because many of the parsing rules imply
 creating elements in the DOM which may not exist in the text of the HTML.

 The HTML Processor, being the higher-level counterpart to the Tag
 Processor, is already aware of these nodes, but since its inception has
 not paused on them when scanning through a document. Instead, they are
 visible only when pausing on a child of such an element, and are
 otherwise not seen.

 The HTML Processor ought to fully represent the DOM structure a browser
 would see, including the implicitly created "virtual" nodes.

 ----

 For example, the HTML string `<p><div>Content</p></div>` looks like it
 contains overlapping `P` and `DIV` elements, but in reality the first `P`
 is implicitly closed by the `<div>`, and the later `</p>` is unexpected
 and creates a second, empty `P` element inside the `DIV`.

 The current HTML Processor in `trunk` will visit these tags in sequence
 (where a `+` indicates opening a node while `-` indicates closing one):
 `+P +DIV #text -P -DIV`.

 The HTML Processor ought to represent this the way code traversing a DOM
 tree would: `+P -P +DIV #text +P -P -DIV`. Notably, in this sequence we
 can see the missing/implicit/virtual nodes that were created as part of
 applying the semantic HTML rules.
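
 The two sequences above can be reproduced with a small toy model. This is
 a sketch only, written in Python rather than against the real HTML API:
 the names `raw_tokens` and `dom_events` are hypothetical, and the rules
 are reduced to the two needed for this example (a `P` or `DIV` start tag
 closes an open `P`, and a `</p>` with no open `P` creates an empty one).
 It checks "is a `P` open anywhere on the stack" instead of the
 specification's scope rules, so it should not be read as a conforming
 parser.

```python
import re

# Matches a bare start/end tag (no attributes) or a run of text.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)")


def raw_tokens(html):
    """Report tags in the order they appear in the text (no implied nodes)."""
    events = []
    for closer, name, text in TOKEN.findall(html):
        if text:
            events.append("#text")
        else:
            events.append(("-" if closer else "+") + name.upper())
    return events


def dom_events(html):
    """Report open/close events as a DOM traversal would see them.

    Simplified assumption: only P and DIV are handled specially.
    """
    stack = []   # currently open elements
    events = []

    def open_element(name):
        stack.append(name)
        events.append("+" + name)

    def close_through(name):
        # Pop elements until the named one has been closed.
        while stack:
            popped = stack.pop()
            events.append("-" + popped)
            if popped == name:
                break

    for closer, name, text in TOKEN.findall(html):
        if text:
            events.append("#text")
            continue
        name = name.upper()
        if not closer:
            # A block-level start tag implicitly closes an open P.
            if name in ("DIV", "P") and "P" in stack:
                close_through("P")
            open_element(name)
        elif name == "P" and "P" not in stack:
            # Unexpected </p>: an empty, virtual P is created and closed.
            open_element("P")
            close_through("P")
        elif name in stack:
            close_through(name)
        # Any other unmatched end tag is ignored in this sketch.

    # Close whatever is still open at the end of the input.
    while stack:
        events.append("-" + stack.pop())
    return events


html = "<p><div>Content</p></div>"
print(" ".join(raw_tokens(html)))  # +P +DIV #text -P -DIV
print(" ".join(dom_events(html)))  # +P -P +DIV #text +P -P -DIV
```

 The first line of output corresponds to the tag-by-tag view, and the
 second to the DOM-order view that includes the virtual nodes.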

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/61348>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

