[wp-trac] [WordPress Trac] #61348: HTML API: Report real and virtual nodes in the HTML Processor.
WordPress Trac
noreply at wordpress.org
Sun Jun 2 22:01:03 UTC 2024
#61348: HTML API: Report real and virtual nodes in the HTML Processor.
-------------------------+--------------------------------------
 Reporter:  dmsnell      |      Owner:  (none)
     Type:  enhancement  |     Status:  new
 Priority:  normal       |  Milestone:  6.6
Component:  HTML API     |    Version:  trunk
 Severity:  normal       |   Keywords:  has-patch has-unit-tests
  Focuses:               |
-------------------------+--------------------------------------
HTML is a kind of short-hand for a DOM structure. This means that there
are many cases in HTML where an element's opening tag or closing tag is
missing (or both). This is because many of the parsing rules imply
creating elements in the DOM which may not exist in the text of the HTML.
The HTML Processor, being the higher-level counterpart to the Tag
Processor, is already aware of these nodes, but since its inception has
not paused on them when scanning through a document. Instead, these are
visible when pausing on a child of such an element, but otherwise not
seen.
The HTML Processor ought to fully represent the DOM structure a browser
would see, which includes representing these "virtual" nodes which are
implicitly created.
----
For example, the HTML string `<p><div>Content</p></div>` looks like it
contains overlapping `P` and `DIV` elements, but in reality the first `P`
is implicitly closed by the `<div>`, and the later `</p>`, finding no open
`P` to close, is unexpected and creates an empty `P` element inside the
`DIV`.
The current HTML Processor in `trunk` will visit these tags in sequence
(where a `+` indicates opening a node while `-` indicates closing one):
`+P +DIV #text -P -DIV`.
The HTML Processor ought to represent this the way code traversing a DOM
tree would: `+P -P +DIV #text +P -P -DIV`. Notably, in this sequence we
can see the missing/implicit/virtual nodes that were created as part of
applying the semantic HTML rules.
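
To make the two sequences above concrete, here is a small Python sketch. It is a toy model and not the HTML Processor or its API: the function names `lexical_events` and `dom_events` are hypothetical, and it only models the two parsing rules this example touches (an opening `DIV` closes an open `P`, and a `</p>` with no open `P` creates an empty one). It ignores scoping rules, attributes, comments, and everything else in the HTML specification.

```python
import re

# Matches an opening/closing tag or a run of text. A toy tokenizer, not a real one.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)")


def lexical_events(html):
    """Report tags exactly as they appear in the text, with no virtual nodes."""
    events = []
    for closer, tag, text in TOKEN.findall(html):
        if text:
            events.append("#text")
        else:
            events.append(("-" if closer else "+") + tag.upper())
    return " ".join(events)


def dom_events(html):
    """Report the open/close events a walk over the resulting DOM would see."""
    stack = []   # currently open elements (simplified stack of open elements)
    events = []

    def open_element(name):
        stack.append(name)
        events.append("+" + name)

    def close_through(name):
        # Close every open element up to and including `name`.
        while stack:
            popped = stack.pop()
            events.append("-" + popped)
            if popped == name:
                break

    for closer, tag, text in TOKEN.findall(html):
        if text:
            events.append("#text")
            continue
        name = tag.upper()
        if not closer:
            # Assumed rule 1: an opening DIV implicitly closes an open P.
            if name == "DIV" and "P" in stack:
                close_through("P")
            open_element(name)
        elif name == "P" and "P" not in stack:
            # Assumed rule 2: an unexpected </p> creates an empty P element.
            open_element("P")
            close_through("P")
        elif name in stack:
            close_through(name)
        # Any other unmatched closing tag is ignored in this sketch.

    # Close whatever is still open at the end of the input.
    while stack:
        events.append("-" + stack.pop())
    return " ".join(events)


html = "<p><div>Content</p></div>"
print(lexical_events(html))  # +P +DIV #text -P -DIV
print(dom_events(html))      # +P -P +DIV #text +P -P -DIV
```

The first function reproduces the sequence described for `trunk`, and the second reproduces the proposed sequence, including the virtual `-P` before the `DIV` and the virtual `+P` created by the stray `</p>`.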
--
Ticket URL: <https://core.trac.wordpress.org/ticket/61348>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform