<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[58779] trunk: HTML API: Add missing tags in IN BODY insertion mode to HTML Processor.</title>
</head>
<body>
<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; }
#msg dl a { font-weight: bold}
#msg dl a:link { color:#fc3; }
#msg dl a:active { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { white-space: pre-line; overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta" style="font-size: 105%">
<dt style="float: left; width: 6em; font-weight: bold">Revision</dt> <dd><a style="font-weight: bold" href="https://core.trac.wordpress.org/changeset/58779">58779</a></dd>
<dt style="float: left; width: 6em; font-weight: bold">Author</dt> <dd>dmsnell</dd>
<dt style="float: left; width: 6em; font-weight: bold">Date</dt> <dd>2024-07-22 22:22:03 +0000 (Mon, 22 Jul 2024)</dd>
</dl>
<pre style='padding-left: 1em; margin: 2em 0; border-left: 2px solid #ccc; line-height: 1.25; font-size: 105%; font-family: sans-serif'>HTML API: Add missing tags in IN BODY insertion mode to HTML Processor.
As part of work to add more spec support to the HTML API, this patch adds
support for the remaining missing tags in the IN BODY insertion mode. Not
all of the added tags are supported, because in some cases they reset the
insertion mode and are reprocessed where they will be rejected.
This patch also improves the support of `get_modifiable_text()`, removing
a leading newline inside a LISTING, PRE, or TEXTAREA element.
Developed in https://github.com/WordPress/wordpress-develop/pull/6972
Discussed in https://core.trac.wordpress.org/ticket/61576
Props dmsnell, jonsurrell, westonruter.
See <a href="https://core.trac.wordpress.org/ticket/61576">#61576</a>.</pre>
<h3>Modified Paths</h3>
<ul>
<li><a href="#trunksrcwpincludeshtmlapiclasswphtmlactiveformattingelementsphp">trunk/src/wp-includes/html-api/class-wp-html-active-formatting-elements.php</a></li>
<li><a href="#trunksrcwpincludeshtmlapiclasswphtmlopenelementsphp">trunk/src/wp-includes/html-api/class-wp-html-open-elements.php</a></li>
<li><a href="#trunksrcwpincludeshtmlapiclasswphtmlprocessorstatephp">trunk/src/wp-includes/html-api/class-wp-html-processor-state.php</a></li>
<li><a href="#trunksrcwpincludeshtmlapiclasswphtmlprocessorphp">trunk/src/wp-includes/html-api/class-wp-html-processor.php</a></li>
<li><a href="#trunksrcwpincludeshtmlapiclasswphtmltagprocessorphp">trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php</a></li>
<li><a href="#trunksrcwpincludeshtmlapiclasswphtmltokenphp">trunk/src/wp-includes/html-api/class-wp-html-token.php</a></li>
<li><a href="#trunktestsphpunittestshtmlapiwpHtmlProcessorphp">trunk/tests/phpunit/tests/html-api/wpHtmlProcessor.php</a></li>
<li><a href="#trunktestsphpunittestshtmlapiwpHtmlProcessorBreadcrumbsphp">trunk/tests/phpunit/tests/html-api/wpHtmlProcessorBreadcrumbs.php</a></li>
<li><a href="#trunktestsphpunittestshtmlapiwpHtmlProcessorHtml5libphp">trunk/tests/phpunit/tests/html-api/wpHtmlProcessorHtml5lib.php</a></li>
<li><a href="#trunktestsphpunittestshtmlapiwpHtmlProcessorSemanticRulesphp">trunk/tests/phpunit/tests/html-api/wpHtmlProcessorSemanticRules.php</a></li>
<li><a href="#trunktestsphpunittestshtmlapiwpHtmlSupportRequiredHtmlProcessorphp">trunk/tests/phpunit/tests/html-api/wpHtmlSupportRequiredHtmlProcessor.php</a></li>
<li><a href="#trunktestsphpunittestshtmlapiwpHtmlSupportRequiredOpenElementsphp">trunk/tests/phpunit/tests/html-api/wpHtmlSupportRequiredOpenElements.php</a></li>
<li><a href="#trunktestsphpunittestshtmlapiwpHtmlTagProcessortokenscanningphp">trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessor-token-scanning.php</a></li>
</ul>
<h3>Added Paths</h3>
<ul>
<li><a href="#trunktestsphpunittestshtmlapiwpHtmlTagProcessorModifiableTextphp">trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessorModifiableText.php</a></li>
</ul>
</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunksrcwpincludeshtmlapiclasswphtmlactiveformattingelementsphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/src/wp-includes/html-api/class-wp-html-active-formatting-elements.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/src/wp-includes/html-api/class-wp-html-active-formatting-elements.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/src/wp-includes/html-api/class-wp-html-active-formatting-elements.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -87,6 +87,22 @@
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * Inserts a "marker" at the end of the list of active formatting elements.
+ *
+ * > The markers are inserted when entering applet, object, marquee,
+ * > template, td, th, and caption elements, and are used to prevent
+ * > formatting from "leaking" into applet, object, marquee, template,
+ * > td, th, and caption elements.
+ *
+ * @see https://html.spec.whatwg.org/#concept-parser-marker
+ *
+ * @since 6.7.0
+ */
+ public function insert_marker(): void {
+ $this->push( new WP_HTML_Token( null, 'marker', false ) );
+ }
+
+ /**
</ins><span class="cx" style="display: block; padding: 0 10px"> * Pushes a node onto the stack of active formatting elements.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @since 6.4.0
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -184,4 +200,30 @@
</span><span class="cx" style="display: block; padding: 0 10px"> yield $this->stack[ $i ];
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+
+ /**
+ * Clears the list of active formatting elements up to the last marker.
+ *
+ * > When the steps below require the UA to clear the list of active formatting elements up to
+ * > the last marker, the UA must perform the following steps:
+ * >
+ * > 1. Let entry be the last (most recently added) entry in the list of active
+ * > formatting elements.
+ * > 2. Remove entry from the list of active formatting elements.
+ * > 3. If entry was a marker, then stop the algorithm at this point.
+ * > The list has been cleared up to the last marker.
+ * > 4. Go to step 1.
+ *
+ * @see https://html.spec.whatwg.org/multipage/parsing.html#clear-the-list-of-active-formatting-elements-up-to-the-last-marker
+ *
+ * @since 6.7.0
+ */
+ public function clear_up_to_last_marker(): void {
+ foreach ( $this->walk_up() as $item ) {
+ array_pop( $this->stack );
+ if ( 'marker' === $item->node_name ) {
+ break;
+ }
+ }
+ }
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span></span></pre></div>
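<p>The marker behaviour added above can be sketched in isolation. This is a minimal illustration, assuming plain strings in place of <code>WP_HTML_Token</code> objects; the class name <code>Sketch_Formatting_List</code> and its <code>entries()</code> method are hypothetical and are not part of the HTML API.</p>

```php
<?php
// Hypothetical stand-in for the list of active formatting elements.
// It stores entry names as strings instead of WP_HTML_Token objects.
final class Sketch_Formatting_List {
	/** @var string[] Entry names, oldest first. */
	private $stack = array();

	// Adds an entry to the end of the list.
	public function push( string $name ): void {
		$this->stack[] = $name;
	}

	// Adds a marker entry to the end of the list.
	public function insert_marker(): void {
		$this->push( 'marker' );
	}

	// Removes entries from the end of the list until a marker has been removed.
	public function clear_up_to_last_marker(): void {
		while ( array() !== $this->stack ) {
			$entry = array_pop( $this->stack );
			if ( 'marker' === $entry ) {
				return;
			}
		}
	}

	// Returns the remaining entries, oldest first.
	public function entries(): array {
		return $this->stack;
	}
}

$list = new Sketch_Formatting_List();
$list->push( 'B' );
$list->insert_marker(); // e.g. on entering a TD element
$list->push( 'I' );
$list->push( 'EM' );
$list->clear_up_to_last_marker(); // e.g. on leaving that TD element

var_export( $list->entries() === array( 'B' ) ); // prints: true
```

<p>Only the entries added after the marker, and the marker itself, are removed; formatting that was open before the marker stays in the list.</p>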
<a id="trunksrcwpincludeshtmlapiclasswphtmlopenelementsphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/src/wp-includes/html-api/class-wp-html-open-elements.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/src/wp-includes/html-api/class-wp-html-open-elements.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/src/wp-includes/html-api/class-wp-html-open-elements.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -102,6 +102,49 @@
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * Returns the name of the node at the nth position on the stack
+ * of open elements, or `null` if no such position exists.
+ *
+ * Note that this uses a 1-based index, which represents the
+ * "nth item" on the stack, counting from the top, where the
+ * top-most element is the 1st, the second is the 2nd, etc...
+ *
+ * @since 6.7.0
+ *
+ * @param int $nth Retrieve the nth item on the stack, with 1 being
+ * the top element, 2 being the second, etc...
+ * @return string|null Name of the node on the stack at the given location,
+ * or `null` if the location isn't on the stack.
+ */
+ public function at( int $nth ): ?string {
+ foreach ( $this->walk_down() as $item ) {
+ if ( 0 === --$nth ) {
+ return $item->node_name;
+ }
+ }
+
+ return null;
+ }
+
+ /**
+ * Reports if a node of a given name is in the stack of open elements.
+ *
+ * @since 6.7.0
+ *
+ * @param string $node_name Name of node for which to check.
+ * @return bool Whether a node of the given name is in the stack of open elements.
+ */
+ public function contains( string $node_name ): bool {
+ foreach ( $this->walk_up() as $item ) {
+ if ( $node_name === $item->node_name ) {
+ return true;
+ }
+ }
+
+ return false;
+ }
+
+ /**
</ins><span class="cx" style="display: block; padding: 0 10px"> * Reports if a specific node is in the stack of open elements.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @since 6.4.0
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -111,7 +154,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function contains_node( WP_HTML_Token $token ): bool {
</span><span class="cx" style="display: block; padding: 0 10px"> foreach ( $this->walk_up() as $item ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- if ( $token->bookmark_name === $item->bookmark_name ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ if ( $token === $item ) {
</ins><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -210,11 +253,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- switch ( $node->node_name ) {
- case 'HTML':
- return false;
- }
-
</del><span class="cx" style="display: block; padding: 0 10px"> if ( in_array( $node->node_name, $termination_list, true ) ) {
</span><span class="cx" style="display: block; padding: 0 10px"> return false;
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -226,7 +264,31 @@
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><span class="cx" style="display: block; padding: 0 10px"> * Returns whether a particular element is in scope.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > The stack of open elements is said to have a particular element in
+ * > scope when it has that element in the specific scope consisting of
+ * > the following element types:
+ * >
+ * > - applet
+ * > - caption
+ * > - html
+ * > - table
+ * > - td
+ * > - th
+ * > - marquee
+ * > - object
+ * > - template
+ * > - MathML mi
+ * > - MathML mo
+ * > - MathML mn
+ * > - MathML ms
+ * > - MathML mtext
+ * > - MathML annotation-xml
+ * > - SVG foreignObject
+ * > - SVG desc
+ * > - SVG title
+ *
</ins><span class="cx" style="display: block; padding: 0 10px"> * @since 6.4.0
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @since 6.7.0 Supports all required HTML elements.
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @see https://html.spec.whatwg.org/#has-an-element-in-scope
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -237,14 +299,16 @@
</span><span class="cx" style="display: block; padding: 0 10px"> return $this->has_element_in_specific_scope(
</span><span class="cx" style="display: block; padding: 0 10px"> $tag_name,
</span><span class="cx" style="display: block; padding: 0 10px"> array(
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-
- /*
- * Because it's not currently possible to encounter
- * one of the termination elements, they don't need
- * to be listed here. If they were, they would be
- * unreachable and only waste CPU cycles while
- * scanning through HTML.
- */
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 'APPLET',
+ 'CAPTION',
+ 'HTML',
+ 'TABLE',
+ 'TD',
+ 'TH',
+ 'MARQUEE',
+ 'OBJECT',
+ 'TEMPLATE',
+ // @todo: Support SVG and MathML nodes when support for foreign content is added.
</ins><span class="cx" style="display: block; padding: 0 10px"> )
</span><span class="cx" style="display: block; padding: 0 10px"> );
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -252,8 +316,17 @@
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><span class="cx" style="display: block; padding: 0 10px"> * Returns whether a particular element is in list item scope.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > The stack of open elements is said to have a particular element
+ * > in list item scope when it has that element in the specific scope
+ * > consisting of the following element types:
+ * >
+ * > - All the element types listed above for the has an element in scope algorithm.
+ * > - ol in the HTML namespace
+ * > - ul in the HTML namespace
+ *
</ins><span class="cx" style="display: block; padding: 0 10px"> * @since 6.4.0
</span><span class="cx" style="display: block; padding: 0 10px"> * @since 6.5.0 Implemented: no longer throws on every invocation.
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @since 6.7.0 Supports all required HTML elements.
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @see https://html.spec.whatwg.org/#has-an-element-in-list-item-scope
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -264,9 +337,19 @@
</span><span class="cx" style="display: block; padding: 0 10px"> return $this->has_element_in_specific_scope(
</span><span class="cx" style="display: block; padding: 0 10px"> $tag_name,
</span><span class="cx" style="display: block; padding: 0 10px"> array(
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- // There are more elements that belong here which aren't currently supported.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 'APPLET',
+ 'BUTTON',
+ 'CAPTION',
+ 'HTML',
+ 'TABLE',
+ 'TD',
+ 'TH',
+ 'MARQUEE',
+ 'OBJECT',
</ins><span class="cx" style="display: block; padding: 0 10px"> 'OL',
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 'TEMPLATE',
</ins><span class="cx" style="display: block; padding: 0 10px"> 'UL',
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ // @todo: Support SVG and MathML nodes when support for foreign content is added.
</ins><span class="cx" style="display: block; padding: 0 10px"> )
</span><span class="cx" style="display: block; padding: 0 10px"> );
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -274,7 +357,15 @@
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><span class="cx" style="display: block; padding: 0 10px"> * Returns whether a particular element is in button scope.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > The stack of open elements is said to have a particular element
+ * > in button scope when it has that element in the specific scope
+ * > consisting of the following element types:
+ * >
+ * > - All the element types listed above for the has an element in scope algorithm.
+ * > - button in the HTML namespace
+ *
</ins><span class="cx" style="display: block; padding: 0 10px"> * @since 6.4.0
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @since 6.7.0 Supports all required HTML elements.
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @see https://html.spec.whatwg.org/#has-an-element-in-button-scope
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -282,25 +373,52 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * @return bool Whether given element is in scope.
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function has_element_in_button_scope( string $tag_name ): bool {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- return $this->has_element_in_specific_scope( $tag_name, array( 'BUTTON' ) );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ return $this->has_element_in_specific_scope(
+ $tag_name,
+ array(
+ 'APPLET',
+ 'BUTTON',
+ 'CAPTION',
+ 'HTML',
+ 'TABLE',
+ 'TD',
+ 'TH',
+ 'MARQUEE',
+ 'OBJECT',
+ 'TEMPLATE',
+ // @todo: Support SVG and MathML nodes when support for foreign content is added.
+ )
+ );
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><span class="cx" style="display: block; padding: 0 10px"> * Returns whether a particular element is in table scope.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > The stack of open elements is said to have a particular element
+ * > in table scope when it has that element in the specific scope
+ * > consisting of the following element types:
+ * >
+ * > - html in the HTML namespace
+ * > - table in the HTML namespace
+ * > - template in the HTML namespace
+ *
</ins><span class="cx" style="display: block; padding: 0 10px"> * @since 6.4.0
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @since 6.7.0 Full implementation.
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @see https://html.spec.whatwg.org/#has-an-element-in-table-scope
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * @throws WP_HTML_Unsupported_Exception Always until this function is implemented.
- *
</del><span class="cx" style="display: block; padding: 0 10px"> * @param string $tag_name Name of tag to check.
</span><span class="cx" style="display: block; padding: 0 10px"> * @return bool Whether given element is in scope.
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function has_element_in_table_scope( string $tag_name ): bool {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- throw new WP_HTML_Unsupported_Exception( 'Cannot process elements depending on table scope.' );
-
- return false; // The linter requires this unreachable code until the function is implemented and can return.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ return $this->has_element_in_specific_scope(
+ $tag_name,
+ array(
+ 'HTML',
+ 'TABLE',
+ 'TEMPLATE',
+ )
+ );
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -540,7 +658,16 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * cases where the precalculated value needs to change.
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> switch ( $item->node_name ) {
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case 'APPLET':
</ins><span class="cx" style="display: block; padding: 0 10px"> case 'BUTTON':
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case 'CAPTION':
+ case 'HTML':
+ case 'TABLE':
+ case 'TD':
+ case 'TH':
+ case 'MARQUEE':
+ case 'OBJECT':
+ case 'TEMPLATE':
</ins><span class="cx" style="display: block; padding: 0 10px"> $this->has_p_in_button_scope = false;
</span><span class="cx" style="display: block; padding: 0 10px"> break;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -573,11 +700,17 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * cases where the precalculated value needs to change.
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> switch ( $item->node_name ) {
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case 'APPLET':
</ins><span class="cx" style="display: block; padding: 0 10px"> case 'BUTTON':
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $this->has_p_in_button_scope = $this->has_element_in_button_scope( 'P' );
- break;
-
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case 'CAPTION':
+ case 'HTML':
</ins><span class="cx" style="display: block; padding: 0 10px"> case 'P':
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case 'TABLE':
+ case 'TD':
+ case 'TH':
+ case 'MARQUEE':
+ case 'OBJECT':
+ case 'TEMPLATE':
</ins><span class="cx" style="display: block; padding: 0 10px"> $this->has_p_in_button_scope = $this->has_element_in_button_scope( 'P' );
</span><span class="cx" style="display: block; padding: 0 10px"> break;
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span></span></pre></div>
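<p>The scope checks above share one walk over the stack of open elements and differ only in their termination lists. The following is a rough sketch of that idea, assuming a stack of plain tag names rather than <code>WP_HTML_Token</code> objects; the function <code>sketch_has_element_in_scope()</code> is hypothetical and only mirrors the order of checks visible in this diff (match first, then termination).</p>

```php
<?php
/**
 * Hypothetical helper, not part of the HTML API.
 *
 * Walks a stack of tag names from the most recently opened element
 * toward the root. Returns true if the target is found before any
 * element in the termination list is reached.
 */
function sketch_has_element_in_scope( array $open_elements, string $tag_name, array $termination_list ): bool {
	for ( $i = count( $open_elements ) - 1; $i >= 0; $i-- ) {
		$node_name = $open_elements[ $i ];

		if ( $node_name === $tag_name ) {
			return true;
		}

		if ( in_array( $node_name, $termination_list, true ) ) {
			return false;
		}
	}

	return false;
}

// Termination lists copied from the HTML-namespace entries in this changeset.
$button_scope = array( 'APPLET', 'BUTTON', 'CAPTION', 'HTML', 'TABLE', 'TD', 'TH', 'MARQUEE', 'OBJECT', 'TEMPLATE' );
$table_scope  = array( 'HTML', 'TABLE', 'TEMPLATE' );

// Oldest element first, most recently opened element last.
$stack = array( 'HTML', 'BODY', 'P', 'BUTTON', 'SPAN' );

var_export( sketch_has_element_in_scope( $stack, 'P', $button_scope ) ); // prints: false
echo "\n";
var_export( sketch_has_element_in_scope( $stack, 'P', $table_scope ) );  // prints: true
echo "\n";
```

<p>In this sketch the BUTTON sits between SPAN and P, so P is not in button scope, while the shorter table-scope list lets the walk continue until it reaches P.</p>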
<a id="trunksrcwpincludeshtmlapiclasswphtmlprocessorstatephp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/src/wp-includes/html-api/class-wp-html-processor-state.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/src/wp-includes/html-api/class-wp-html-processor-state.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/src/wp-includes/html-api/class-wp-html-processor-state.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -312,6 +312,31 @@
</span><span class="cx" style="display: block; padding: 0 10px"> const INSERTION_MODE_IN_FOREIGN_CONTENT = 'insertion-mode-in-foreign-content';
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * No-quirks mode document compatibility mode.
+ *
+ * > In no-quirks mode, the behavior is (hopefully) the desired behavior
+ * > described by the modern HTML and CSS specifications.
+ *
+ * @since 6.7.0
+ *
+ * @var string
+ */
+ const NO_QUIRKS_MODE = 'no-quirks-mode';
+
+ /**
+ * Quirks mode document compatibility mode.
+ *
+ * > In quirks mode, layout emulates behavior in Navigator 4 and Internet
+ * > Explorer 5. This is essential in order to support websites that were
+ * > built before the widespread adoption of web standards.
+ *
+ * @since 6.7.0
+ *
+ * @var string
+ */
+ const QUIRKS_MODE = 'quirks-mode';
+
+ /**
</ins><span class="cx" style="display: block; padding: 0 10px"> * The stack of template insertion modes.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @since 6.7.0
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -369,6 +394,30 @@
</span><span class="cx" style="display: block; padding: 0 10px"> public $insertion_mode = self::INSERTION_MODE_INITIAL;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * Indicates if the document is in quirks mode or no-quirks mode.
+ *
+ * Impact on HTML parsing:
+ *
+ * - In `NO_QUIRKS_MODE` CSS class and ID selectors match in a byte-for-byte
+ * manner, otherwise for backwards compatibility, class selectors are to
+ * match in an ASCII case-insensitive manner.
+ *
+ * - When not in `QUIRKS_MODE`, a TABLE start tag implicitly closes an open P tag
+ * if one is in scope and open, otherwise the TABLE becomes a child of the P.
+ *
+ * `QUIRKS_MODE` impacts many styling-related aspects of an HTML document, but
+ * none of the other changes modifies how the HTML is parsed or selected.
+ *
+ * @see self::QUIRKS_MODE
+ * @see self::NO_QUIRKS_MODE
+ *
+ * @since 6.7.0
+ *
+ * @var string
+ */
+ public $document_mode = self::NO_QUIRKS_MODE;
+
+ /**
</ins><span class="cx" style="display: block; padding: 0 10px"> * Context node initializing fragment parser, if created as a fragment parser.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @since 6.4.0
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -391,6 +440,24 @@
</span><span class="cx" style="display: block; padding: 0 10px"> public $head_element = null;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * FORM element pointer.
+ *
+ * > points to the last form element that was opened and whose end tag has
+ * > not yet been seen. It is used to make form controls associate with
+ * > forms in the face of dramatically bad markup, for historical reasons.
+ * > It is ignored inside template elements.
+ *
+ * @todo This may be invalidated by a seek operation.
+ *
+ * @see https://html.spec.whatwg.org/#form-element-pointer
+ *
+ * @since 6.7.0
+ *
+ * @var WP_HTML_Token|null
+ */
+ public $form_element = null;
+
+ /**
</ins><span class="cx" style="display: block; padding: 0 10px"> * The frameset-ok flag indicates if a `FRAMESET` element is allowed in the current state.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * > The frameset-ok flag is set to "ok" when the parser is created. It is set to "not ok" after certain tokens are seen.
</span></span></pre></div>
<a id="trunksrcwpincludeshtmlapiclasswphtmlprocessorphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/src/wp-includes/html-api/class-wp-html-processor.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/src/wp-includes/html-api/class-wp-html-processor.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/src/wp-includes/html-api/class-wp-html-processor.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -97,22 +97,11 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * will abort early and stop all processing. This draconian measure ensures
</span><span class="cx" style="display: block; padding: 0 10px"> * that the HTML Processor won't break any HTML it doesn't fully understand.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * The following list specifies the HTML tags that _are_ supported:
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * The HTML Processor supports all elements other than a specific set:
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * - Containers: ADDRESS, BLOCKQUOTE, DETAILS, DIALOG, DIV, FOOTER, HEADER, MAIN, MENU, SPAN, SUMMARY.
- * - Custom elements: All custom elements are supported. :)
- * - Form elements: BUTTON, DATALIST, FIELDSET, INPUT, LABEL, LEGEND, METER, OPTGROUP, OPTION, PROGRESS, SEARCH, SELECT.
- * - Formatting elements: B, BIG, CODE, EM, FONT, I, PRE, SMALL, STRIKE, STRONG, TT, U, WBR.
- * - Heading elements: H1, H2, H3, H4, H5, H6, HGROUP.
- * - Links: A.
- * - Lists: DD, DL, DT, LI, OL, UL.
- * - Media elements: AUDIO, CANVAS, EMBED, FIGCAPTION, FIGURE, IMG, MAP, PICTURE, SOURCE, TRACK, VIDEO.
- * - Paragraph: BR, P.
- * - Phrasing elements: ABBR, AREA, BDI, BDO, CITE, DATA, DEL, DFN, INS, MARK, OUTPUT, Q, SAMP, SUB, SUP, TIME, VAR.
- * - Sectioning elements: ARTICLE, ASIDE, HR, NAV, SECTION.
- * - Templating elements: SLOT.
- * - Text decoration: RUBY.
- * - Deprecated elements: ACRONYM, BLINK, CENTER, DIR, ISINDEX, KEYGEN, LISTING, MULTICOL, NEXTID, PARAM, SPACER.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * - Any element inside a TABLE.
+ * - Any element inside foreign content, including SVG and MATH.
+ * - Any element outside the IN BODY insertion mode, e.g. doctype declarations, meta, links.
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * ### Supported markup
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -121,16 +110,31 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * may in fact belong _before_ the table in the DOM. If the HTML Processor encounters
</span><span class="cx" style="display: block; padding: 0 10px"> * such a case it will stop processing.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * The following list specifies HTML markup that _is_ supported:
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * The following list illustrates some common examples of unexpected HTML inputs that
+ * the HTML Processor properly parses and represents:
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * - Markup involving only those tags listed above.
- * - Fully-balanced and non-overlapping tags.
- * - HTML with unexpected tag closers.
- * - Some unbalanced or overlapping tags.
- * - P tags after unclosed P tags.
- * - BUTTON tags after unclosed BUTTON tags.
- * - A tags after unclosed A tags that don't involve any active formatting elements.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * - HTML with optional tags omitted, e.g. `<p>one<p>two`.
+ * - HTML with unexpected tag closers, e.g. `<p>one </span> more</p>`.
+ * - Non-void tags with a self-closing flag, e.g. `<div/>the DIV is still open.</div>`.
+ * - Heading elements which close open heading elements of another level, e.g. `<h1>Closed by </h2>`.
+ * - Elements containing text that looks like other tags but isn't, e.g. `<title>The <img> is plaintext</title>`.
+ * - SCRIPT and STYLE tags containing text that looks like HTML but isn't, e.g. `<script>document.write('<p>Hi</p>');</script>`.
+ * - SCRIPT content which has been escaped, e.g. `<script><!-- document.write('<script>console.log("hi")</script>') --></script>`.
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * ### Unsupported Features
+ *
+ * This parser does not report parse errors.
+ *
+ * Normally, when additional HTML or BODY tags are encountered in a document, if there
+ * are any additional attributes on them that aren't found on the previous elements,
+ * the existing HTML and BODY elements adopt those missing attribute values. This
+ * parser does not add those additional attributes.
+ *
+ * In certain situations, elements are moved to a different part of the document in
+ * processes called "adoption" and "fostering." Because the nodes move to a location
+ * in the document that the parser had already processed, this parser does not support
+ * these situations and will bail.
+ *
</ins><span class="cx" style="display: block; padding: 0 10px"> * @since 6.4.0
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @see WP_HTML_Tag_Processor
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1104,15 +1108,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> $op = "{$op_sigil}{$token_name}";
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> switch ( $op ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- case '#comment':
- case '#funky-comment':
- case '#presumptuous-tag':
- $this->insert_html_element( $this->state->current_token );
- return true;
-
</del><span class="cx" style="display: block; padding: 0 10px"> case '#text':
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $this->reconstruct_active_formatting_elements();
-
</del><span class="cx" style="display: block; padding: 0 10px"> $current_token = $this->bookmarks[ $this->state->current_token->bookmark_name ];
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1133,6 +1129,8 @@
</span><span class="cx" style="display: block; padding: 0 10px"> return $this->step();
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ $this->reconstruct_active_formatting_elements();
+
</ins><span class="cx" style="display: block; padding: 0 10px"> /*
</span><span class="cx" style="display: block; padding: 0 10px"> * Whitespace-only text does not affect the frameset-ok flag.
</span><span class="cx" style="display: block; padding: 0 10px"> * It is probably inter-element whitespace, but it may also
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1146,30 +1144,147 @@
</span><span class="cx" style="display: block; padding: 0 10px"> $this->insert_html_element( $this->state->current_token );
</span><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case '#comment':
+ case '#funky-comment':
+ case '#presumptuous-tag':
+ $this->insert_html_element( $this->state->current_token );
+ return true;
+
+ /*
+ * > A DOCTYPE token
+ * > Parse error. Ignore the token.
+ */
</ins><span class="cx" style="display: block; padding: 0 10px"> case 'html':
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ return $this->step();
+
+ /*
+ * > A start tag whose tag name is "html"
+ */
+ case '+HTML':
+ if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
+ /*
+ * > Otherwise, for each attribute on the token, check to see if the attribute
+ * > is already present on the top element of the stack of open elements. If
+ * > it is not, add the attribute and its corresponding value to that element.
+ *
+ * This parser does not currently support this behavior: ignore the token.
+ */
+ }
+
+ // Ignore the token.
+ return $this->step();
+
+ /*
+ * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link",
+ * > "meta", "noframes", "script", "style", "template", "title"
+ * >
+ * > An end tag whose tag name is "template"
+ */
+ case '+BASE':
+ case '+BASEFONT':
+ case '+BGSOUND':
+ case '+LINK':
+ case '+META':
+ case '+NOFRAMES':
+ case '+SCRIPT':
+ case '+STYLE':
+ case '+TEMPLATE':
+ case '+TITLE':
+ case '-TEMPLATE':
+ return $this->step_in_head();
+
+ /*
+ * > A start tag whose tag name is "body"
+ *
+ * This tag in the IN BODY insertion mode is a parse error.
+ */
+ case '+BODY':
+ if (
+ 1 === $this->state->stack_of_open_elements->count() ||
+ 'BODY' !== $this->state->stack_of_open_elements->at( 2 ) ||
+ $this->state->stack_of_open_elements->contains( 'TEMPLATE' )
+ ) {
+ // Ignore the token.
+ return $this->step();
+ }
+
</ins><span class="cx" style="display: block; padding: 0 10px"> /*
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * > A DOCTYPE token
- * > Parse error. Ignore the token.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > Otherwise, set the frameset-ok flag to "not ok"; then, for each attribute
+ * > on the token, check to see if the attribute is already present on the body
+ * > element (the second element) on the stack of open elements, and if it is
+ * > not, add the attribute and its corresponding value to that element.
+ *
+ * This parser does not currently support this behavior: ignore the token.
</ins><span class="cx" style="display: block; padding: 0 10px"> */
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ $this->state->frameset_ok = false;
</ins><span class="cx" style="display: block; padding: 0 10px"> return $this->step();
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * > A start tag whose tag name is "button"
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > A start tag whose tag name is "frameset"
+ *
+ * This tag in the IN BODY insertion mode is a parse error.
</ins><span class="cx" style="display: block; padding: 0 10px"> */
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- case '+BUTTON':
- if ( $this->state->stack_of_open_elements->has_element_in_scope( 'BUTTON' ) ) {
- // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
- $this->generate_implied_end_tags();
- $this->state->stack_of_open_elements->pop_until( 'BUTTON' );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case '+FRAMESET':
+ if (
+ 1 === $this->state->stack_of_open_elements->count() ||
+ 'BODY' !== $this->state->stack_of_open_elements->at( 2 ) ||
+ false === $this->state->frameset_ok
+ ) {
+ // Ignore the token.
+ return $this->step();
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $this->reconstruct_active_formatting_elements();
- $this->insert_html_element( $this->state->current_token );
- $this->state->frameset_ok = false;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ /*
+ * > Otherwise, run the following steps:
+ */
+ $this->bail( 'Cannot process non-ignored FRAMESET tags.' );
+ break;
</ins><span class="cx" style="display: block; padding: 0 10px">
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ /*
+ * > An end tag whose tag name is "body"
+ */
+ case '-BODY':
+ if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
+ // Parse error: ignore the token.
+ return $this->step();
+ }
+
+ /*
+ * > Otherwise, if there is a node in the stack of open elements that is not either a
+ * > dd element, a dt element, an li element, an optgroup element, an option element,
+ * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
+ * > element, a td element, a tfoot element, a th element, a thead element, a tr
+ * > element, the body element, or the html element, then this is a parse error.
+ *
+ * There is nothing to do for this parse error, so don't check for it.
+ */
+
+ $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
</ins><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > An end tag whose tag name is "html"
+ */
+ case '-HTML':
+ if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
+ // Parse error: ignore the token.
+ return $this->step();
+ }
+
+ /*
+ * > Otherwise, if there is a node in the stack of open elements that is not either a
+ * > dd element, a dt element, an li element, an optgroup element, an option element,
+ * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
+ * > element, a td element, a tfoot element, a th element, a thead element, a tr
+ * > element, the body element, or the html element, then this is a parse error.
+ *
+ * There is nothing to do for this parse error, so don't check for it.
+ */
+
+ $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
+ return $this->step( self::REPROCESS_CURRENT_NODE );
+
+ /*
</ins><span class="cx" style="display: block; padding: 0 10px"> * > A start tag whose tag name is one of: "address", "article", "aside",
</span><span class="cx" style="display: block; padding: 0 10px"> * > "blockquote", "center", "details", "dialog", "dir", "div", "dl",
</span><span class="cx" style="display: block; padding: 0 10px"> * > "fieldset", "figcaption", "figure", "footer", "header", "hgroup",
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1208,52 +1323,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * > An end tag whose tag name is one of: "address", "article", "aside", "blockquote",
- * > "button", "center", "details", "dialog", "dir", "div", "dl", "fieldset",
- * > "figcaption", "figure", "footer", "header", "hgroup", "listing", "main",
- * > "menu", "nav", "ol", "pre", "search", "section", "summary", "ul"
- */
- case '-ADDRESS':
- case '-ARTICLE':
- case '-ASIDE':
- case '-BLOCKQUOTE':
- case '-BUTTON':
- case '-CENTER':
- case '-DETAILS':
- case '-DIALOG':
- case '-DIR':
- case '-DIV':
- case '-DL':
- case '-FIELDSET':
- case '-FIGCAPTION':
- case '-FIGURE':
- case '-FOOTER':
- case '-HEADER':
- case '-HGROUP':
- case '-LISTING':
- case '-MAIN':
- case '-MENU':
- case '-NAV':
- case '-OL':
- case '-PRE':
- case '-SEARCH':
- case '-SECTION':
- case '-SUMMARY':
- case '-UL':
- if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
- // @todo Report parse error.
- // Ignore the token.
- return $this->step();
- }
-
- $this->generate_implied_end_tags();
- if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
- // @todo Record parse error: this error doesn't impact parsing.
- }
- $this->state->stack_of_open_elements->pop_until( $token_name );
- return true;
-
- /*
</del><span class="cx" style="display: block; padding: 0 10px"> * > A start tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> case '+H1':
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1288,35 +1357,39 @@
</span><span class="cx" style="display: block; padding: 0 10px"> if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
</span><span class="cx" style="display: block; padding: 0 10px"> $this->close_a_p_element();
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+
+ /*
+ * > If the next token is a U+000A LINE FEED (LF) character token,
+ * > then ignore that token and move on to the next one. (Newlines
+ * > at the start of pre blocks are ignored as an authoring convenience.)
+ *
+ * This is handled in `get_modifiable_text()`.
+ */
+
</ins><span class="cx" style="display: block; padding: 0 10px"> $this->insert_html_element( $this->state->current_token );
</span><span class="cx" style="display: block; padding: 0 10px"> $this->state->frameset_ok = false;
</span><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * > An end tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > A start tag whose tag name is "form"
</ins><span class="cx" style="display: block; padding: 0 10px"> */
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- case '-H1':
- case '-H2':
- case '-H3':
- case '-H4':
- case '-H5':
- case '-H6':
- if ( ! $this->state->stack_of_open_elements->has_element_in_scope( '(internal: H1 through H6 - do not use)' ) ) {
- /*
- * This is a parse error; ignore the token.
- *
- * @todo Indicate a parse error once it's possible.
- */
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case '+FORM':
+ $stack_contains_template = $this->state->stack_of_open_elements->contains( 'TEMPLATE' );
+
+ if ( isset( $this->state->form_element ) && ! $stack_contains_template ) {
+ // Parse error: ignore the token.
</ins><span class="cx" style="display: block; padding: 0 10px"> return $this->step();
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $this->generate_implied_end_tags();
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
+ $this->close_a_p_element();
+ }
</ins><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
- // @todo Record parse error: this error doesn't impact parsing.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ $this->insert_html_element( $this->state->current_token );
+ if ( ! $stack_contains_template ) {
+ $this->state->form_element = $this->state->current_token;
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $this->state->stack_of_open_elements->pop_until( '(internal: H1 through H6 - do not use)' );
</del><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1377,7 +1450,151 @@
</span><span class="cx" style="display: block; padding: 0 10px"> $this->insert_html_element( $this->state->current_token );
</span><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case '+PLAINTEXT':
+ if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
+ $this->close_a_p_element();
+ }
+
+ /*
+ * @todo This may need to be handled in the Tag Processor and turn into
+ * a single self-contained tag like TEXTAREA, whose modifiable text
+ * is the rest of the input document as plaintext.
+ */
+ $this->bail( 'Cannot process PLAINTEXT elements.' );
+ break;
+
</ins><span class="cx" style="display: block; padding: 0 10px"> /*
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > A start tag whose tag name is "button"
+ */
+ case '+BUTTON':
+ if ( $this->state->stack_of_open_elements->has_element_in_scope( 'BUTTON' ) ) {
+ // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
+ $this->generate_implied_end_tags();
+ $this->state->stack_of_open_elements->pop_until( 'BUTTON' );
+ }
+
+ $this->reconstruct_active_formatting_elements();
+ $this->insert_html_element( $this->state->current_token );
+ $this->state->frameset_ok = false;
+
+ return true;
+
+ /*
+ * > An end tag whose tag name is one of: "address", "article", "aside", "blockquote",
+ * > "button", "center", "details", "dialog", "dir", "div", "dl", "fieldset",
+ * > "figcaption", "figure", "footer", "header", "hgroup", "listing", "main",
+ * > "menu", "nav", "ol", "pre", "search", "section", "summary", "ul"
+ *
+ * @todo This needs to check if the element in scope is an HTML element, meaning that
+ * when SVG and MathML support is added, this needs to differentiate between an
+ * HTML element of the given name, such as `<center>`, and a foreign element of
+ * the same given name.
+ */
+ case '-ADDRESS':
+ case '-ARTICLE':
+ case '-ASIDE':
+ case '-BLOCKQUOTE':
+ case '-BUTTON':
+ case '-CENTER':
+ case '-DETAILS':
+ case '-DIALOG':
+ case '-DIR':
+ case '-DIV':
+ case '-DL':
+ case '-FIELDSET':
+ case '-FIGCAPTION':
+ case '-FIGURE':
+ case '-FOOTER':
+ case '-HEADER':
+ case '-HGROUP':
+ case '-LISTING':
+ case '-MAIN':
+ case '-MENU':
+ case '-NAV':
+ case '-OL':
+ case '-PRE':
+ case '-SEARCH':
+ case '-SECTION':
+ case '-SUMMARY':
+ case '-UL':
+ if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
+ // @todo Report parse error.
+ // Ignore the token.
+ return $this->step();
+ }
+
+ $this->generate_implied_end_tags();
+ if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
+ // @todo Record parse error: this error doesn't impact parsing.
+ }
+ $this->state->stack_of_open_elements->pop_until( $token_name );
+ return true;
+
+ /*
+ * > An end tag whose tag name is "form"
+ */
+ case '-FORM':
+ if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
+ $node = $this->state->form_element;
+ $this->state->form_element = null;
+
+ /*
+ * > If node is null or if the stack of open elements does not have node
+ * > in scope, then this is a parse error; return and ignore the token.
+ *
+ * @todo It's necessary to check if the form token itself is in scope, not
+ * simply whether any FORM is in scope.
+ */
+ if (
+ null === $node ||
+ ! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' )
+ ) {
+ // Parse error: ignore the token.
+ return $this->step();
+ }
+
+ $this->generate_implied_end_tags();
+ if ( $node !== $this->state->stack_of_open_elements->current_node() ) {
+ // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
+ $this->bail( 'Cannot close a FORM when other elements remain open as this would throw off the breadcrumbs for the following tokens.' );
+ }
+
+ $this->state->stack_of_open_elements->remove_node( $node );
+ } else {
+ /*
+ * > If the stack of open elements does not have a form element in scope,
+ * > then this is a parse error; return and ignore the token.
+ *
+ * Note that unlike in the clause above, this is checking for any FORM in scope.
+ */
+ if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' ) ) {
+ // Parse error: ignore the token.
+ return $this->step();
+ }
+
+ $this->generate_implied_end_tags();
+
+ if ( ! $this->state->stack_of_open_elements->current_node_is( 'FORM' ) ) {
+ // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
+ }
+
+ $this->state->stack_of_open_elements->pop_until( 'FORM' );
+ return true;
+ }
+ break;
+
+ /*
+ * > An end tag whose tag name is "p"
+ */
+ case '-P':
+ if ( ! $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
+ $this->insert_html_element( $this->state->current_token );
+ }
+
+ $this->close_a_p_element();
+ return true;
+
+ /*
</ins><span class="cx" style="display: block; padding: 0 10px"> * > An end tag whose tag name is "li"
</span><span class="cx" style="display: block; padding: 0 10px"> * > An end tag whose tag name is one of: "dd", "dt"
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1423,17 +1640,35 @@
</span><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * > An end tag whose tag name is "p"
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > An end tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
</ins><span class="cx" style="display: block; padding: 0 10px"> */
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- case '-P':
- if ( ! $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
- $this->insert_html_element( $this->state->current_token );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case '-H1':
+ case '-H2':
+ case '-H3':
+ case '-H4':
+ case '-H5':
+ case '-H6':
+ if ( ! $this->state->stack_of_open_elements->has_element_in_scope( '(internal: H1 through H6 - do not use)' ) ) {
+ /*
+ * This is a parse error; ignore the token.
+ *
+ * @todo Indicate a parse error once it's possible.
+ */
+ return $this->step();
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $this->close_a_p_element();
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ $this->generate_implied_end_tags();
+
+ if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
+ // @todo Record parse error: this error doesn't impact parsing.
+ }
+
+ $this->state->stack_of_open_elements->pop_until( '(internal: H1 through H6 - do not use)' );
</ins><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- // > A start tag whose tag name is "a"
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ /*
+ * > A start tag whose tag name is "a"
+ */
</ins><span class="cx" style="display: block; padding: 0 10px"> case '+A':
</span><span class="cx" style="display: block; padding: 0 10px"> foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
</span><span class="cx" style="display: block; padding: 0 10px"> switch ( $item->node_name ) {
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1475,6 +1710,22 @@
</span><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > A start tag whose tag name is "nobr"
+ */
+ case '+NOBR':
+ $this->reconstruct_active_formatting_elements();
+
+ if ( $this->state->stack_of_open_elements->has_element_in_scope( 'NOBR' ) ) {
+ // Parse error.
+ $this->run_adoption_agency_algorithm();
+ $this->reconstruct_active_formatting_elements();
+ }
+
+ $this->insert_html_element( $this->state->current_token );
+ $this->state->active_formatting_elements->push( $this->state->current_token );
+ return true;
+
+ /*
</ins><span class="cx" style="display: block; padding: 0 10px"> * > An end tag whose tag name is one of: "a", "b", "big", "code", "em", "font", "i",
</span><span class="cx" style="display: block; padding: 0 10px"> * > "nobr", "s", "small", "strike", "strong", "tt", "u"
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1495,15 +1746,64 @@
</span><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > A start tag whose tag name is one of: "applet", "marquee", "object"
+ */
+ case '+APPLET':
+ case '+MARQUEE':
+ case '+OBJECT':
+ $this->reconstruct_active_formatting_elements();
+ $this->insert_html_element( $this->state->current_token );
+ $this->state->active_formatting_elements->insert_marker();
+ $this->state->frameset_ok = false;
+ return true;
+
+ /*
+ * > An end tag token whose tag name is one of: "applet", "marquee", "object"
+ *
+ * @todo This needs to check if the element in scope is an HTML element, meaning that
+ * when SVG and MathML support is added, this needs to differentiate between an
+ * HTML element of the given name, such as `<object>`, and a foreign element of
+ * the same given name.
+ */
+ case '-APPLET':
+ case '-MARQUEE':
+ case '-OBJECT':
+ if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
+ // Parse error: ignore the token.
+ return $this->step();
+ }
+
+ $this->generate_implied_end_tags();
+ if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
+ // This is a parse error.
+ }
+
+ $this->state->stack_of_open_elements->pop_until( $token_name );
+ $this->state->active_formatting_elements->clear_up_to_last_marker();
+ return true;
+
+ /*
+ * > A start tag whose tag name is "table"
+ */
+ case '+TABLE':
+ if (
+ WP_HTML_Processor_State::QUIRKS_MODE !== $this->state->document_mode &&
+ $this->state->stack_of_open_elements->has_p_in_button_scope()
+ ) {
+ $this->close_a_p_element();
+ }
+
+ $this->insert_html_element( $this->state->current_token );
+ $this->state->frameset_ok = false;
+ $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
+ return true;
+
+ /*
</ins><span class="cx" style="display: block; padding: 0 10px"> * > An end tag whose tag name is "br"
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * > Parse error. Drop the attributes from the token, and act as described in the next
- * > entry; i.e. act as if this was a "br" start tag token with no attributes, rather
- * > than the end tag token that it actually is.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ *
+ * This is prevented from happening because the Tag Processor
+ * reports all closing BR tags as if they were opening tags.
</ins><span class="cx" style="display: block; padding: 0 10px"> */
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- case '-BR':
- $this->bail( 'Closing BR tags require unimplemented special handling.' );
- // This return required because PHPCS can't determine that the call to bail() throws.
- return false;
</del><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><span class="cx" style="display: block; padding: 0 10px"> * > A start tag whose tag name is one of: "area", "br", "embed", "img", "keygen", "wbr"
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1525,18 +1825,29 @@
</span><span class="cx" style="display: block; padding: 0 10px"> case '+INPUT':
</span><span class="cx" style="display: block; padding: 0 10px"> $this->reconstruct_active_formatting_elements();
</span><span class="cx" style="display: block; padding: 0 10px"> $this->insert_html_element( $this->state->current_token );
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $type_attribute = $this->get_attribute( 'type' );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+
</ins><span class="cx" style="display: block; padding: 0 10px"> /*
</span><span class="cx" style="display: block; padding: 0 10px"> * > If the token does not have an attribute with the name "type", or if it does,
</span><span class="cx" style="display: block; padding: 0 10px"> * > but that attribute's value is not an ASCII case-insensitive match for the
</span><span class="cx" style="display: block; padding: 0 10px"> * > string "hidden", then: set the frameset-ok flag to "not ok".
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ $type_attribute = $this->get_attribute( 'type' );
</ins><span class="cx" style="display: block; padding: 0 10px"> if ( ! is_string( $type_attribute ) || 'hidden' !== strtolower( $type_attribute ) ) {
</span><span class="cx" style="display: block; padding: 0 10px"> $this->state->frameset_ok = false;
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+
</ins><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > A start tag whose tag name is one of: "param", "source", "track"
+ */
+ case '+PARAM':
+ case '+SOURCE':
+ case '+TRACK':
+ $this->insert_html_element( $this->state->current_token );
+ return true;
+
+ /*
</ins><span class="cx" style="display: block; padding: 0 10px"> * > A start tag whose tag name is "hr"
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> case '+HR':
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1548,15 +1859,84 @@
</span><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * > A start tag whose tag name is one of: "param", "source", "track"
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > A start tag whose tag name is "image"
</ins><span class="cx" style="display: block; padding: 0 10px"> */
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- case '+PARAM':
- case '+SOURCE':
- case '+TRACK':
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case '+IMAGE':
+ /*
+ * > Parse error. Change the token's tag name to "img" and reprocess it. (Don't ask.)
+ *
+ * Note that this is handled elsewhere, so it should not be possible to reach this code.
+ */
+ $this->bail( "Cannot process an IMAGE tag. (Don't ask.)" );
+ break;
+
+ /*
+ * > A start tag whose tag name is "textarea"
+ */
+ case '+TEXTAREA':
</ins><span class="cx" style="display: block; padding: 0 10px"> $this->insert_html_element( $this->state->current_token );
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+
+ /*
+ * > If the next token is a U+000A LINE FEED (LF) character token, then ignore
+ * > that token and move on to the next one. (Newlines at the start of
+ * > textarea elements are ignored as an authoring convenience.)
+ *
+ * This is handled in `get_modifiable_text()`.
+ */
+
+ $this->state->frameset_ok = false;
+
+ /*
+ * > Switch the insertion mode to "text".
+ *
+ * As a self-contained node, this behavior is handled in the Tag Processor.
+ */
</ins><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * > A start tag whose tag name is "xmp"
+ */
+ case '+XMP':
+ if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
+ $this->close_a_p_element();
+ }
+
+ $this->reconstruct_active_formatting_elements();
+ $this->state->frameset_ok = false;
+
+ /*
+ * > Follow the generic raw text element parsing algorithm.
+ *
+ * As a self-contained node, this behavior is handled in the Tag Processor.
+ */
+ $this->insert_html_element( $this->state->current_token );
+ return true;
+
+ /*
+ * > A start tag whose tag name is "iframe"
+ */
+ case '+IFRAME':
+ $this->state->frameset_ok = false;
+
+ /*
+ * > Follow the generic raw text element parsing algorithm.
+ *
+ * As a self-contained node, this behavior is handled in the Tag Processor.
+ */
+ $this->insert_html_element( $this->state->current_token );
+ return true;
+
+ /*
+ * > A start tag whose tag name is "noembed"
+ * > A start tag whose tag name is "noscript", if the scripting flag is enabled
+ *
+ * The scripting flag is never enabled in this parser.
+ */
+ case '+NOEMBED':
+ $this->insert_html_element( $this->state->current_token );
+ return true;
+
+ /*
</ins><span class="cx" style="display: block; padding: 0 10px"> * > A start tag whose tag name is "select"
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> case '+SELECT':
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1597,69 +1977,89 @@
</span><span class="cx" style="display: block; padding: 0 10px"> $this->reconstruct_active_formatting_elements();
</span><span class="cx" style="display: block; padding: 0 10px"> $this->insert_html_element( $this->state->current_token );
</span><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- }
</del><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- /*
- * These tags require special handling in the 'in body' insertion mode
- * but that handling hasn't yet been implemented.
- *
- * As the rules for each tag are implemented, the corresponding tag
- * name should be removed from this list. An accompanying test should
- * help ensure this list is maintained.
- *
- * @see Tests_HtmlApi_WpHtmlProcessor::test_step_in_body_fails_on_unsupported_tags
- *
- * Since this switch structure throws a WP_HTML_Unsupported_Exception, it's
- * possible to handle "any other start tag" and "any other end tag" below,
- * as that guarantees execution doesn't proceed for the unimplemented tags.
- *
- * @see https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody
- */
- switch ( $token_name ) {
- case 'APPLET':
- case 'BASE':
- case 'BASEFONT':
- case 'BGSOUND':
- case 'BODY':
- case 'CAPTION':
- case 'COL':
- case 'COLGROUP':
- case 'FORM':
- case 'FRAME':
- case 'FRAMESET':
- case 'HEAD':
- case 'HTML':
- case 'IFRAME':
- case 'LINK':
- case 'MARQUEE':
- case 'MATH':
- case 'META':
- case 'NOBR':
- case 'NOEMBED':
- case 'NOFRAMES':
- case 'NOSCRIPT':
- case 'OBJECT':
- case 'PLAINTEXT':
- case 'RB':
- case 'RP':
- case 'RT':
- case 'RTC':
- case 'SARCASM':
- case 'SCRIPT':
- case 'STYLE':
- case 'SVG':
- case 'TABLE':
- case 'TBODY':
- case 'TD':
- case 'TEMPLATE':
- case 'TEXTAREA':
- case 'TFOOT':
- case 'TH':
- case 'THEAD':
- case 'TITLE':
- case 'TR':
- case 'XMP':
- $this->bail( "Cannot process {$token_name} element." );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ /*
+ * > A start tag whose tag name is one of: "rb", "rtc"
+ */
+ case '+RB':
+ case '+RTC':
+ if ( $this->state->stack_of_open_elements->has_element_in_scope( 'RUBY' ) ) {
+ $this->generate_implied_end_tags();
+
+ if ( $this->state->stack_of_open_elements->current_node_is( 'RUBY' ) ) {
+ // @todo Indicate a parse error once it's possible.
+ }
+ }
+
+ $this->insert_html_element( $this->state->current_token );
+ return true;
+
+ /*
+ * > A start tag whose tag name is one of: "rp", "rt"
+ */
+ case '+RP':
+ case '+RT':
+ if ( $this->state->stack_of_open_elements->has_element_in_scope( 'RUBY' ) ) {
+ $this->generate_implied_end_tags( 'RTC' );
+
+ $current_node_name = $this->state->stack_of_open_elements->current_node()->node_name;
+ if ( 'RTC' === $current_node_name || 'RUBY' === $current_node_name ) {
+ // @todo Indicate a parse error once it's possible.
+ }
+ }
+
+ $this->insert_html_element( $this->state->current_token );
+ return true;
+
+ /*
+ * > A start tag whose tag name is "math"
+ */
+ case '+MATH':
+ $this->reconstruct_active_formatting_elements();
+
+ /*
+ * @todo Adjust MathML attributes for the token. (This fixes the case of MathML attributes that are not all lowercase.)
+ * @todo Adjust foreign attributes for the token. (This fixes the use of namespaced attributes, in particular XLink.)
+ *
+ * These ought to be handled in the attribute methods.
+ */
+
+ $this->bail( 'Cannot process MATH element, opening foreign content.' );
+ break;
+
+ /*
+ * > A start tag whose tag name is "svg"
+ */
+ case '+SVG':
+ $this->reconstruct_active_formatting_elements();
+
+ /*
+ * @todo Adjust SVG attributes for the token. (This fixes the case of SVG attributes that are not all lowercase.)
+ * @todo Adjust foreign attributes for the token. (This fixes the use of namespaced attributes, in particular XLink in SVG.)
+ *
+ * These ought to be handled in the attribute methods.
+ */
+
+ $this->bail( 'Cannot process SVG element, opening foreign content.' );
+ break;
+
+ /*
+ * > A start tag whose tag name is one of: "caption", "col", "colgroup",
+ * > "frame", "head", "tbody", "td", "tfoot", "th", "thead", "tr"
+ */
+ case '+CAPTION':
+ case '+COL':
+ case '+COLGROUP':
+ case '+FRAME':
+ case '+HEAD':
+ case '+TBODY':
+ case '+TD':
+ case '+TFOOT':
+ case '+TH':
+ case '+THEAD':
+ case '+TR':
+ // Parse error. Ignore the token.
+ return $this->step();
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> if ( ! parent::is_tag_closer() ) {
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1681,6 +2081,12 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * close anything beyond its containing `P` or `DIV` element.
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> foreach ( $this->state->stack_of_open_elements->walk_up() as $node ) {
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ /*
+ * @todo This needs to check if the element in scope is an HTML element, meaning that
+ * when SVG and MathML support is added, this needs to differentiate between an
+ * HTML element of the given name, such as `<object>`, and a foreign element of
+ * the same given name.
+ */
</ins><span class="cx" style="display: block; padding: 0 10px"> if ( $token_name === $node->node_name ) {
</span><span class="cx" style="display: block; padding: 0 10px"> break;
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span></span></pre></div>
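The APPLET, MARQUEE, and OBJECT cases above pair `insert_marker()` on the start tag with `clear_up_to_last_marker()` on the end tag. The following is a minimal sketch of that pairing, assuming a plain array in which the lowercase value "marker" stands for a scope boundary. The class name `Formatting_List_Sketch` and its `to_array()` method are hypothetical and are not the WordPress implementation.

```php
<?php
// Hypothetical stand-in for the list of active formatting elements; not the WordPress class.
final class Formatting_List_Sketch {
	/** @var string[] Entries, oldest first; 'marker' denotes a scope boundary. */
	private $entries = array();

	public function push( string $node_name ): void {
		$this->entries[] = $node_name;
	}

	public function insert_marker(): void {
		$this->entries[] = 'marker';
	}

	// Removes entries from the end of the list, up to and including the last marker.
	public function clear_up_to_last_marker(): void {
		while ( count( $this->entries ) > 0 ) {
			if ( 'marker' === array_pop( $this->entries ) ) {
				break;
			}
		}
	}

	public function to_array(): array {
		return $this->entries;
	}
}

$list = new Formatting_List_Sketch();
$list->push( 'B' );               // <b>
$list->insert_marker();           // <object>
$list->push( 'I' );               // <i> inside the OBJECT
$list->clear_up_to_last_marker(); // </object>

var_dump( $list->to_array() === array( 'B' ) ); // bool(true)
```

In this sketch the marker keeps the formatting entry opened inside the OBJECT from surviving past the end tag, while the entry opened before the OBJECT stays in the list.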
<a id="trunksrcwpincludeshtmlapiclasswphtmltagprocessorphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -129,7 +129,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * $processor = new WP_HTML_Tag_Processor( '<style>// this is everything</style><div>' );
</span><span class="cx" style="display: block; padding: 0 10px"> * true === $processor->next_tag( 'DIV' );
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * #### Special elements
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * #### Special self-contained elements
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * Some HTML elements are handled in a special way; their start and end tags
</span><span class="cx" style="display: block; padding: 0 10px"> * act like a void tag. These are special because their contents can't contain
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -756,6 +756,20 @@
</span><span class="cx" style="display: block; padding: 0 10px"> protected $seek_count = 0;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * Whether the parser should skip over an immediately-following linefeed
+ * character, as is the case with LISTING, PRE, and TEXTAREA.
+ *
+ * > If the next token is a U+000A LINE FEED (LF) character token, then
+ * > ignore that token and move on to the next one. (Newlines at the start
+ * > of [these] elements are ignored as an authoring convenience.)
+ *
+ * @since 6.7.0
+ *
+ * @var int|null
+ */
+ private $skip_newline_at = null;
+
+ /**
</ins><span class="cx" style="display: block; padding: 0 10px"> * Constructor.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @since 6.2.0
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -926,20 +940,23 @@
</span><span class="cx" style="display: block; padding: 0 10px"> $this->token_length = $this->bytes_already_parsed - $this->token_starts_at;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * For non-DATA sections which might contain text that looks like HTML tags but
- * isn't, scan with the appropriate alternative mode. Looking at the first letter
- * of the tag name as a pre-check avoids a string allocation when it's not needed.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * Certain tags require additional processing. The first-letter pre-check
+ * avoids unnecessary string allocation when comparing the tag names.
+ *
+ * - IFRAME
+ * - LISTING (deprecated)
+ * - NOEMBED (deprecated)
+ * - NOFRAMES (deprecated)
+ * - PRE
+ * - SCRIPT
+ * - STYLE
+ * - TEXTAREA
+ * - TITLE
+ * - XMP (deprecated)
</ins><span class="cx" style="display: block; padding: 0 10px"> */
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $t = $this->html[ $this->tag_name_starts_at ];
</del><span class="cx" style="display: block; padding: 0 10px"> if (
</span><span class="cx" style="display: block; padding: 0 10px"> $this->is_closing_tag ||
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- ! (
- 'i' === $t || 'I' === $t ||
- 'n' === $t || 'N' === $t ||
- 's' === $t || 'S' === $t ||
- 't' === $t || 'T' === $t ||
- 'x' === $t || 'X' === $t
- )
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 1 !== strspn( $this->html, 'iIlLnNpPsStTxX', $this->tag_name_starts_at, 1 )
</ins><span class="cx" style="display: block; padding: 0 10px"> ) {
</span><span class="cx" style="display: block; padding: 0 10px"> return true;
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -947,6 +964,26 @@
</span><span class="cx" style="display: block; padding: 0 10px"> $tag_name = $this->get_tag();
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * For LISTING, PRE, and TEXTAREA, the first linefeed of an immediately-following
+ * text node is ignored as an authoring convenience.
+ *
+ * @see static::skip_newline_at
+ */
+ if ( 'LISTING' === $tag_name || 'PRE' === $tag_name ) {
+ $this->skip_newline_at = $this->bytes_already_parsed;
+ return true;
+ }
+
+ /*
+ * There are certain elements whose children are not DATA but are instead
+ * RCDATA or RAWTEXT. These cannot contain other elements, and the contents
+ * are parsed as plaintext, with character references decoded in RCDATA but
+ * not in RAWTEXT.
+ *
+ * These elements are described here as "self-contained" or special atomic
+ * elements whose end tag is consumed with the opening tag, and they will
+ * contain modifiable text inside of them.
+ *
</ins><span class="cx" style="display: block; padding: 0 10px"> * Preserve the opening tag pointers, as these will be overwritten
</span><span class="cx" style="display: block; padding: 0 10px"> * when finding the closing tag. They will be reset after finding
</span><span class="cx" style="display: block; padding: 0 10px"> * the closing tag to point to the opening of the special atomic
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2690,6 +2727,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * $p->is_tag_closer() === true;
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @since 6.2.0
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @since 6.7.0 Reports all BR tags as opening tags.
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @return bool Whether the current tag is a tag closer.
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2696,7 +2734,16 @@
</span><span class="cx" style="display: block; padding: 0 10px"> public function is_tag_closer(): bool {
</span><span class="cx" style="display: block; padding: 0 10px"> return (
</span><span class="cx" style="display: block; padding: 0 10px"> self::STATE_MATCHED_TAG === $this->parser_state &&
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $this->is_closing_tag
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ $this->is_closing_tag &&
+
+ /*
+ * The BR tag can only exist as an opening tag. If something like `</br>`
+ * appears then the HTML parser will treat it as an opening tag with no
+ * attributes. The BR tag is unique in this way.
+ *
+ * @see https://html.spec.whatwg.org/#parsing-main-inbody
+ */
+ 'BR' !== $this->get_tag()
</ins><span class="cx" style="display: block; padding: 0 10px"> );
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2825,17 +2872,38 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * that a token has modifiable text, and a token with modifiable text may
</span><span class="cx" style="display: block; padding: 0 10px"> * have an empty string (e.g. a comment with no contents).
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * Limitations:
+ *
+ * - This function will not strip the leading newline appropriately
+ * after seeking into a LISTING or PRE element. To ensure that the
+ * newline is treated properly, seek to the LISTING or PRE opening
+ * tag instead of to the first text node inside the element.
+ *
</ins><span class="cx" style="display: block; padding: 0 10px"> * @since 6.5.0
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @since 6.7.0 Replaces NULL bytes (U+0000) and newlines appropriately.
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @return string
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function get_modifiable_text(): string {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- if ( null === $this->text_starts_at ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ if ( null === $this->text_starts_at || 0 === $this->text_length ) {
</ins><span class="cx" style="display: block; padding: 0 10px"> return '';
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> $text = substr( $this->html, $this->text_starts_at, $this->text_length );
</span><span class="cx" style="display: block; padding: 0 10px">
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ /*
+ * Pre-processing the input stream would normally happen before
+ * any parsing is done, but deferring it means it's possible to
+ * skip in most cases. When getting the modifiable text, however,
+ * it's important to apply the pre-processing step of
+ * normalizing newlines.
+ *
+ * @see https://html.spec.whatwg.org/#preprocessing-the-input-stream
+ * @see https://infra.spec.whatwg.org/#normalize-newlines
+ */
+ $text = str_replace( "\r\n", "\n", $text );
+ $text = str_replace( "\r", "\n", $text );
+
</ins><span class="cx" style="display: block; padding: 0 10px"> // Comment data is not decoded.
</span><span class="cx" style="display: block; padding: 0 10px"> if (
</span><span class="cx" style="display: block; padding: 0 10px"> self::STATE_CDATA_NODE === $this->parser_state ||
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2843,10 +2911,10 @@
</span><span class="cx" style="display: block; padding: 0 10px"> self::STATE_DOCTYPE === $this->parser_state ||
</span><span class="cx" style="display: block; padding: 0 10px"> self::STATE_FUNKY_COMMENT === $this->parser_state
</span><span class="cx" style="display: block; padding: 0 10px"> ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- return $text;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ return str_replace( "\x00", "\u{FFFD}", $text );
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $tag_name = $this->get_tag();
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ $tag_name = $this->get_token_name();
</ins><span class="cx" style="display: block; padding: 0 10px"> if (
</span><span class="cx" style="display: block; padding: 0 10px"> // Script data is not decoded.
</span><span class="cx" style="display: block; padding: 0 10px"> 'SCRIPT' === $tag_name ||
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2858,29 +2926,34 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 'STYLE' === $tag_name ||
</span><span class="cx" style="display: block; padding: 0 10px"> 'XMP' === $tag_name
</span><span class="cx" style="display: block; padding: 0 10px"> ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- return $text;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ return str_replace( "\x00", "\u{FFFD}", $text );
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> $decoded = WP_HTML_Decoder::decode_text_node( $text );
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /*
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * TEXTAREA skips a leading newline, but this newline may appear not only as the
- * literal character `\n`, but also as a character reference, such as in the
- * following markup: `<textarea>&#x0a;Content</textarea>`.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * Skip the first line feed after LISTING, PRE, and TEXTAREA opening tags.
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * For these cases it's important to first decode the text content before checking
- * for a leading newline and removing it.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * Note that this first newline may come in the form of a character
+ * reference, such as `&#x0a;`, and so it's important to perform
+ * this transformation only after decoding the raw text content.
</ins><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> if (
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- self::STATE_MATCHED_TAG === $this->parser_state &&
- 'TEXTAREA' === $tag_name &&
- strlen( $decoded ) > 0 &&
- "\n" === $decoded[0]
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ ( "\n" === ( $decoded[0] ?? '' ) ) &&
+ ( ( $this->skip_newline_at === $this->token_starts_at && '#text' === $tag_name ) || 'TEXTAREA' === $tag_name )
</ins><span class="cx" style="display: block; padding: 0 10px"> ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- return substr( $decoded, 1 );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ $decoded = substr( $decoded, 1 );
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- return $decoded;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ /*
+ * Only in normative text nodes does the NULL byte (U+0000) get removed.
+ * In all other contexts it's replaced by the replacement character (U+FFFD)
+ * for security reasons (to avoid joining together strings that were safe
+ * when separated, but not when joined).
+ */
+ return '#text' === $tag_name
+ ? str_replace( "\x00", '', $decoded )
+ : str_replace( "\x00", "\u{FFFD}", $decoded );
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span></span></pre></div>
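The order of operations in the `get_modifiable_text()` changes above (normalize newlines, decode, drop one leading newline, then handle NULL bytes) can be sketched as a standalone function. This is an illustration under stated assumptions: `normalize_text_sketch()` and its parameters are hypothetical, and PHP's `html_entity_decode()` is swapped in for `WP_HTML_Decoder::decode_text_node()`, so edge-case decoding may differ from WordPress.

```php
<?php
// Hypothetical helper mirroring the order of operations in the diff above; not part of WordPress.
function normalize_text_sketch( string $raw, bool $is_text_node, bool $skip_leading_newline ): string {
	// Normalize newlines: CRLF first, then any remaining lone CR.
	$text = str_replace( "\r\n", "\n", $raw );
	$text = str_replace( "\r", "\n", $text );

	// Decode before the newline check so that a character reference
	// to U+000A also counts as a leading newline.
	// Assumption: html_entity_decode() stands in for WP_HTML_Decoder::decode_text_node().
	$text = html_entity_decode( $text, ENT_QUOTES | ENT_HTML5, 'UTF-8' );

	if ( $skip_leading_newline && "\n" === ( $text[0] ?? '' ) ) {
		$text = substr( $text, 1 );
	}

	// Text nodes drop NULL bytes; other contexts get U+FFFD instead.
	return $is_text_node
		? str_replace( "\x00", '', $text )
		: str_replace( "\x00", "\u{FFFD}", $text );
}

var_dump( normalize_text_sketch( "a\r\nb\rc", true, false ) === "a\nb\nc" );
var_dump( normalize_text_sketch( '&#x0a;Content', false, true ) === 'Content' );
var_dump( normalize_text_sketch( "a\x00b", true, false ) === 'ab' );
var_dump( normalize_text_sketch( "a\x00b", false, false ) === "a\u{FFFD}b" );
```

Replacing CRLF before lone CR matters: doing it in the other order would turn each CRLF pair into two line feeds.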
<a id="trunksrcwpincludeshtmlapiclasswphtmltokenphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/src/wp-includes/html-api/class-wp-html-token.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/src/wp-includes/html-api/class-wp-html-token.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/src/wp-includes/html-api/class-wp-html-token.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -72,12 +72,13 @@
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @since 6.4.0
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * @param string $bookmark_name Name of bookmark corresponding to location in HTML where token is found.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @param string|null $bookmark_name Name of bookmark corresponding to location in HTML where token is found,
+ * or `null` for markers and nodes without a bookmark.
</ins><span class="cx" style="display: block; padding: 0 10px"> * @param string $node_name Name of node token represents; if uppercase, an HTML element; if lowercase, a special value like "marker".
</span><span class="cx" style="display: block; padding: 0 10px"> * @param bool $has_self_closing_flag Whether the source token contains the self-closing flag, regardless of whether it's valid.
</span><span class="cx" style="display: block; padding: 0 10px"> * @param callable|null $on_destroy Optional. Function to call when destroying token, useful for releasing the bookmark.
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- public function __construct( string $bookmark_name, string $node_name, bool $has_self_closing_flag, ?callable $on_destroy = null ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ public function __construct( ?string $bookmark_name, string $node_name, bool $has_self_closing_flag, ?callable $on_destroy = null ) {
</ins><span class="cx" style="display: block; padding: 0 10px"> $this->bookmark_name = $bookmark_name;
</span><span class="cx" style="display: block; padding: 0 10px"> $this->node_name = $node_name;
</span><span class="cx" style="display: block; padding: 0 10px"> $this->has_self_closing_flag = $has_self_closing_flag;
</span></span></pre></div>
<a id="trunktestsphpunittestshtmlapiwpHtmlProcessorphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/tests/phpunit/tests/html-api/wpHtmlProcessor.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/tests/phpunit/tests/html-api/wpHtmlProcessor.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/tests/phpunit/tests/html-api/wpHtmlProcessor.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -134,7 +134,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * @covers WP_HTML_Processor::step_in_body
</span><span class="cx" style="display: block; padding: 0 10px"> * @covers WP_HTML_Processor::is_void
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * @dataProvider data_void_tags
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @dataProvider data_void_tags_not_ignored_in_body
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @param string $tag_name Name of void tag under test.
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -250,7 +250,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 'Text node' => array( 'Trombone' ),
</span><span class="cx" style="display: block; padding: 0 10px"> );
</span><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- foreach ( self::data_void_tags() as $tag_name => $_name ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ foreach ( self::data_void_tags_not_ignored_in_body() as $tag_name => $_name ) {
</ins><span class="cx" style="display: block; padding: 0 10px"> $self_contained_nodes[ "Void elements ({$tag_name})" ] = array( "<{$tag_name}>" );
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -284,7 +284,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @ticket 60382
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * @dataProvider data_void_tags
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @dataProvider data_void_tags_not_ignored_in_body
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @param string $tag_name Name of void tag under test.
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -319,17 +319,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> $processor->get_breadcrumbs(),
</span><span class="cx" style="display: block; padding: 0 10px"> 'Found incorrect nesting of first element.'
</span><span class="cx" style="display: block; padding: 0 10px"> );
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-
- $this->assertTrue(
- $processor->next_token(),
- 'Should have found the DIV as the second tag.'
- );
-
- $this->assertSame(
- array( 'HTML', 'BODY', 'DIV' ),
- $processor->get_breadcrumbs(),
- "DIV should have been a sibling of the {$tag_name}."
- );
</del><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -358,6 +347,18 @@
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * Data provider.
+ *
+ * @return array[]
+ */
+ public static function data_void_tags_not_ignored_in_body() {
+ $all_void_tags = self::data_void_tags();
+ unset( $all_void_tags['COL'] );
+
+ return $all_void_tags;
+ }
+
+ /**
</ins><span class="cx" style="display: block; padding: 0 10px"> * Ensures that special handling of unsupported tags is cleaned up
</span><span class="cx" style="display: block; padding: 0 10px"> * as handling is implemented. Otherwise there's risk of leaving special
</span><span class="cx" style="display: block; padding: 0 10px"> * handling (that is never reached) when tag handling is implemented.
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -383,49 +384,8 @@
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public static function data_unsupported_special_in_body_tags() {
</span><span class="cx" style="display: block; padding: 0 10px"> return array(
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- 'APPLET' => array( 'APPLET' ),
- 'BASE' => array( 'BASE' ),
- 'BASEFONT' => array( 'BASEFONT' ),
- 'BGSOUND' => array( 'BGSOUND' ),
- 'BODY' => array( 'BODY' ),
- 'CAPTION' => array( 'CAPTION' ),
- 'COL' => array( 'COL' ),
- 'COLGROUP' => array( 'COLGROUP' ),
- 'FORM' => array( 'FORM' ),
- 'FRAME' => array( 'FRAME' ),
- 'FRAMESET' => array( 'FRAMESET' ),
- 'HEAD' => array( 'HEAD' ),
- 'HTML' => array( 'HTML' ),
- 'IFRAME' => array( 'IFRAME' ),
- 'LINK' => array( 'LINK' ),
- 'MARQUEE' => array( 'MARQUEE' ),
- 'MATH' => array( 'MATH' ),
- 'META' => array( 'META' ),
- 'NOBR' => array( 'NOBR' ),
- 'NOEMBED' => array( 'NOEMBED' ),
- 'NOFRAMES' => array( 'NOFRAMES' ),
- 'NOSCRIPT' => array( 'NOSCRIPT' ),
- 'OBJECT' => array( 'OBJECT' ),
- 'PLAINTEXT' => array( 'PLAINTEXT' ),
- 'RB' => array( 'RB' ),
- 'RP' => array( 'RP' ),
- 'RT' => array( 'RT' ),
- 'RTC' => array( 'RTC' ),
- 'SARCASM' => array( 'SARCASM' ),
- 'SCRIPT' => array( 'SCRIPT' ),
- 'STYLE' => array( 'STYLE' ),
- 'SVG' => array( 'SVG' ),
- 'TABLE' => array( 'TABLE' ),
- 'TBODY' => array( 'TBODY' ),
- 'TD' => array( 'TD' ),
- 'TEMPLATE' => array( 'TEMPLATE' ),
- 'TEXTAREA' => array( 'TEXTAREA' ),
- 'TFOOT' => array( 'TFOOT' ),
- 'TH' => array( 'TH' ),
- 'THEAD' => array( 'THEAD' ),
- 'TITLE' => array( 'TITLE' ),
- 'TR' => array( 'TR' ),
- 'XMP' => array( 'XMP' ),
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 'MATH' => array( 'MATH' ),
+ 'SVG' => array( 'SVG' ),
</ins><span class="cx" style="display: block; padding: 0 10px"> );
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span></span></pre></div>
<a id="trunktestsphpunittestshtmlapiwpHtmlProcessorBreadcrumbsphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/tests/phpunit/tests/html-api/wpHtmlProcessorBreadcrumbs.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/tests/phpunit/tests/html-api/wpHtmlProcessorBreadcrumbs.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/tests/phpunit/tests/html-api/wpHtmlProcessorBreadcrumbs.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -40,6 +40,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 'ABBR',
</span><span class="cx" style="display: block; padding: 0 10px"> 'ACRONYM', // Neutralized.
</span><span class="cx" style="display: block; padding: 0 10px"> 'ADDRESS',
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 'APPLET', // Deprecated.
</ins><span class="cx" style="display: block; padding: 0 10px"> 'AREA',
</span><span class="cx" style="display: block; padding: 0 10px"> 'ARTICLE',
</span><span class="cx" style="display: block; padding: 0 10px"> 'ASIDE',
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -72,6 +73,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 'FIGCAPTION',
</span><span class="cx" style="display: block; padding: 0 10px"> 'FIGURE',
</span><span class="cx" style="display: block; padding: 0 10px"> 'FONT',
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 'FORM',
</ins><span class="cx" style="display: block; padding: 0 10px"> 'FOOTER',
</span><span class="cx" style="display: block; padding: 0 10px"> 'H1',
</span><span class="cx" style="display: block; padding: 0 10px"> 'H2',
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -95,11 +97,15 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 'MAIN',
</span><span class="cx" style="display: block; padding: 0 10px"> 'MAP',
</span><span class="cx" style="display: block; padding: 0 10px"> 'MARK',
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 'MARQUEE', // Deprecated.
</ins><span class="cx" style="display: block; padding: 0 10px"> 'MENU',
</span><span class="cx" style="display: block; padding: 0 10px"> 'METER',
</span><span class="cx" style="display: block; padding: 0 10px"> 'MULTICOL', // Deprecated.
</span><span class="cx" style="display: block; padding: 0 10px"> 'NAV',
</span><span class="cx" style="display: block; padding: 0 10px"> 'NEXTID', // Deprecated.
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 'NOBR', // Neutralized.
+ 'NOSCRIPT',
+ 'OBJECT',
</ins><span class="cx" style="display: block; padding: 0 10px"> 'OL',
</span><span class="cx" style="display: block; padding: 0 10px"> 'OUTPUT',
</span><span class="cx" style="display: block; padding: 0 10px"> 'P',
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -106,6 +112,10 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 'PICTURE',
</span><span class="cx" style="display: block; padding: 0 10px"> 'PROGRESS',
</span><span class="cx" style="display: block; padding: 0 10px"> 'Q',
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 'RB', // Neutralized.
+ 'RP',
+ 'RT',
+ 'RTC', // Neutralized.
</ins><span class="cx" style="display: block; padding: 0 10px"> 'RUBY',
</span><span class="cx" style="display: block; padding: 0 10px"> 'SAMP',
</span><span class="cx" style="display: block; padding: 0 10px"> 'SEARCH',
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -119,6 +129,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 'SUB',
</span><span class="cx" style="display: block; padding: 0 10px"> 'SUMMARY',
</span><span class="cx" style="display: block; padding: 0 10px"> 'SUP',
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 'TABLE',
</ins><span class="cx" style="display: block; padding: 0 10px"> 'TIME',
</span><span class="cx" style="display: block; padding: 0 10px"> 'TT',
</span><span class="cx" style="display: block; padding: 0 10px"> 'U',
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -167,7 +178,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public static function data_unsupported_elements() {
</span><span class="cx" style="display: block; padding: 0 10px"> $unsupported_elements = array(
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- 'APPLET', // Deprecated.
</del><span class="cx" style="display: block; padding: 0 10px"> 'BASE',
</span><span class="cx" style="display: block; padding: 0 10px"> 'BGSOUND', // Deprecated; self-closing if self-closing flag provided, otherwise normal.
</span><span class="cx" style="display: block; padding: 0 10px"> 'BODY',
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -174,7 +184,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 'CAPTION',
</span><span class="cx" style="display: block; padding: 0 10px"> 'COL',
</span><span class="cx" style="display: block; padding: 0 10px"> 'COLGROUP',
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- 'FORM',
</del><span class="cx" style="display: block; padding: 0 10px"> 'FRAME',
</span><span class="cx" style="display: block; padding: 0 10px"> 'FRAMESET',
</span><span class="cx" style="display: block; padding: 0 10px"> 'HEAD',
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -181,23 +190,14 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 'HTML',
</span><span class="cx" style="display: block; padding: 0 10px"> 'IFRAME',
</span><span class="cx" style="display: block; padding: 0 10px"> 'LINK',
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- 'MARQUEE', // Deprecated.
</del><span class="cx" style="display: block; padding: 0 10px"> 'MATH',
</span><span class="cx" style="display: block; padding: 0 10px"> 'META',
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- 'NOBR', // Neutralized.
</del><span class="cx" style="display: block; padding: 0 10px"> 'NOEMBED', // Neutralized.
</span><span class="cx" style="display: block; padding: 0 10px"> 'NOFRAMES', // Neutralized.
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- 'NOSCRIPT',
- 'OBJECT',
</del><span class="cx" style="display: block; padding: 0 10px"> 'PLAINTEXT', // Neutralized.
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- 'RB', // Neutralized.
- 'RP',
- 'RT',
- 'RTC', // Neutralized.
</del><span class="cx" style="display: block; padding: 0 10px"> 'SCRIPT',
</span><span class="cx" style="display: block; padding: 0 10px"> 'STYLE',
</span><span class="cx" style="display: block; padding: 0 10px"> 'SVG',
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- 'TABLE',
</del><span class="cx" style="display: block; padding: 0 10px"> 'TBODY',
</span><span class="cx" style="display: block; padding: 0 10px"> 'TD',
</span><span class="cx" style="display: block; padding: 0 10px"> 'TEMPLATE',
</span></span></pre></div>
<a id="trunktestsphpunittestshtmlapiwpHtmlProcessorHtml5libphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/tests/phpunit/tests/html-api/wpHtmlProcessorHtml5lib.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/tests/phpunit/tests/html-api/wpHtmlProcessorHtml5lib.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/tests/phpunit/tests/html-api/wpHtmlProcessorHtml5lib.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -31,25 +31,28 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * Skip specific tests that may not be supported or have known issues.
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> const SKIP_TESTS = array(
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- 'adoption01/line0046' => 'Unimplemented: Reconstruction of active formatting elements.',
- 'adoption01/line0159' => 'Unimplemented: Reconstruction of active formatting elements.',
- 'adoption01/line0318' => 'Unimplemented: Reconstruction of active formatting elements.',
- 'inbody01/line0001' => 'Bug.',
- 'inbody01/line0014' => 'Bug.',
- 'inbody01/line0029' => 'Bug.',
- 'menuitem-element/line0012' => 'Bug.',
- 'tests1/line0342' => "Closing P tag implicitly creates opener, which we don't visit.",
- 'tests1/line0720' => 'Unimplemented: Reconstruction of active formatting elements.',
- 'tests15/line0001' => 'Unimplemented: Reconstruction of active formatting elements.',
- 'tests15/line0022' => 'Unimplemented: Reconstruction of active formatting elements.',
- 'tests2/line0650' => 'Whitespace only test never enters "in body" parsing mode.',
- 'tests20/line0497' => "Closing P tag implicitly creates opener, which we don't visit.",
- 'tests23/line0001' => 'Unimplemented: Reconstruction of active formatting elements.',
- 'tests23/line0041' => 'Unimplemented: Reconstruction of active formatting elements.',
- 'tests23/line0069' => 'Unimplemented: Reconstruction of active formatting elements.',
- 'tests23/line0101' => 'Unimplemented: Reconstruction of active formatting elements.',
- 'tests25/line0169' => 'Bug.',
- 'tests26/line0263' => 'Bug: An active formatting element should be created for a trailing text node.',
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ 'adoption01/line0046' => 'Unimplemented: Reconstruction of active formatting elements.',
+ 'adoption01/line0159' => 'Unimplemented: Reconstruction of active formatting elements.',
+ 'adoption01/line0318' => 'Unimplemented: Reconstruction of active formatting elements.',
+ 'tests1/line0720' => 'Unimplemented: Reconstruction of active formatting elements.',
+ 'tests15/line0001' => 'Unimplemented: Reconstruction of active formatting elements.',
+ 'tests15/line0022' => 'Unimplemented: Reconstruction of active formatting elements.',
+ 'tests15/line0068' => 'Unimplemented: no support outside of IN BODY yet.',
+ 'tests2/line0650' => 'Whitespace only test never enters "in body" parsing mode.',
+ 'tests19/line0965' => 'Unimplemented: no support outside of IN BODY yet.',
+ 'tests23/line0001' => 'Unimplemented: Reconstruction of active formatting elements.',
+ 'tests23/line0041' => 'Unimplemented: Reconstruction of active formatting elements.',
+ 'tests23/line0069' => 'Unimplemented: Reconstruction of active formatting elements.',
+ 'tests23/line0101' => 'Unimplemented: Reconstruction of active formatting elements.',
+ 'tests26/line0263' => 'Bug: An active formatting element should be created for a trailing text node.',
+ 'webkit01/line0231' => 'Unimplemented: This parser does not add missing attributes to existing HTML or BODY tags.',
+ 'webkit02/line0013' => "Asserting behavior with scripting flag enabled, which this parser doesn't support.",
+ 'webkit01/line0300' => 'Unimplemented: no support outside of IN BODY yet.',
+ 'webkit01/line0310' => 'Unimplemented: no support outside of IN BODY yet.',
+ 'webkit01/line0336' => 'Unimplemented: no support outside of IN BODY yet.',
+ 'webkit01/line0349' => 'Unimplemented: no support outside of IN BODY yet.',
+ 'webkit01/line0362' => 'Unimplemented: no support outside of IN BODY yet.',
+ 'webkit01/line0375' => 'Unimplemented: no support outside of IN BODY yet.',
</ins><span class="cx" style="display: block; padding: 0 10px"> );
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -198,19 +201,18 @@
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px"> $output .= str_repeat( $indent, $tag_indent + 1 ) . "{$attribute_name}=\"{$val}\"\n";
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ }
</ins><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- // Self-contained tags contain their inner contents as modifiable text.
- $modifiable_text = $processor->get_modifiable_text();
- if ( '' !== $modifiable_text ) {
- $was_text = true;
- if ( '' === $text_node ) {
- $text_node = str_repeat( $indent, $indent_level ) . '"';
- }
- $text_node .= $modifiable_text;
- --$indent_level;
- }
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ // Self-contained tags contain their inner contents as modifiable text.
+ $modifiable_text = $processor->get_modifiable_text();
+ if ( '' !== $modifiable_text ) {
+ $output .= str_repeat( $indent, $indent_level ) . "\"{$modifiable_text}\"\n";
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ if ( ! $processor->is_void( $tag_name ) && ! $processor->expects_closer() ) {
+ --$indent_level;
+ }
+
</ins><span class="cx" style="display: block; padding: 0 10px"> break;
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> case '#text':
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -225,6 +227,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> switch ( $processor->get_comment_type() ) {
</span><span class="cx" style="display: block; padding: 0 10px"> case WP_HTML_Processor::COMMENT_AS_ABRUPTLY_CLOSED_COMMENT:
</span><span class="cx" style="display: block; padding: 0 10px"> case WP_HTML_Processor::COMMENT_AS_HTML_COMMENT:
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ case WP_HTML_Processor::COMMENT_AS_INVALID_HTML:
</ins><span class="cx" style="display: block; padding: 0 10px"> $comment_text_content = $processor->get_modifiable_text();
</span><span class="cx" style="display: block; padding: 0 10px"> break;
</span><span class="cx" style="display: block; padding: 0 10px">
</span></span></pre></div>
<a id="trunktestsphpunittestshtmlapiwpHtmlProcessorSemanticRulesphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/tests/phpunit/tests/html-api/wpHtmlProcessorSemanticRules.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/tests/phpunit/tests/html-api/wpHtmlProcessorSemanticRules.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/tests/phpunit/tests/html-api/wpHtmlProcessorSemanticRules.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -406,27 +406,22 @@
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * Ensures that support isn't accidentally partially added for the closing BR tag `</br>`.
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * Ensures that closing `</br>` tags are appropriately treated as opening tags with no attributes.
</ins><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * This tag closer has special rules and support shouldn't be added without implementing full support.
- *
</del><span class="cx" style="display: block; padding: 0 10px"> * > An end tag whose tag name is "br"
</span><span class="cx" style="display: block; padding: 0 10px"> * > Parse error. Drop the attributes from the token, and act as described in the next entry;
</span><span class="cx" style="display: block; padding: 0 10px"> * > i.e. act as if this was a "br" start tag token with no attributes, rather than the end
</span><span class="cx" style="display: block; padding: 0 10px"> * > tag token that it actually is.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- * When this handling is implemented, this test should be removed. It's not incorporated
- * into the existing unsupported tag behavior test because the opening tag is supported;
- * only the closing tag isn't.
- *
</del><span class="cx" style="display: block; padding: 0 10px"> * @covers WP_HTML_Processor::step_in_body
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @ticket 60283
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function test_br_end_tag_unsupported() {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $processor = WP_HTML_Processor::create_fragment( '</br>' );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ $processor = WP_HTML_Processor::create_fragment( '</br id="an-opener" html>' );
</ins><span class="cx" style="display: block; padding: 0 10px">
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- $this->assertFalse( $processor->next_tag(), 'Found a BR tag that should not be handled.' );
- $this->assertSame( WP_HTML_Processor::ERROR_UNSUPPORTED, $processor->get_last_error() );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ $this->assertTrue( $processor->next_tag(), 'Failed to find the expected opening BR tag.' );
+ $this->assertFalse( $processor->is_tag_closer(), 'Should have treated the tag as an opening tag.' );
+ $this->assertNull( $processor->get_attribute_names_with_prefix( '' ), 'Should have ignored any attributes on the tag.' );
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span></span></pre></div>
<a id="trunktestsphpunittestshtmlapiwpHtmlSupportRequiredHtmlProcessorphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/tests/phpunit/tests/html-api/wpHtmlSupportRequiredHtmlProcessor.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/tests/phpunit/tests/html-api/wpHtmlSupportRequiredHtmlProcessor.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/tests/phpunit/tests/html-api/wpHtmlSupportRequiredHtmlProcessor.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1,91 +0,0 @@
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-<?php
-/**
- * Unit tests for the HTML API indicating that changes are needed to the
- * WP_HTML_Processor class before specific features are added to the API.
- *
- * Note! Duplication of test cases and the helper function in this file are intentional.
- * This test file exists to warn developers of related areas of code that need to update
- * together when adding support for new elements to the HTML Processor. For example,
- * when adding support for the LI element it's necessary to update the function which
- * generates implied end tags. This is because each element might bring with it semantic
- * rules that impact the way the document should be parsed.
- *
- * Without these tests a developer needs to investigate all possible places they
- * might need to update when adding support for more elements and risks overlooking
- * important parts that, in the absence of the related support, will lead to errors.
- *
- * @package WordPress
- * @subpackage HTML-API
- *
- * @since 6.4.0
- *
- * @group html-api
- *
- * @coversDefaultClass WP_HTML_Processor
- */
-class Tests_HtmlApi_WpHtmlSupportRequiredHtmlProcessor extends WP_UnitTestCase {
- /**
- * Fails to assert if the HTML Processor handles the given tag.
- *
- * This test helper is used throughout this test file for one purpose only: to
- * fail a test if the HTML Processor handles the given tag. In other words, it
- * ensures that the HTML Processor aborts when encountering the given tag.
- *
- * This is used to ensure that when support for a new tag is added to the
- * HTML Processor it receives full support and not partial support, which
- * could lead to a variety of issues.
- *
- * Do not remove this helper function as it provides semantic meaning to the
- * assertions in the tests in this file and its behavior is incredibly specific
- * and limited and doesn't warrant adding a new abstraction into WP_UnitTestCase.
- *
- * @param string $tag_name the HTML Processor should abort when encountering this tag, e.g. "BUTTON".
- */
- private function ensure_support_is_added_everywhere( $tag_name ) {
- $processor = WP_HTML_Processor::create_fragment( "<$tag_name>" );
-
- $this->assertFalse( $processor->step(), "Must support terminating elements in specific scope check before adding support for the {$tag_name} element." );
- }
-
- /**
- * Generating implied end tags walks up the stack of open elements
- * as long as any of the following missing elements is the current node.
- *
- * @since 6.4.0
- *
- * @ticket 58907
- *
- * @covers WP_HTML_Processor::generate_implied_end_tags
- */
- public function test_generate_implied_end_tags_needs_support() {
- $this->ensure_support_is_added_everywhere( 'RB' );
- $this->ensure_support_is_added_everywhere( 'RP' );
- $this->ensure_support_is_added_everywhere( 'RT' );
- $this->ensure_support_is_added_everywhere( 'RTC' );
- }
-
- /**
- * Generating implied end tags thoroughly walks up the stack of open elements
- * as long as any of the following missing elements is the current node.
- *
- * @since 6.4.0
- *
- * @ticket 58907
- *
- * @covers WP_HTML_Processor::generate_implied_end_tags_thoroughly
- */
- public function test_generate_implied_end_tags_thoroughly_needs_support() {
- $this->ensure_support_is_added_everywhere( 'CAPTION' );
- $this->ensure_support_is_added_everywhere( 'COLGROUP' );
- $this->ensure_support_is_added_everywhere( 'RB' );
- $this->ensure_support_is_added_everywhere( 'RP' );
- $this->ensure_support_is_added_everywhere( 'RT' );
- $this->ensure_support_is_added_everywhere( 'RTC' );
- $this->ensure_support_is_added_everywhere( 'TBODY' );
- $this->ensure_support_is_added_everywhere( 'TD' );
- $this->ensure_support_is_added_everywhere( 'TFOOT' );
- $this->ensure_support_is_added_everywhere( 'TH' );
- $this->ensure_support_is_added_everywhere( 'HEAD' );
- $this->ensure_support_is_added_everywhere( 'TR' );
- }
-}
</del></span></pre></div>
<a id="trunktestsphpunittestshtmlapiwpHtmlSupportRequiredOpenElementsphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/tests/phpunit/tests/html-api/wpHtmlSupportRequiredOpenElements.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/tests/phpunit/tests/html-api/wpHtmlSupportRequiredOpenElements.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/tests/phpunit/tests/html-api/wpHtmlSupportRequiredOpenElements.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -61,17 +61,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * @ticket 58517
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function test_has_element_in_scope_needs_support() {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- // These elements impact all scopes.
- $this->ensure_support_is_added_everywhere( 'APPLET' );
- $this->ensure_support_is_added_everywhere( 'CAPTION' );
- $this->ensure_support_is_added_everywhere( 'HTML' );
- $this->ensure_support_is_added_everywhere( 'TABLE' );
- $this->ensure_support_is_added_everywhere( 'TD' );
- $this->ensure_support_is_added_everywhere( 'TH' );
- $this->ensure_support_is_added_everywhere( 'MARQUEE' );
- $this->ensure_support_is_added_everywhere( 'OBJECT' );
- $this->ensure_support_is_added_everywhere( 'TEMPLATE' );
-
</del><span class="cx" style="display: block; padding: 0 10px"> // MathML Elements: MI, MO, MN, MS, MTEXT, ANNOTATION-XML.
</span><span class="cx" style="display: block; padding: 0 10px"> $this->ensure_support_is_added_everywhere( 'MATH' );
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -99,17 +88,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * @covers WP_HTML_Open_Elements::has_element_in_list_item_scope
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function test_has_element_in_list_item_scope_needs_support() {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- // These elements impact all scopes.
- $this->ensure_support_is_added_everywhere( 'APPLET' );
- $this->ensure_support_is_added_everywhere( 'CAPTION' );
- $this->ensure_support_is_added_everywhere( 'HTML' );
- $this->ensure_support_is_added_everywhere( 'TABLE' );
- $this->ensure_support_is_added_everywhere( 'TD' );
- $this->ensure_support_is_added_everywhere( 'TH' );
- $this->ensure_support_is_added_everywhere( 'MARQUEE' );
- $this->ensure_support_is_added_everywhere( 'OBJECT' );
- $this->ensure_support_is_added_everywhere( 'TEMPLATE' );
-
</del><span class="cx" style="display: block; padding: 0 10px"> // MathML Elements: MI, MO, MN, MS, MTEXT, ANNOTATION-XML.
</span><span class="cx" style="display: block; padding: 0 10px"> $this->ensure_support_is_added_everywhere( 'MATH' );
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -133,17 +111,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * @covers WP_HTML_Open_Elements::has_element_in_button_scope
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function test_has_element_in_button_scope_needs_support() {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- // These elements impact all scopes.
- $this->ensure_support_is_added_everywhere( 'APPLET' );
- $this->ensure_support_is_added_everywhere( 'CAPTION' );
- $this->ensure_support_is_added_everywhere( 'HTML' );
- $this->ensure_support_is_added_everywhere( 'TABLE' );
- $this->ensure_support_is_added_everywhere( 'TD' );
- $this->ensure_support_is_added_everywhere( 'TH' );
- $this->ensure_support_is_added_everywhere( 'MARQUEE' );
- $this->ensure_support_is_added_everywhere( 'OBJECT' );
- $this->ensure_support_is_added_everywhere( 'TEMPLATE' );
-
</del><span class="cx" style="display: block; padding: 0 10px"> // MathML Elements: MI, MO, MN, MS, MTEXT, ANNOTATION-XML.
</span><span class="cx" style="display: block; padding: 0 10px"> $this->ensure_support_is_added_everywhere( 'MATH' );
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -168,17 +135,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * @covers WP_HTML_Open_Elements::after_element_pop
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function test_after_element_pop_must_maintain_p_in_button_scope_flag() {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- // These elements impact all scopes.
- $this->ensure_support_is_added_everywhere( 'APPLET' );
- $this->ensure_support_is_added_everywhere( 'CAPTION' );
- $this->ensure_support_is_added_everywhere( 'HTML' );
- $this->ensure_support_is_added_everywhere( 'TABLE' );
- $this->ensure_support_is_added_everywhere( 'TD' );
- $this->ensure_support_is_added_everywhere( 'TH' );
- $this->ensure_support_is_added_everywhere( 'MARQUEE' );
- $this->ensure_support_is_added_everywhere( 'OBJECT' );
- $this->ensure_support_is_added_everywhere( 'TEMPLATE' );
-
</del><span class="cx" style="display: block; padding: 0 10px"> // MathML Elements: MI, MO, MN, MS, MTEXT, ANNOTATION-XML.
</span><span class="cx" style="display: block; padding: 0 10px"> $this->ensure_support_is_added_everywhere( 'MATH' );
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -203,17 +159,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * @covers WP_HTML_Open_Elements::after_element_push
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function test_after_element_push_must_maintain_p_in_button_scope_flag() {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- // These elements impact all scopes.
- $this->ensure_support_is_added_everywhere( 'APPLET' );
- $this->ensure_support_is_added_everywhere( 'CAPTION' );
- $this->ensure_support_is_added_everywhere( 'HTML' );
- $this->ensure_support_is_added_everywhere( 'TABLE' );
- $this->ensure_support_is_added_everywhere( 'TD' );
- $this->ensure_support_is_added_everywhere( 'TH' );
- $this->ensure_support_is_added_everywhere( 'MARQUEE' );
- $this->ensure_support_is_added_everywhere( 'OBJECT' );
- $this->ensure_support_is_added_everywhere( 'TEMPLATE' );
-
</del><span class="cx" style="display: block; padding: 0 10px"> // MathML Elements: MI, MO, MN, MS, MTEXT, ANNOTATION-XML.
</span><span class="cx" style="display: block; padding: 0 10px"> $this->ensure_support_is_added_everywhere( 'MATH' );
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -237,17 +182,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * @covers WP_HTML_Open_Elements::has_element_in_table_scope
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function test_has_element_in_table_scope_needs_support() {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- // These elements impact all scopes.
- $this->ensure_support_is_added_everywhere( 'APPLET' );
- $this->ensure_support_is_added_everywhere( 'CAPTION' );
- $this->ensure_support_is_added_everywhere( 'HTML' );
- $this->ensure_support_is_added_everywhere( 'TABLE' );
- $this->ensure_support_is_added_everywhere( 'TD' );
- $this->ensure_support_is_added_everywhere( 'TH' );
- $this->ensure_support_is_added_everywhere( 'MARQUEE' );
- $this->ensure_support_is_added_everywhere( 'OBJECT' );
- $this->ensure_support_is_added_everywhere( 'TEMPLATE' );
-
</del><span class="cx" style="display: block; padding: 0 10px"> // MathML Elements: MI, MO, MN, MS, MTEXT, ANNOTATION-XML.
</span><span class="cx" style="display: block; padding: 0 10px"> $this->ensure_support_is_added_everywhere( 'MATH' );
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -258,22 +192,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * FOREIGNOBJECT, DESC, TITLE.
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> $this->ensure_support_is_added_everywhere( 'SVG' );
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-
- // These elements are specific to TABLE scope.
- $this->ensure_support_is_added_everywhere( 'HTML' );
- $this->ensure_support_is_added_everywhere( 'TABLE' );
- $this->ensure_support_is_added_everywhere( 'TEMPLATE' );
-
- // These elements depend on table scope.
- $this->ensure_support_is_added_everywhere( 'CAPTION' );
- $this->ensure_support_is_added_everywhere( 'COL' );
- $this->ensure_support_is_added_everywhere( 'COLGROUP' );
- $this->ensure_support_is_added_everywhere( 'TBODY' );
- $this->ensure_support_is_added_everywhere( 'TD' );
- $this->ensure_support_is_added_everywhere( 'TFOOT' );
- $this->ensure_support_is_added_everywhere( 'TH' );
- $this->ensure_support_is_added_everywhere( 'THEAD' );
- $this->ensure_support_is_added_everywhere( 'TR' );
</del><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -287,17 +205,6 @@
</span><span class="cx" style="display: block; padding: 0 10px"> * @covers WP_HTML_Open_Elements::has_element_in_select_scope
</span><span class="cx" style="display: block; padding: 0 10px"> */
</span><span class="cx" style="display: block; padding: 0 10px"> public function test_has_element_in_select_scope_needs_support() {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">- // These elements impact all scopes.
- $this->ensure_support_is_added_everywhere( 'APPLET' );
- $this->ensure_support_is_added_everywhere( 'CAPTION' );
- $this->ensure_support_is_added_everywhere( 'HTML' );
- $this->ensure_support_is_added_everywhere( 'TABLE' );
- $this->ensure_support_is_added_everywhere( 'TD' );
- $this->ensure_support_is_added_everywhere( 'TH' );
- $this->ensure_support_is_added_everywhere( 'MARQUEE' );
- $this->ensure_support_is_added_everywhere( 'OBJECT' );
- $this->ensure_support_is_added_everywhere( 'TEMPLATE' );
-
</del><span class="cx" style="display: block; padding: 0 10px"> // MathML Elements: MI, MO, MN, MS, MTEXT, ANNOTATION-XML.
</span><span class="cx" style="display: block; padding: 0 10px"> $this->ensure_support_is_added_everywhere( 'MATH' );
</span><span class="cx" style="display: block; padding: 0 10px">
</span></span></pre></div>
<a id="trunktestsphpunittestshtmlapiwpHtmlTagProcessortokenscanningphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessor-token-scanning.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessor-token-scanning.php 2024-07-22 17:50:53 UTC (rev 58778)
+++ trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessor-token-scanning.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -57,6 +57,83 @@
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span><span class="cx" style="display: block; padding: 0 10px">
</span><span class="cx" style="display: block; padding: 0 10px"> /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * Ensures that `get_modifiable_text()` properly transforms text content.
+ *
+ * The newline and NULL byte (U+0000) behaviors can be complicated since they depend
+ * on where the bytes were found and whether they were raw bytes in the input stream
+ * or decoded from character references.
+ *
+ * @ticket 61576
+ *
+ * @dataProvider data_modifiable_text_needing_transformation
+ *
+ * @param string $html_with_target_node HTML with node containing `target` or `target-next` attribute.
+ * @param string $expected_modifiable_text Expected modifiable text from target node or following node.
+ */
+ public function test_modifiable_text_proper_transforms( string $html_with_target_node, string $expected_modifiable_text ) {
+ $processor = new WP_HTML_Tag_Processor( $html_with_target_node );
+
+ // Find the expected target node.
+ while ( $processor->next_token() ) {
+ $target = $processor->get_attribute( 'target' );
+ if ( true === $target ) {
+ break;
+ }
+
+ if ( is_numeric( $target ) ) {
+ for ( $i = (int) $target; $i > 0; $i-- ) {
+ $processor->next_token();
+ }
+ break;
+ }
+ }
+
+ $this->assertSame(
+ $expected_modifiable_text,
+ $processor->get_modifiable_text(),
+ "Should have properly decoded and transformed modifiable text, but didn't."
+ );
+ }
+
+ /**
+ * Data provider.
+ *
+ * @return array[].
+ */
+ public static function data_modifiable_text_needing_transformation() {
+ return array(
+ 'Text node + NULL byte' => array( "<span target=1>NULL byte in \x00 text nodes disappears.", 'NULL byte in  text nodes disappears.' ),
+ 'LISTING + newline' => array( "<listing target=1>\nNo newline</listing>", 'No newline' ),
+ 'LISTING + CR + LF' => array( "<listing target=1>\r\nNo newline</listing>", 'No newline' ),
+ 'LISTING + Encoded LF' => array( '<listing target=1>&#x0a;No newline</listing>', 'No newline' ),
+ 'LISTING + Encoded CR' => array( '<listing target=1>&#x0d;Newline</listing>', "\rNewline" ),
+ 'LISTING + Encoded CR + LF' => array( '<listing target=1>&#x0d;&#x0a;Newline</listing>', "\r\nNewline" ),
+ 'PRE + newline' => array( "<pre target=1>\nNo newline</pre>", 'No newline' ),
+ 'PRE + CR + LF' => array( "<pre target=1>\r\nNo newline</pre>", 'No newline' ),
+ 'PRE + Encoded LF' => array( '<pre target=1>&#x0a;No newline</pre>', 'No newline' ),
+ 'PRE + Encoded CR' => array( '<pre target=1>&#x0d;Newline</pre>', "\rNewline" ),
+ 'PRE + Encoded CR + LF' => array( '<pre target=1>&#x0d;&#x0a;Newline</pre>', "\r\nNewline" ),
+ 'TEXTAREA + newline' => array( "<textarea target>\nNo newline</textarea>", 'No newline' ),
+ 'TEXTAREA + CR + LF' => array( "<textarea target>\r\nNo newline</textarea>", 'No newline' ),
+ 'TEXTAREA + Encoded LF' => array( '<textarea target>&#x0a;No newline</textarea>', 'No newline' ),
+ 'TEXTAREA + Encoded CR' => array( '<textarea target>&#x0d;Newline</textarea>', "\rNewline" ),
+ 'TEXTAREA + Encoded CR + LF' => array( '<textarea target>&#x0d;&#x0a;Newline</textarea>', "\r\nNewline" ),
+ 'TEXTAREA + Comment-like' => array( "<textarea target><!-- comment -->\nNo newline</textarea>", "<!-- comment -->\nNo newline" ),
+ 'PRE + Comment' => array( "<pre target=2><!-- comment -->\nNo newline</pre>", "\nNo newline" ),
+ 'PRE + CDATA-like' => array( "<pre target=2><![CDATA[test]]>\nNo newline</pre>", "\nNo newline" ),
+ 'LISTING + NULL byte' => array( "<listing target=1>\x00 is missing</listing>", ' is missing' ),
+ 'PRE + NULL byte' => array( "<pre target=1>\x00 is missing</pre>", ' is missing' ),
+ 'TEXTAREA + NULL byte' => array( "<textarea target>\x00 is U+FFFD</textarea>", "\u{FFFD} is U+FFFD" ),
+ 'SCRIPT + NULL byte' => array( "<script target>\x00 is U+FFFD</script>", "\u{FFFD} is U+FFFD" ),
+ 'esc(SCRIPT) + NULL byte' => array( "<script target><!-- <script> \x00 </script> --> is U+FFFD</script>", "<!-- <script> \u{FFFD} </script> --> is U+FFFD" ),
+ 'STYLE + NULL byte' => array( "<style target>\x00 is U+FFFD</style>", "\u{FFFD} is U+FFFD" ),
+ 'XMP + NULL byte' => array( "<xmp target>\x00 is U+FFFD</xmp>", "\u{FFFD} is U+FFFD" ),
+ 'CDATA-like + NULL byte' => array( "<span target=1><![CDATA[just a \x00comment]]>", "just a \u{FFFD}comment" ),
+ 'Funky comment + NULL byte' => array( "<span target=1></%just a \x00comment>", "%just a \u{FFFD}comment" ),
+ );
+ }
+
+ /**
</ins><span class="cx" style="display: block; padding: 0 10px"> * Ensures that normative Elements are properly parsed.
</span><span class="cx" style="display: block; padding: 0 10px"> *
</span><span class="cx" style="display: block; padding: 0 10px"> * @ticket 60170
</span></span></pre></div>
<a id="trunktestsphpunittestshtmlapiwpHtmlTagProcessorModifiableTextphp"></a>
<div class="addfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Added: trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessorModifiableText.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessorModifiableText.php (rev 0)
+++ trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessorModifiableText.php 2024-07-22 22:22:03 UTC (rev 58779)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -0,0 +1,111 @@
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+<?php
+/**
+ * Unit tests covering WP_HTML_Tag_Processor modifiable text functionality.
+ *
+ * @package WordPress
+ * @subpackage HTML-API
+ * @group html-api
+ *
+ * @coversDefaultClass WP_HTML_Tag_Processor
+ */
+class Tests_HtmlApi_WpHtmlTagProcessorModifiableText extends WP_UnitTestCase {
+ /**
+ * Ensures that calls to `get_modifiable_text()` don't change the
+ * parser state in a way that would corrupt repeated calls.
+ *
+ * @ticket 61576
+ */
+ public function test_get_modifiable_text_is_idempotent() {
+ $processor = new WP_HTML_Tag_Processor( "<pre>\nFirst newline ignored.</pre>" );
+
+ // Find the text node in the middle.
+ while ( '#text' !== $processor->get_token_name() && $processor->next_token() ) {
+ continue;
+ }
+
+ $this->assertSame(
+ '#text',
+ $processor->get_token_name(),
+ 'Failed to find text node under test: check test setup.'
+ );
+
+ // The count of 5 isn't important; but calling this multiple times is.
+ for ( $i = 0; $i < 5; $i++ ) {
+ $this->assertSame(
+ 'First newline ignored.',
+ $processor->get_modifiable_text(),
+ 'Should have returned the same modifiable text regardless of how many times it was called.'
+ );
+ }
+ }
+
+ /**
+ * Ensures that when ignoring a newline after LISTING and PRE tags, that this
+ * happens appropriately after seeking.
+ */
+ public function test_get_modifiable_text_ignores_newlines_after_seeking() {
+ $processor = new WP_HTML_Tag_Processor(
+ <<<HTML
+<span>\nhere</span>
+<listing>\ngone</listing>
+<pre>reset last known ignore-point</pre>
+<div>\nhere</div>
+HTML
+ );
+
+ $processor->next_tag( 'SPAN' );
+ $processor->next_token();
+ $processor->set_bookmark( 'span' );
+
+ $this->assertSame(
+ "\nhere",
+ $processor->get_modifiable_text(),
+ 'Should not have removed the leading newline from the first SPAN.'
+ );
+
+ $processor->next_tag( 'LISTING' );
+ $processor->next_token();
+ $processor->set_bookmark( 'listing' );
+
+ $this->assertSame(
+ 'gone',
+ $processor->get_modifiable_text(),
+ 'Should have stripped the leading newline from the LISTING element on first traversal.'
+ );
+
+ $processor->next_tag( 'DIV' );
+ $processor->next_token();
+ $processor->set_bookmark( 'div' );
+
+ $this->assertSame(
+ "\nhere",
+ $processor->get_modifiable_text(),
+ 'Should not have removed the leading newline from the last DIV.'
+ );
+
+ $processor->seek( 'span' );
+ $this->assertSame(
+ "\nhere",
+ $processor->get_modifiable_text(),
+ 'Should not have removed the leading newline from the first SPAN on its second traversal.'
+ );
+
+ $processor->seek( 'listing' );
+ if ( "\ngone" === $processor->get_modifiable_text() ) {
+ $this->markTestSkipped( "There's no support currently for handling the leading newline after seeking." );
+ }
+
+ $this->assertSame(
+ 'gone',
+ $processor->get_modifiable_text(),
+ 'Should have remembered to remove the leading newline from LISTING element after seeking around it.'
+ );
+
+ $processor->seek( 'div' );
+ $this->assertSame(
+ "\nhere",
+ $processor->get_modifiable_text(),
+ 'Should not have removed the leading newline from the last DIV on its second traversal.'
+ );
+ }
+}
</ins><span class="cx" style="display: block; padding: 0 10px">Property changes on: trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessorModifiableText.php
</span><span class="cx" style="display: block; padding: 0 10px">___________________________________________________________________
</span></span></pre></div>
<a id="svneolstyle"></a>
<div class="addfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Added: svn:eol-style</h4></div>
<ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+native
</ins><span class="cx" style="display: block; padding: 0 10px">\ No newline at end of property
</span></div>
</body>
</html>