<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[58192] trunk: HTML API: Add `expects_closer()` method to HTML Processor</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { white-space: pre-line; overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta" style="font-size: 105%">
<dt style="float: left; width: 6em; font-weight: bold">Revision</dt> <dd><a style="font-weight: bold" href="https://core.trac.wordpress.org/changeset/58192">58192</a><script type="application/ld+json">{"@context":"http://schema.org","@type":"EmailMessage","description":"Review this Commit","action":{"@type":"ViewAction","url":"https://core.trac.wordpress.org/changeset/58192","name":"Review Commit"}}</script></dd>
<dt style="float: left; width: 6em; font-weight: bold">Author</dt> <dd>dmsnell</dd>
<dt style="float: left; width: 6em; font-weight: bold">Date</dt> <dd>2024-05-24 01:19:10 +0000 (Fri, 24 May 2024)</dd>
</dl>

<pre style='padding-left: 1em; margin: 2em 0; border-left: 2px solid #ccc; line-height: 1.25; font-size: 105%; font-family: sans-serif'>HTML API: Add `expects_closer()` method to HTML Processor

This patch adds a new method, `WP_HTML_Processor->expects_closer()`, to indicate
whether the currently-matched node expects to find a closing token. For example, a
`DIV` element expects a closing `&lt;/div&gt;` tag, but an `&lt;img&gt;` expects none because
it's a void element. Similarly, `#text` nodes and HTML comments only appear as
unitary nodes on the stack of open elements: once the processor proceeds further
in the document, they are immediately removed without any closing tag.

This new method serves as a helper to indicate whether or not to expect the
closer, as this can be more complicated than it seems, and calling code
shouldn't have to build custom interpretations and implementations. Instead,
the HTML Processor ought to export its internal knowledge to make it easy for
consuming code and projects.

Developed in https://github.com/WordPress/wordpress-develop/pull/6600
Discussed in https://core.trac.wordpress.org/ticket/61257

Fixes <a href="https://core.trac.wordpress.org/ticket/61257">#61257</a>.
Props dmsnell, jonsurrell.</pre>
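
<p>The rule described above can be sketched as a standalone model. This is a minimal illustration, not the committed implementation: the function names sketch_is_void(), sketch_expects_closer(), and sketch_open_depth() are hypothetical, the void-element list is an assumption standing in for WP_HTML_Processor::is_void(), and the token stream is hand-written rather than produced by the HTML Processor.</p>

<pre>
```php
// Hypothetical stand-in for WP_HTML_Processor::is_void(). The list is an
// assumption based on the HTML void elements, not taken from this changeset.
function sketch_is_void( $tag_name ) {
	return in_array(
		$tag_name,
		array( 'AREA', 'BASE', 'BR', 'COL', 'EMBED', 'HR', 'IMG', 'INPUT', 'LINK', 'META', 'SOURCE', 'TRACK', 'WBR' ),
		true
	);
}

// Mirrors the checks shown in the diff below for an already-known token name.
function sketch_expects_closer( $token_name ) {
	if ( ! isset( $token_name ) ) {
		return null;
	}

	return ! (
		// Comments, text nodes, and other atomic tokens.
		'#' === $token_name[0] ||
		// Doctype declarations.
		'html' === $token_name ||
		// Void elements.
		sketch_is_void( $token_name ) ||
		// Special atomic elements.
		in_array( $token_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true )
	);
}

// Hypothetical caller: counts how many elements remain open after a token
// stream. Each entry is array( token name, whether it is a closing tag ).
function sketch_open_depth( $tokens ) {
	$depth = 0;
	foreach ( $tokens as $token ) {
		list( $name, $is_closer ) = $token;
		if ( $is_closer ) {
			--$depth;
		} elseif ( sketch_expects_closer( $name ) ) {
			++$depth;
		}
	}
	return $depth;
}

// Tokens for: DIV opener, text node, IMG, P opener, P closer.
$depth = sketch_open_depth(
	array(
		array( 'DIV', false ),
		array( '#text', false ),
		array( 'IMG', false ),
		array( 'P', false ),
		array( 'P', true ),
	)
);
```
</pre>

<p>The sketch checks for closing tags before asking whether a closer is expected, because the rule keys only on the token name and would report true for a closing DIV as well.</p>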

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunksrcwpincludeshtmlapiclasswphtmlprocessorphp">trunk/src/wp-includes/html-api/class-wp-html-processor.php</a></li>
<li><a href="#trunktestsphpunittestshtmlapiwpHtmlProcessorphp">trunk/tests/phpunit/tests/html-api/wpHtmlProcessor.php</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunksrcwpincludeshtmlapiclasswphtmlprocessorphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/src/wp-includes/html-api/class-wp-html-processor.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/src/wp-includes/html-api/class-wp-html-processor.php        2024-05-23 23:35:52 UTC (rev 58191)
+++ trunk/src/wp-includes/html-api/class-wp-html-processor.php  2024-05-24 01:19:10 UTC (rev 58192)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -509,6 +509,44 @@
</span><span class="cx" style="display: block; padding: 0 10px">        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">        /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         * Indicates if the currently-matched node expects a closing
+        * token, or if it will self-close on the next step.
+        *
+        * Most HTML elements expect a closer, such as a P element or
+        * a DIV element. Others, like an IMG element, are void and don't
+        * have a closing tag. Special elements, such as SCRIPT and STYLE,
+        * are treated just like void tags. Text nodes and self-closing
+        * foreign content will also act just like a void tag, immediately
+        * closing as soon as the processor advances to the next token.
+        *
+        * @since 6.6.0
+        *
+        * @todo When adding support for foreign content, ensure that
+        *       this returns false for self-closing elements in the
+        *       SVG and MathML namespace.
+        *
+        * @return bool|null Whether to expect a closer for the currently-matched node,
+        *                   or `null` if not matched on any token.
+        */
+       public function expects_closer() {
+               $token_name = $this->get_token_name();
+               if ( ! isset( $token_name ) ) {
+                       return null;
+               }
+
+               return ! (
+                       // Comments, text nodes, and other atomic tokens.
+                       '#' === $token_name[0] ||
+                       // Doctype declarations.
+                       'html' === $token_name ||
+                       // Void elements.
+                       self::is_void( $token_name ) ||
+                       // Special atomic elements.
+                       in_array( $token_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true )
+               );
+       }
+
+       /**
</ins><span class="cx" style="display: block; padding: 0 10px">          * Steps through the HTML document and stop at the next tag, if any.
</span><span class="cx" style="display: block; padding: 0 10px">         *
</span><span class="cx" style="display: block; padding: 0 10px">         * @since 6.4.0
</span></span></pre></div>
<a id="trunktestsphpunittestshtmlapiwpHtmlProcessorphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/tests/phpunit/tests/html-api/wpHtmlProcessor.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/tests/phpunit/tests/html-api/wpHtmlProcessor.php    2024-05-23 23:35:52 UTC (rev 58191)
+++ trunk/tests/phpunit/tests/html-api/wpHtmlProcessor.php      2024-05-24 01:19:10 UTC (rev 58192)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -183,6 +183,103 @@
</span><span class="cx" style="display: block; padding: 0 10px">        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">        /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         * Ensure reporting that normal non-void HTML elements expect a closer.
+        *
+        * @ticket 61257
+        */
+       public function test_expects_closer_regular_tags() {
+               $processor = WP_HTML_Processor::create_fragment( '&lt;div&gt;&lt;p&gt;&lt;b&gt;&lt;em&gt;' );
+
+               $tags = 0;
+               while ( $processor->next_tag() ) {
+                       $this->assertTrue(
+                               $processor->expects_closer(),
+                               "Should have expected a closer for '{$processor->get_tag()}', but didn't."
+                       );
+                       ++$tags;
+               }
+
+               $this->assertSame(
+                       4,
+                       $tags,
+                       'Did not find all the expected tags.'
+               );
+       }
+
+       /**
+        * Ensure reporting that self-contained tokens expect no closer.
+        *
+        * @ticket 61257
+        *
+        * @dataProvider data_self_contained_node_tokens
+        *
+        * @param string $self_contained_token String starting with HTML token that doesn't expect a closer,
+        *                                     e.g. an HTML comment, text node, void tag, or special element.
+        */
+       public function test_expects_closer_expects_no_closer_for_self_contained_tokens( $self_contained_token ) {
+               $processor   = WP_HTML_Processor::create_fragment( $self_contained_token );
+               $found_token = $processor->next_token();
+
+               if ( WP_HTML_Processor::ERROR_UNSUPPORTED === $processor->get_last_error() ) {
+                       $this->markTestSkipped( "HTML '{$self_contained_token}' is not supported." );
+               }
+
+               $this->assertTrue(
+                       $found_token,
+                       "Failed to find any tokens in '{$self_contained_token}': check test data provider."
+               );
+
+               $this->assertFalse(
+                       $processor->expects_closer(),
+                       "Incorrectly expected a closer for node of type '{$processor->get_token_type()}'."
+               );
+       }
+
+       /**
+        * Data provider.
+        *
+        * @return array[]
+        */
+       public static function data_self_contained_node_tokens() {
+               $self_contained_nodes = array(
+                       'Normative comment'                => array( '&lt;!-- comment --&gt;' ),
+                       'Comment with invalid closing'     => array( '&lt;!-- comment --!&gt;' ),
+                       'CDATA Section lookalike'          => array( '&lt;![CDATA[ comment ]]&gt;' ),
+                       'Processing Instruction lookalike' => array( '&lt;?ok comment ?&gt;' ),
+                       'Funky comment'                    => array( '&lt;//wp:post-meta key=isbn&gt;' ),
+                       'Text node'                        => array( 'Trombone' ),
+               );
+
+               foreach ( self::data_void_tags() as $tag_name => $_name ) {
+                       $self_contained_nodes[ "Void elements ({$tag_name})" ] = array( "&lt;{$tag_name}&gt;" );
+               }
+
+               foreach ( self::data_special_tags() as $tag_name => $_name ) {
+                       $self_contained_nodes[ "Special atomic elements ({$tag_name})" ] = array( "&lt;{$tag_name}&gt;content&lt;/{$tag_name}&gt;" );
+               }
+
+               return $self_contained_nodes;
+       }
+
+       /**
+        * Data provider.
+        *
+        * @return array[]
+        */
+       public static function data_special_tags() {
+               return array(
+                       'IFRAME'   => array( 'IFRAME' ),
+                       'NOEMBED'  => array( 'NOEMBED' ),
+                       'NOFRAMES' => array( 'NOFRAMES' ),
+                       'SCRIPT'   => array( 'SCRIPT' ),
+                       'STYLE'    => array( 'STYLE' ),
+                       'TEXTAREA' => array( 'TEXTAREA' ),
+                       'TITLE'    => array( 'TITLE' ),
+                       'XMP'      => array( 'XMP' ),
+               );
+       }
+
+       /**
</ins><span class="cx" style="display: block; padding: 0 10px">          * Ensure non-nesting tags do not nest when processing tokens.
</span><span class="cx" style="display: block; padding: 0 10px">         *
</span><span class="cx" style="display: block; padding: 0 10px">         * @ticket 60382
</span></span></pre>
</div>
</div>

</body>
</html>