<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[57506] trunk: HTML API: Fix CDATA lookalike matching invalid CDATA</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { white-space: pre-line; overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta" style="font-size: 105%">
<dt style="float: left; width: 6em; font-weight: bold">Revision</dt> <dd><a style="font-weight: bold" href="https://core.trac.wordpress.org/changeset/57506">57506</a></dd>
<dt style="float: left; width: 6em; font-weight: bold">Author</dt> <dd>dmsnell</dd>
<dt style="float: left; width: 6em; font-weight: bold">Date</dt> <dd>2024-02-01 00:10:19 +0000 (Thu, 01 Feb 2024)</dd>
</dl>

<pre style='padding-left: 1em; margin: 2em 0; border-left: 2px solid #ccc; line-height: 1.25; font-size: 105%; font-family: sans-serif'>HTML API: Fix CDATA lookalike matching invalid CDATA

When `next_token()` was introduced to the HTML Tag Processor, it started
classifying comments that look like they were intended to be CDATA sections.
During development, however, a typo slipped through code review: the
processor reported a comment as a CDATA lookalike even when it ended in
only `]>` rather than the required `]]>`.

The consequences of this defect were minor because, either way, the
span is treated as an HTML comment produced by invalid syntax. This patch
adds the missing check so that CDATA-lookalikes are reported correctly.

Follow-up to <a href="https://core.trac.wordpress.org/changeset/57348">[57348]</a>

Props jonsurrell
Fixes <a href="https://core.trac.wordpress.org/ticket/60406">#60406</a></pre>
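<p style="font-family: sans-serif; font-size: 105%">The closer check described above can be illustrated with a minimal standalone sketch. The helper name <code>looks_like_cdata()</code> is hypothetical and not part of WordPress; it assumes the token begins at <code>&lt;</code> and ends at the first <code>&gt;</code>, mirroring <code>$this-&gt;token_starts_at</code> and <code>$closer_at</code> in the patch.</p>
<pre style="white-space: pre; padding: 1em; margin: 1em 0; font-family: monospace">&lt;?php
/**
 * Hypothetical helper (not WordPress API): reports whether the markup
 * starting at $token_starts_at is shaped like `&lt;![CDATA[ ... ]]&gt;`.
 */
function looks_like_cdata( string $html, int $token_starts_at = 0 ): bool {
    // An invalid `&lt;!` construct ends at the first `&gt;`.
    $closer_at = strpos( $html, '&gt;', $token_starts_at );
    if ( false === $closer_at ) {
        return false;
    }

    // `&lt;![CDATA[` (9 bytes) plus `]]` must fit before the closer.
    if ( $closer_at - $token_starts_at &lt; 11 ) {
        return false;
    }

    return (
        '&lt;![CDATA[' === substr( $html, $token_starts_at, 9 ) &amp;&amp;
        // Both brackets are required; checking only one was the defect.
        ']' === $html[ $closer_at - 1 ] &amp;&amp;
        ']' === $html[ $closer_at - 2 ]
    );
}

var_dump( looks_like_cdata( '&lt;![CDATA[text]]&gt;' ) ); // bool(true)
var_dump( looks_like_cdata( '&lt;![CDATA[text]&gt;' ) );  // bool(false)
</pre>
<p style="font-family: sans-serif; font-size: 105%">With only the first bracket check in place, the second call would also have returned true, which corresponds to the misclassification this changeset corrects.</p>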

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunksrcwpincludeshtmlapiclasswphtmltagprocessorphp">trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php</a></li>
<li><a href="#trunktestsphpunittestshtmlapiwpHtmlTagProcessortokenscanningphp">trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessor-token-scanning.php</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunksrcwpincludeshtmlapiclasswphtmltagprocessorphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php    2024-01-31 21:49:08 UTC (rev 57505)
+++ trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php      2024-02-01 00:10:19 UTC (rev 57506)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1762,7 +1762,8 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                        'T' === $html[ $this->token_starts_at + 6 ] &&
</span><span class="cx" style="display: block; padding: 0 10px">                                        'A' === $html[ $this->token_starts_at + 7 ] &&
</span><span class="cx" style="display: block; padding: 0 10px">                                        '[' === $html[ $this->token_starts_at + 8 ] &&
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                        ']' === $html[ $closer_at - 1 ]
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                                        ']' === $html[ $closer_at - 1 ] &&
+                                        ']' === $html[ $closer_at - 2 ]
</ins><span class="cx" style="display: block; padding: 0 10px">                                 ) {
</span><span class="cx" style="display: block; padding: 0 10px">                                        $this->parser_state    = self::STATE_COMMENT;
</span><span class="cx" style="display: block; padding: 0 10px">                                        $this->comment_type    = self::COMMENT_AS_CDATA_LOOKALIKE;
</span></span></pre></div>
<a id="trunktestsphpunittestshtmlapiwpHtmlTagProcessortokenscanningphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessor-token-scanning.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessor-token-scanning.php  2024-01-31 21:49:08 UTC (rev 57505)
+++ trunk/tests/phpunit/tests/html-api/wpHtmlTagProcessor-token-scanning.php    2024-02-01 00:10:19 UTC (rev 57506)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -348,6 +348,38 @@
</span><span class="cx" style="display: block; padding: 0 10px">        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">        /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         * Ensures that normative CDATA sections are properly parsed.
+        *
+        * @ticket 60406
+        *
+        * @since 6.5.0
+        *
+        * @covers WP_HTML_Tag_Processor::next_token
+        */
+       public function test_cdata_comment_with_incorrect_closer() {
+               $processor = new WP_HTML_Tag_Processor( '&lt;![CDATA[this is missing a closing square bracket]&gt;' );
+               $processor->next_token();
+
+               $this->assertSame(
+                       '#comment',
+                       $processor->get_token_name(),
+                       "Should have found comment token but found {$processor->get_token_name()} instead."
+               );
+
+               $this->assertSame(
+                       WP_HTML_Processor::COMMENT_AS_INVALID_HTML,
+                       $processor->get_comment_type(),
+                       'Should have detected invalid HTML comment.'
+               );
+
+               $this->assertSame(
+                       '[CDATA[this is missing a closing square bracket]',
+                       $processor->get_modifiable_text(),
+                       'Found incorrect modifiable text.'
+               );
+       }
+
+       /**
</ins><span class="cx" style="display: block; padding: 0 10px">          * Ensures that abruptly-closed CDATA sections are properly parsed as comments.
</span><span class="cx" style="display: block; padding: 0 10px">         *
</span><span class="cx" style="display: block; padding: 0 10px">         * @ticket 60170
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -366,6 +398,12 @@
</span><span class="cx" style="display: block; padding: 0 10px">                        "Should have found a bogus comment but found {$processor->get_token_name()} instead."
</span><span class="cx" style="display: block; padding: 0 10px">                );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                $this->assertSame(
+                       WP_HTML_Processor::COMMENT_AS_INVALID_HTML,
+                       $processor->get_comment_type(),
+                       'Should have detected invalid HTML comment.'
+               );
+
</ins><span class="cx" style="display: block; padding: 0 10px">                 $this->assertNull(
</span><span class="cx" style="display: block; padding: 0 10px">                        $processor->get_tag(),
</span><span class="cx" style="display: block; padding: 0 10px">                        'Should not have been able to query tag name on non-element token.'
</span></span></pre>
</div>
</div>

</body>
</html>