<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[58613] trunk/src/wp-includes/html-api: HTML API: Optimize low-level parsing details in Tag Processor.</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { white-space: pre-line; overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta" style="font-size: 105%">
<dt style="float: left; width: 6em; font-weight: bold">Revision</dt> <dd><a style="font-weight: bold" href="https://core.trac.wordpress.org/changeset/58613">58613</a></dd>
<dt style="float: left; width: 6em; font-weight: bold">Author</dt> <dd>dmsnell</dd>
<dt style="float: left; width: 6em; font-weight: bold">Date</dt> <dd>2024-07-01 23:34:19 +0000 (Mon, 01 Jul 2024)</dd>
</dl>

<pre style='padding-left: 1em; margin: 2em 0; border-left: 2px solid #ccc; line-height: 1.25; font-size: 105%; font-family: sans-serif'>HTML API: Optimize low-level parsing details in Tag Processor.

Introduces a number of micro-level optimizations in the Tag Processor to
improve token-scanning performance. Should contain no functional changes.

Based on benchmarking against a list of the 100 most-visited websites,
these changes improve the Tag Processor's tag-scanning performance by
an average of between 3.5% and 7.5%.

Developed in https://github.com/WordPress/wordpress-develop/pull/6890
Discussed in https://core.trac.wordpress.org/ticket/61545

Follow-up to <a href="https://core.trac.wordpress.org/changeset/55203">[55203]</a>.

See <a href="https://core.trac.wordpress.org/ticket/61545">#61545</a>.</pre>

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunksrcwpincludeshtmlapiclasswphtmldecoderphp">trunk/src/wp-includes/html-api/class-wp-html-decoder.php</a></li>
<li><a href="#trunksrcwpincludeshtmlapiclasswphtmltagprocessorphp">trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunksrcwpincludeshtmlapiclasswphtmldecoderphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/src/wp-includes/html-api/class-wp-html-decoder.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/src/wp-includes/html-api/class-wp-html-decoder.php  2024-07-01 20:43:55 UTC (rev 58612)
+++ trunk/src/wp-includes/html-api/class-wp-html-decoder.php    2024-07-01 23:34:19 UTC (rev 58613)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -141,7 +141,7 @@
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                while ( $at < $end ) {
</span><span class="cx" style="display: block; padding: 0 10px">                        $next_character_reference_at = strpos( $text, '&', $at );
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        if ( false === $next_character_reference_at || $next_character_reference_at >= $end ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 if ( false === $next_character_reference_at ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                                 break;
</span><span class="cx" style="display: block; padding: 0 10px">                        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -436,26 +436,26 @@
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                if ( $code_point <= 0x7FF ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        $byte1 = ( $code_point >> 6 ) | 0xC0;
-                       $byte2 = $code_point & 0x3F | 0x80;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 $byte1 = chr( ( $code_point >> 6 ) | 0xC0 );
+                       $byte2 = chr( $code_point & 0x3F | 0x80 );
</ins><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        return pack( 'CC', $byte1, $byte2 );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 return "{$byte1}{$byte2}";
</ins><span class="cx" style="display: block; padding: 0 10px">                 }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                if ( $code_point <= 0xFFFF ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        $byte1 = ( $code_point >> 12 ) | 0xE0;
-                       $byte2 = ( $code_point >> 6 ) & 0x3F | 0x80;
-                       $byte3 = $code_point & 0x3F | 0x80;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 $byte1 = chr( ( $code_point >> 12 ) | 0xE0 );
+                       $byte2 = chr( ( $code_point >> 6 ) & 0x3F | 0x80 );
+                       $byte3 = chr( $code_point & 0x3F | 0x80 );
</ins><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        return pack( 'CCC', $byte1, $byte2, $byte3 );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 return "{$byte1}{$byte2}{$byte3}";
</ins><span class="cx" style="display: block; padding: 0 10px">                 }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                // Any values above U+10FFFF are eliminated above in the pre-check.
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                $byte1 = ( $code_point >> 18 ) | 0xF0;
-               $byte2 = ( $code_point >> 12 ) & 0x3F | 0x80;
-               $byte3 = ( $code_point >> 6 ) & 0x3F | 0x80;
-               $byte4 = $code_point & 0x3F | 0x80;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         $byte1 = chr( ( $code_point >> 18 ) | 0xF0 );
+               $byte2 = chr( ( $code_point >> 12 ) & 0x3F | 0x80 );
+               $byte3 = chr( ( $code_point >> 6 ) & 0x3F | 0x80 );
+               $byte4 = chr( $code_point & 0x3F | 0x80 );
</ins><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                return pack( 'CCCC', $byte1, $byte2, $byte3, $byte4 );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         return "{$byte1}{$byte2}{$byte3}{$byte4}";
</ins><span class="cx" style="display: block; padding: 0 10px">         }
</span><span class="cx" style="display: block; padding: 0 10px"> }
</span></span></pre></div>
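<p>The decoder change above swaps <code>pack()</code> for <code>chr()</code> plus string interpolation while keeping the same bit arithmetic. The following is a minimal standalone sketch of that arithmetic, not the code in <code>class-wp-html-decoder.php</code>: the function name <code>sketch_code_point_to_utf8</code> is hypothetical, and it assumes the caller has already rejected surrogates and values above U+10FFFF, as the pre-check mentioned in the diff suggests.</p>

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): encodes one code point as UTF-8.
 * Assumption: $code_point is a valid scalar value (0..0x10FFFF, no surrogates).
 */
function sketch_code_point_to_utf8( int $code_point ): string {
	// One byte: plain ASCII.
	if ( $code_point <= 0x7F ) {
		return chr( $code_point );
	}

	// Two bytes: 110xxxxx 10xxxxxx.
	if ( $code_point <= 0x7FF ) {
		$byte1 = chr( ( $code_point >> 6 ) | 0xC0 );
		$byte2 = chr( ( $code_point & 0x3F ) | 0x80 );

		return "{$byte1}{$byte2}";
	}

	// Three bytes: 1110xxxx 10xxxxxx 10xxxxxx.
	if ( $code_point <= 0xFFFF ) {
		$byte1 = chr( ( $code_point >> 12 ) | 0xE0 );
		$byte2 = chr( ( ( $code_point >> 6 ) & 0x3F ) | 0x80 );
		$byte3 = chr( ( $code_point & 0x3F ) | 0x80 );

		return "{$byte1}{$byte2}{$byte3}";
	}

	// Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
	$byte1 = chr( ( $code_point >> 18 ) | 0xF0 );
	$byte2 = chr( ( ( $code_point >> 12 ) & 0x3F ) | 0x80 );
	$byte3 = chr( ( ( $code_point >> 6 ) & 0x3F ) | 0x80 );
	$byte4 = chr( ( $code_point & 0x3F ) | 0x80 );

	return "{$byte1}{$byte2}{$byte3}{$byte4}";
}

// U+20AC (euro sign) encodes to the three bytes E2 82 AC.
echo bin2hex( sketch_code_point_to_utf8( 0x20AC ) ), "\n"; // e282ac
```

<p>Both forms produce identical bytes; for instance <code>pack( 'CC', 0xC3, 0xA9 )</code> and the two-byte branch above both yield the encoding of U+00E9. Any speed difference between them is a claim of the commit's benchmarks, not something this sketch measures.</p>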
<a id="trunksrcwpincludeshtmlapiclasswphtmltagprocessorphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php    2024-07-01 20:43:55 UTC (rev 58612)
+++ trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php      2024-07-01 23:34:19 UTC (rev 58613)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1524,21 +1524,10 @@
</span><span class="cx" style="display: block; padding: 0 10px">                $was_at     = $this->bytes_already_parsed;
</span><span class="cx" style="display: block; padding: 0 10px">                $at         = $was_at;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                while ( false !== $at && $at < $doc_length ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         while ( $at < $doc_length ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                         $at = strpos( $html, '<', $at );
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-
-                       /*
-                        * This does not imply an incomplete parse; it indicates that there
-                        * can be nothing left in the document other than a #text node.
-                        */
</del><span class="cx" style="display: block; padding: 0 10px">                         if ( false === $at ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                $this->parser_state         = self::STATE_TEXT_NODE;
-                               $this->token_starts_at      = $was_at;
-                               $this->token_length         = strlen( $html ) - $was_at;
-                               $this->text_starts_at       = $was_at;
-                               $this->text_length          = $this->token_length;
-                               $this->bytes_already_parsed = strlen( $html );
-                               return true;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         break;
</ins><span class="cx" style="display: block; padding: 0 10px">                         }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                        if ( $at > $was_at ) {
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1554,19 +1543,9 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                 *
</span><span class="cx" style="display: block; padding: 0 10px">                                 * @see https://html.spec.whatwg.org/#tag-open-state
</span><span class="cx" style="display: block; padding: 0 10px">                                 */
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                if ( strlen( $html ) > $at + 1 ) {
-                                       $next_character  = $html[ $at + 1 ];
-                                       $at_another_node = (
-                                               '!' === $next_character ||
-                                               '/' === $next_character ||
-                                               '?' === $next_character ||
-                                               ( 'A' <= $next_character && $next_character <= 'Z' ) ||
-                                               ( 'a' <= $next_character && $next_character <= 'z' )
-                                       );
-                                       if ( ! $at_another_node ) {
-                                               ++$at;
-                                               continue;
-                                       }
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         if ( 1 !== strspn( $html, '!/?abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', $at + 1, 1 ) ) {
+                                       ++$at;
+                                       continue;
</ins><span class="cx" style="display: block; padding: 0 10px">                                 }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                                $this->parser_state         = self::STATE_TEXT_NODE;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1630,11 +1609,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                 * `<!--` transitions to a comment state – apply further comment rules.
</span><span class="cx" style="display: block; padding: 0 10px">                                 * https://html.spec.whatwg.org/multipage/parsing.html#tag-open-state
</span><span class="cx" style="display: block; padding: 0 10px">                                 */
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                if (
-                                       $doc_length > $at + 3 &&
-                                       '-' === $html[ $at + 2 ] &&
-                                       '-' === $html[ $at + 3 ]
-                               ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         if ( 0 === substr_compare( $html, '--', $at + 2, 2 ) ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                                         $closer_at = $at + 4;
</span><span class="cx" style="display: block; padding: 0 10px">                                        // If it's not possible to close the comment then there is nothing more to scan.
</span><span class="cx" style="display: block; padding: 0 10px">                                        if ( $doc_length <= $closer_at ) {
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1911,7 +1886,17 @@
</span><span class="cx" style="display: block; padding: 0 10px">                        ++$at;
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                return false;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         /*
+                * This does not imply an incomplete parse; it indicates that there
+                * can be nothing left in the document other than a #text node.
+                */
+               $this->parser_state         = self::STATE_TEXT_NODE;
+               $this->token_starts_at      = $was_at;
+               $this->token_length         = $doc_length - $was_at;
+               $this->text_starts_at       = $was_at;
+               $this->text_length          = $this->token_length;
+               $this->bytes_already_parsed = $doc_length;
+               return true;
</ins><span class="cx" style="display: block; padding: 0 10px">         }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">        /**
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1922,9 +1907,11 @@
</span><span class="cx" style="display: block; padding: 0 10px">         * @return bool Whether an attribute was found before the end of the document.
</span><span class="cx" style="display: block; padding: 0 10px">         */
</span><span class="cx" style="display: block; padding: 0 10px">        private function parse_next_attribute() {
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                $doc_length = strlen( $this->html );
+
</ins><span class="cx" style="display: block; padding: 0 10px">                 // Skip whitespace and slashes.
</span><span class="cx" style="display: block; padding: 0 10px">                $this->bytes_already_parsed += strspn( $this->html, " \t\f\r\n/", $this->bytes_already_parsed );
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                if ( $this->bytes_already_parsed >= strlen( $this->html ) ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         if ( $this->bytes_already_parsed >= $doc_length ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                         $this->parser_state = self::STATE_INCOMPLETE_INPUT;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                        return false;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1941,7 +1928,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                        : strcspn( $this->html, "=/> \t\f\r\n", $this->bytes_already_parsed );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                // No attribute, just tag closer.
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                if ( 0 === $name_length || $this->bytes_already_parsed + $name_length >= strlen( $this->html ) ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         if ( 0 === $name_length || $this->bytes_already_parsed + $name_length >= $doc_length ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                         return false;
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1948,7 +1935,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                $attribute_start             = $this->bytes_already_parsed;
</span><span class="cx" style="display: block; padding: 0 10px">                $attribute_name              = substr( $this->html, $attribute_start, $name_length );
</span><span class="cx" style="display: block; padding: 0 10px">                $this->bytes_already_parsed += $name_length;
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                if ( $this->bytes_already_parsed >= strlen( $this->html ) ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         if ( $this->bytes_already_parsed >= $doc_length ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                         $this->parser_state = self::STATE_INCOMPLETE_INPUT;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                        return false;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1955,7 +1942,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                $this->skip_whitespace();
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                if ( $this->bytes_already_parsed >= strlen( $this->html ) ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         if ( $this->bytes_already_parsed >= $doc_length ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                         $this->parser_state = self::STATE_INCOMPLETE_INPUT;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                        return false;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1965,7 +1952,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                if ( $has_value ) {
</span><span class="cx" style="display: block; padding: 0 10px">                        ++$this->bytes_already_parsed;
</span><span class="cx" style="display: block; padding: 0 10px">                        $this->skip_whitespace();
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        if ( $this->bytes_already_parsed >= strlen( $this->html ) ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 if ( $this->bytes_already_parsed >= $doc_length ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                                 $this->parser_state = self::STATE_INCOMPLETE_INPUT;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                                return false;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1976,8 +1963,10 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                case '"':
</span><span class="cx" style="display: block; padding: 0 10px">                                        $quote                      = $this->html[ $this->bytes_already_parsed ];
</span><span class="cx" style="display: block; padding: 0 10px">                                        $value_start                = $this->bytes_already_parsed + 1;
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                        $value_length               = strcspn( $this->html, $quote, $value_start );
-                                       $attribute_end              = $value_start + $value_length + 1;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                                 $end_quote_at               = strpos( $this->html, $quote, $value_start );
+                                       $end_quote_at               = false === $end_quote_at ? $doc_length : $end_quote_at;
+                                       $value_length               = $end_quote_at - $value_start;
+                                       $attribute_end              = $end_quote_at + 1;
</ins><span class="cx" style="display: block; padding: 0 10px">                                         $this->bytes_already_parsed = $attribute_end;
</span><span class="cx" style="display: block; padding: 0 10px">                                        break;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -1993,7 +1982,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                        $attribute_end = $attribute_start + $name_length;
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                if ( $attribute_end >= strlen( $this->html ) ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         if ( $attribute_end >= $doc_length ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                         $this->parser_state = self::STATE_INCOMPLETE_INPUT;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                        return false;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2014,7 +2003,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                $comparable_name = strtolower( $attribute_name );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                // If an attribute is listed many times, only use the first declaration and ignore the rest.
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                if ( ! array_key_exists( $comparable_name, $this->attributes ) ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         if ( ! isset( $this->attributes[ $comparable_name ] ) ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                         $this->attributes[ $comparable_name ] = new WP_HTML_Attribute_Token(
</span><span class="cx" style="display: block; padding: 0 10px">                                $attribute_name,
</span><span class="cx" style="display: block; padding: 0 10px">                                $value_start,
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2038,7 +2027,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                $duplicate_span = new WP_HTML_Span( $attribute_start, $attribute_end - $attribute_start );
</span><span class="cx" style="display: block; padding: 0 10px">                if ( null === $this->duplicate_attributes ) {
</span><span class="cx" style="display: block; padding: 0 10px">                        $this->duplicate_attributes = array( $comparable_name => array( $duplicate_span ) );
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                } elseif ( ! array_key_exists( $comparable_name, $this->duplicate_attributes ) ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         } elseif ( ! isset( $this->duplicate_attributes[ $comparable_name ] ) ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                         $this->duplicate_attributes[ $comparable_name ] = array( $duplicate_span );
</span><span class="cx" style="display: block; padding: 0 10px">                } else {
</span><span class="cx" style="display: block; padding: 0 10px">                        $this->duplicate_attributes[ $comparable_name ][] = $duplicate_span;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -3110,14 +3099,12 @@
</span><span class="cx" style="display: block; padding: 0 10px">                );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                // Removes any duplicated attributes if they were also present.
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                if ( null !== $this->duplicate_attributes && array_key_exists( $name, $this->duplicate_attributes ) ) {
-                       foreach ( $this->duplicate_attributes[ $name ] as $attribute_token ) {
-                               $this->lexical_updates[] = new WP_HTML_Text_Replacement(
-                                       $attribute_token->start,
-                                       $attribute_token->length,
-                                       ''
-                               );
-                       }
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         foreach ( $this->duplicate_attributes[ $name ] ?? array() as $attribute_token ) {
+                       $this->lexical_updates[] = new WP_HTML_Text_Replacement(
+                               $attribute_token->start,
+                               $attribute_token->length,
+                               ''
+                       );
</ins><span class="cx" style="display: block; padding: 0 10px">                 }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                return true;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -3317,35 +3304,8 @@
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                // Does the tag name match the requested tag name in a case-insensitive manner?
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                if ( null !== $this->sought_tag_name ) {
-                       /*
-                        * String (byte) length lookup is fast. If they aren't the
-                        * same length then they can't be the same string values.
-                        */
-                       if ( strlen( $this->sought_tag_name ) !== $this->tag_name_length ) {
-                               return false;
-                       }
-
-                       /*
-                        * Check each character to determine if they are the same.
-                        * Defer calls to `strtoupper()` to avoid them when possible.
-                        * Calling `strcasecmp()` here tested slower than comparing each
-                        * character, so unless benchmarks show otherwise, it should
-                        * not be used.
-                        *
-                        * It's expected that most of the time that this runs, a
-                        * lower-case tag name will be supplied and the input will
-                        * contain lower-case tag names, thus normally bypassing
-                        * the case comparison code.
-                        */
-                       for ( $i = 0; $i < $this->tag_name_length; $i++ ) {
-                               $html_char = $this->html[ $this->tag_name_starts_at + $i ];
-                               $tag_char  = $this->sought_tag_name[ $i ];
-
-                               if ( $html_char !== $tag_char && strtoupper( $html_char ) !== $tag_char ) {
-                                       return false;
-                               }
-                       }
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         if ( isset( $this->sought_tag_name ) && 0 !== substr_compare( $this->html, $this->sought_tag_name, $this->tag_name_starts_at, $this->tag_name_length, true ) ) {
+                       return false;
</ins><span class="cx" style="display: block; padding: 0 10px">                 }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                if ( null !== $this->sought_class_name && ! $this->has_class( $this->sought_class_name ) ) {
</span></span></pre>
</div>
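<p>One of the changes in this file replaces a chain of character comparisons with a single <code>strspn()</code> call limited to one byte, which acts as a character-class test. The sketch below isolates that idea under stated assumptions: the helper name <code>sketch_starts_another_node</code> is hypothetical, and the explicit bounds guard is an addition for standalone safety (in PHP 8, <code>strspn()</code> throws a <code>ValueError</code> when the offset is past the end of the string), not something taken from the diff.</p>

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): given the offset of a `<`,
 * reports whether the following byte could begin a tag, closer, comment,
 * or other non-text token. Otherwise the `<` is ordinary text.
 */
function sketch_starts_another_node( string $html, int $at ): bool {
	// Guard added for this sketch: nothing follows the `<`.
	if ( $at + 1 >= strlen( $html ) ) {
		return false;
	}

	// Every byte that may follow `<` to open a new node: `!`, `/`, `?`, or an ASCII letter.
	$starters = '!/?abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

	// With a length of 1, strspn() returns 1 only if that single byte is in the list.
	return 1 === strspn( $html, $starters, $at + 1, 1 );
}

var_dump( sketch_starts_another_node( 'a <b>', 2 ) );  // bool(true)
var_dump( sketch_starts_another_node( '1 < 2', 2 ) );  // bool(false)
```

<p>Because the list is spelled out by hand, each of the 52 ASCII letters has to appear in it; a byte missing from the list is treated as plain text after <code>&lt;</code>.</p>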
</div>
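<p>The tag-name check in the last hunk relies on <code>substr_compare()</code> with its fifth argument set to <code>true</code> for a case-insensitive comparison at an offset, without copying the substring. A minimal sketch of that pattern follows. The helper name <code>sketch_tag_name_matches</code> and its parameters are hypothetical, and the up-front length comparison is an assumption of this sketch (it mirrors the length check in the removed code), since <code>substr_compare()</code> only compares up to the given number of bytes.</p>

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): case-insensitive comparison
 * of a tag name located inside $html against a sought name.
 */
function sketch_tag_name_matches( string $html, int $name_at, int $name_length, string $sought ): bool {
	// Assumption of this sketch: names of different byte lengths never match.
	if ( strlen( $sought ) !== $name_length ) {
		return false;
	}

	// Passing `true` as the fifth argument requests a case-insensitive comparison.
	return 0 === substr_compare( $html, $sought, $name_at, $name_length, true );
}

var_dump( sketch_tag_name_matches( '<DIV class="a">', 1, 3, 'div' ) ); // bool(true)
var_dump( sketch_tag_name_matches( '<div>', 1, 3, 'divx' ) );          // bool(false)
```

<p>Keeping the length check separate makes the intent explicit: the comparison answers "same name, ignoring case" and not "same prefix".</p>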

</body>
</html>