<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[45929] trunk: Formatting: Improve accuracy of `force_balance_tags()` and add support for custom element tags.</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { white-space: pre-line; overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta" style="font-size: 105%">
<dt style="float: left; width: 6em; font-weight: bold">Revision</dt> <dd><a style="font-weight: bold" href="https://core.trac.wordpress.org/changeset/45929">45929</a><script type="application/ld+json">{"@context":"http://schema.org","@type":"EmailMessage","description":"Review this Commit","action":{"@type":"ViewAction","url":"https://core.trac.wordpress.org/changeset/45929","name":"Review Commit"}}</script></dd>
<dt style="float: left; width: 6em; font-weight: bold">Author</dt> <dd>flixos90</dd>
<dt style="float: left; width: 6em; font-weight: bold">Date</dt> <dd>2019-09-02 10:24:18 +0000 (Mon, 02 Sep 2019)</dd>
</dl>

<pre style='padding-left: 1em; margin: 2em 0; border-left: 2px solid #ccc; line-height: 1.25; font-size: 105%; font-family: sans-serif'>Formatting: Improve accuracy of `force_balance_tags()` and add support for custom element tags.

This changeset includes a major iteration on the regular expression used to balance tags, with comprehensive test coverage to ensure that all scenarios are supported or unsupported as expected.

Props dmsnell, westonruter, birgire.
Fixes <a href="https://core.trac.wordpress.org/ticket/47014">#47014</a>.</pre>
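<p>To make the new expression easier to review, here is a minimal sketch that joins the pieces of `$tag_pattern` from the diff below into one string and runs it against a few tag names taken from the new tests. The helper `sketch_matched_tag_name()` is hypothetical and exists only for this illustration; it is not part of WordPress and does no balancing.</p>

```php
<?php
// Illustrative sketch only: the pattern from the diff, joined into one string.
$tag_pattern = '#<(/?)((?:[a-z](?:[a-z0-9._]*)-(?:[a-z0-9._-]+)+)|(?:[\w:]+))(?:\s*(/?)|(\s+)([^>]*))>#';

/**
 * Hypothetical helper (not in WordPress): returns the tag name captured in
 * group 2, or null when the pattern finds no supported tag in $html.
 */
function sketch_matched_tag_name( $pattern, $html ) {
	return preg_match( $pattern, $html, $m ) ? $m[2] : null;
}

// Sample inputs borrowed from the test data in this changeset.
$samples = array(
	'<div>inside',               // Traditional tag name.
	'<my-custom-element>inside', // Supported custom element name.
	'<a-.>inside',               // Dots are allowed after the dash.
	'<MY-CUSTOM-ELEMENT>Test',   // Uppercase custom names are not matched.
	'<my->Test',                 // A trailing dash is not matched.
);

foreach ( $samples as $html ) {
	$name = sketch_matched_tag_name( $tag_pattern, $html );
	echo $html, ' => ', null === $name ? '(no match)' : $name, "\n";
}
```

<p>Inputs that the pattern does not match are left untouched by the balancing loop, which is why the tests expect strings such as `&lt;my->Test` to come back unchanged.</p>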

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunksrcwpincludesformattingphp">trunk/src/wp-includes/formatting.php</a></li>
<li><a href="#trunktestsphpunittestsformattingbalanceTagsphp">trunk/tests/phpunit/tests/formatting/balanceTags.php</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunksrcwpincludesformattingphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/src/wp-includes/formatting.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/src/wp-includes/formatting.php      2019-09-02 02:26:55 UTC (rev 45928)
+++ trunk/src/wp-includes/formatting.php        2019-09-02 10:24:18 UTC (rev 45929)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2429,7 +2429,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">  * @return string Balanced text
</span><span class="cx" style="display: block; padding: 0 10px">  */
</span><span class="cx" style="display: block; padding: 0 10px"> function balanceTags( $text, $force = false ) {  // phpcs:ignore WordPress.NamingConventions.ValidFunctionName.FunctionNameInvalid
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-        if ( $force || get_option( 'use_balanceTags' ) == 1 ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ if ( $force || (int) get_option( 'use_balanceTags' ) === 1 ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                 return force_balance_tags( $text );
</span><span class="cx" style="display: block; padding: 0 10px">        } else {
</span><span class="cx" style="display: block; padding: 0 10px">                return $text;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2440,6 +2440,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">  * Balances tags of string using a modified stack.
</span><span class="cx" style="display: block; padding: 0 10px">  *
</span><span class="cx" style="display: block; padding: 0 10px">  * @since 2.0.4
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ * @since 5.3.0 Improve accuracy and add support for custom element tags.
</ins><span class="cx" style="display: block; padding: 0 10px">  *
</span><span class="cx" style="display: block; padding: 0 10px">  * @author Leonard Lin <leonard@acm.org>
</span><span class="cx" style="display: block; padding: 0 10px">  * @license GPL
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2469,32 +2470,74 @@
</span><span class="cx" style="display: block; padding: 0 10px">        // WP bug fix for LOVE <3 (and other situations with '<' before a number)
</span><span class="cx" style="display: block; padding: 0 10px">        $text = preg_replace( '#<([0-9]{1})#', '&lt;$1', $text );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-        while ( preg_match( '/<(\/?[\w:]*)\s*([^>]*)>/', $text, $regex ) ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ /**
+        * Matches supported tags.
+        *
+        * To get the pattern as a string without the comments paste into a PHP
+        * REPL like `php -a`.
+        *
+        * @see https://html.spec.whatwg.org/#elements-2
+        * @see https://w3c.github.io/webcomponents/spec/custom/#valid-custom-element-name
+        *
+        * @example
+        * ~# php -a
+        * php > $s = [paste copied contents of expression below including parentheses];
+        * php > echo $s;
+        */
+       $tag_pattern = (
+               '#<' . // Start with an opening bracket.
+               '(/?)' . // Group 1 - If it's a closing tag it'll have a leading slash.
+               '(' . // Group 2 - Tag name.
+                       // Custom element tags have more lenient rules than HTML tag names.
+                       '(?:[a-z](?:[a-z0-9._]*)-(?:[a-z0-9._-]+)+)' .
+                               '|' .
+                       // Traditional tag rules approximate HTML tag names.
+                       '(?:[\w:]+)' .
+               ')' .
+               '(?:' .
+                       // We either immediately close the tag with its '>' and have nothing here.
+                       '\s*' .
+                       '(/?)' . // Group 3 - "attributes" for empty tag.
+                               '|' .
+                       // Or we must start with space characters to separate the tag name from the attributes (or whitespace).
+                       '(\s+)' . // Group 4 - Pre-attribute whitespace.
+                       '([^>]*)' . // Group 5 - Attributes.
+               ')' .
+               '>#' // End with a closing bracket.
+       );
+
+       while ( preg_match( $tag_pattern, $text, $regex ) ) {
+               $full_match        = $regex[0];
+               $has_leading_slash = ! empty( $regex[1] );
+               $tag_name          = $regex[2];
+               $tag               = strtolower( $tag_name );
+               $is_single_tag     = in_array( $tag, $single_tags, true );
+               $pre_attribute_ws  = isset( $regex[4] ) ? $regex[4] : '';
+               $attributes        = trim( isset( $regex[5] ) ? $regex[5] : $regex[3] );
+               $has_self_closer   = '/' === substr( $attributes, -1 );
+
</ins><span class="cx" style="display: block; padding: 0 10px">                 $newtext .= $tagqueue;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                $i = strpos( $text, $regex[0] );
-               $l = strlen( $regex[0] );
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         $i = strpos( $text, $full_match );
+               $l = strlen( $full_match );
</ins><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                // clear the shifter
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         // Clear the shifter.
</ins><span class="cx" style="display: block; padding: 0 10px">                 $tagqueue = '';
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                // Pop or Push
-               if ( isset( $regex[1][0] ) && '/' == $regex[1][0] ) { // End Tag
-                       $tag = strtolower( substr( $regex[1], 1 ) );
-                       // if too many closing tags
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         if ( $has_leading_slash ) { // End Tag.
+                       // If too many closing tags.
</ins><span class="cx" style="display: block; padding: 0 10px">                         if ( $stacksize <= 0 ) {
</span><span class="cx" style="display: block; padding: 0 10px">                                $tag = '';
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                // or close to be safe $tag = '/' . $tag;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         // Or close to be safe $tag = '/' . $tag.
</ins><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                // if stacktop value = tag close value then pop
-                       } elseif ( $tagstack[ $stacksize - 1 ] == $tag ) { // found closing tag
-                               $tag = '</' . $tag . '>'; // Close Tag
-                               // Pop
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         // If stacktop value = tag close value, then pop.
+                       } elseif ( $tagstack[ $stacksize - 1 ] === $tag ) { // Found closing tag.
+                               $tag = '</' . $tag . '>'; // Close Tag.
</ins><span class="cx" style="display: block; padding: 0 10px">                                 array_pop( $tagstack );
</span><span class="cx" style="display: block; padding: 0 10px">                                $stacksize--;
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        } else { // closing tag not at top, search for it
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 } else { // Closing tag not at top, search for it.
</ins><span class="cx" style="display: block; padding: 0 10px">                                 for ( $j = $stacksize - 1; $j >= 0; $j-- ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                        if ( $tagstack[ $j ] == $tag ) {
-                                               // add tag to tagqueue
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                                 if ( $tagstack[ $j ] === $tag ) {
+                                               // Add tag to tagqueue.
</ins><span class="cx" style="display: block; padding: 0 10px">                                                 for ( $k = $stacksize - 1; $k >= $j; $k-- ) {
</span><span class="cx" style="display: block; padding: 0 10px">                                                        $tagqueue .= '</' . array_pop( $tagstack ) . '>';
</span><span class="cx" style="display: block; padding: 0 10px">                                                        $stacksize--;
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2504,25 +2547,19 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                }
</span><span class="cx" style="display: block; padding: 0 10px">                                $tag = '';
</span><span class="cx" style="display: block; padding: 0 10px">                        }
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                } else { // Begin Tag
-                       $tag = strtolower( $regex[1] );
-
-                       // Tag Cleaning
-
-                       // If it's an empty tag "< >", do nothing
-                       if ( '' == $tag ) {
-                               // do nothing
-                       } elseif ( substr( $regex[2], -1 ) == '/' ) { // ElseIf it presents itself as a self-closing tag...
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         } else { // Begin Tag.
+                       if ( $has_self_closer ) { // If it presents itself as a self-closing tag...
</ins><span class="cx" style="display: block; padding: 0 10px">                                 // ...but it isn't a known single-entity self-closing tag, then don't let it be treated as such and
</span><span class="cx" style="display: block; padding: 0 10px">                                // immediately close it with a closing tag (the tag will encapsulate no text as a result)
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                                if ( ! in_array( $tag, $single_tags ) ) {
-                                       $regex[2] = trim( substr( $regex[2], 0, -1 ) ) . "></$tag";
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                         if ( ! $is_single_tag ) {
+                                       $attributes = trim( substr( $attributes, 0, -1 ) ) . "></$tag";
</ins><span class="cx" style="display: block; padding: 0 10px">                                 }
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        } elseif ( in_array( $tag, $single_tags ) ) { // ElseIf it's a known single-entity tag but it doesn't close itself, do so
-                               $regex[2] .= '/';
-                       } else { // Else it's not a single-entity tag
-                               // If the top of the stack is the same as the tag we want to push, close previous tag
-                               if ( $stacksize > 0 && ! in_array( $tag, $nestable_tags ) && $tagstack[ $stacksize - 1 ] == $tag ) {
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 } elseif ( $is_single_tag ) { // ElseIf it's a known single-entity tag but it doesn't close itself, do so
+                               $pre_attribute_ws = ' ';
+                               $attributes      .= '/';
+                       } else { // It's not a single-entity tag.
+                               // If the top of the stack is the same as the tag we want to push, close previous tag.
+                               if ( $stacksize > 0 && ! in_array( $tag, $nestable_tags, true ) && $tagstack[ $stacksize - 1 ] === $tag ) {
</ins><span class="cx" style="display: block; padding: 0 10px">                                         $tagqueue = '</' . array_pop( $tagstack ) . '>';
</span><span class="cx" style="display: block; padding: 0 10px">                                        $stacksize--;
</span><span class="cx" style="display: block; padding: 0 10px">                                }
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2529,14 +2566,14 @@
</span><span class="cx" style="display: block; padding: 0 10px">                                $stacksize = array_push( $tagstack, $tag );
</span><span class="cx" style="display: block; padding: 0 10px">                        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        // Attributes
-                       $attributes = $regex[2];
-                       if ( ! empty( $attributes ) && $attributes[0] != '>' ) {
-                               $attributes = ' ' . $attributes;
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 // Attributes.
+                       if ( $has_self_closer && $is_single_tag ) {
+                               // We need some space - avoid <br/> and prefer <br />.
+                               $pre_attribute_ws = ' ';
</ins><span class="cx" style="display: block; padding: 0 10px">                         }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                        $tag = '<' . $tag . $attributes . '>';
-                       //If already queuing a close tag, then put this tag on, too
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                 $tag = '<' . $tag . $pre_attribute_ws . $attributes . '>';
+                       // If already queuing a close tag, then put this tag on too.
</ins><span class="cx" style="display: block; padding: 0 10px">                         if ( ! empty( $tagqueue ) ) {
</span><span class="cx" style="display: block; padding: 0 10px">                                $tagqueue .= $tag;
</span><span class="cx" style="display: block; padding: 0 10px">                                $tag       = '';
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -2546,18 +2583,17 @@
</span><span class="cx" style="display: block; padding: 0 10px">                $text     = substr( $text, $i + $l );
</span><span class="cx" style="display: block; padding: 0 10px">        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-        // Clear Tag Queue
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ // Clear Tag Queue.
</ins><span class="cx" style="display: block; padding: 0 10px">         $newtext .= $tagqueue;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-        // Add Remaining text
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ // Add remaining text.
</ins><span class="cx" style="display: block; padding: 0 10px">         $newtext .= $text;
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-        // Empty Stack
</del><span class="cx" style="display: block; padding: 0 10px">         while ( $x = array_pop( $tagstack ) ) {
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-                $newtext .= '</' . $x . '>'; // Add remaining tags to close
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         $newtext .= '</' . $x . '>'; // Add remaining tags to close.
</ins><span class="cx" style="display: block; padding: 0 10px">         }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><del style="background-color: #fdd; text-decoration:none; display:block; padding: 0 10px">-        // WP fix for the bug with HTML comments
</del><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+ // WP fix for the bug with HTML comments.
</ins><span class="cx" style="display: block; padding: 0 10px">         $newtext = str_replace( '< !--', '<!--', $newtext );
</span><span class="cx" style="display: block; padding: 0 10px">        $newtext = str_replace( '<    !--', '< !--', $newtext );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span></span></pre></div>
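<p>The following sketch isolates how the capture groups feed the rebuilt opening tag in the diff above. It reuses the variable names from the changeset but ignores the tag stack entirely. The helper `sketch_rebuild_opening_tag()` is hypothetical, and the `$single_tags` list here is an assumed short subset, since the real list is defined outside this diff.</p>

```php
<?php
// Illustrative sketch only: the pattern from the diff, joined into one string.
$tag_pattern = '#<(/?)((?:[a-z](?:[a-z0-9._]*)-(?:[a-z0-9._-]+)+)|(?:[\w:]+))(?:\s*(/?)|(\s+)([^>]*))>#';

// Assumption: a short subset of void elements, for illustration.
$single_tags = array( 'br', 'hr', 'img', 'input' );

/**
 * Hypothetical helper (not in WordPress): rebuilds the first opening tag in
 * $html the way the "Begin Tag" branch does, without any stack handling.
 */
function sketch_rebuild_opening_tag( $html, $tag_pattern, $single_tags ) {
	if ( ! preg_match( $tag_pattern, $html, $regex ) || ! empty( $regex[1] ) ) {
		return null; // No supported tag found, or the first tag is a closing tag.
	}

	$tag              = strtolower( $regex[2] );
	$is_single_tag    = in_array( $tag, $single_tags, true );
	$pre_attribute_ws = isset( $regex[4] ) ? $regex[4] : '';
	$attributes       = trim( isset( $regex[5] ) ? $regex[5] : $regex[3] );
	$has_self_closer  = '/' === substr( $attributes, -1 );

	if ( $has_self_closer && ! $is_single_tag ) {
		// Not a void element: drop the slash and close the tag explicitly.
		$attributes = trim( substr( $attributes, 0, -1 ) ) . "></$tag";
	} elseif ( ! $has_self_closer && $is_single_tag ) {
		// Void element without a slash: add one.
		$pre_attribute_ws = ' ';
		$attributes      .= '/';
	}

	if ( $has_self_closer && $is_single_tag ) {
		$pre_attribute_ws = ' '; // Prefer <br /> over <br/>.
	}

	return '<' . $tag . $pre_attribute_ws . $attributes . '>';
}

echo sketch_rebuild_opening_tag( '<br/>', $tag_pattern, $single_tags ), "\n";     // <br />
echo sketch_rebuild_opening_tag( '<STRONG/>', $tag_pattern, $single_tags ), "\n"; // <strong></strong>
```

<p>Groups 4 and 5 are only present when the tag has whitespace followed by attributes, which is why the code falls back to group 3 for tags such as `&lt;br/>`.</p>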
<a id="trunktestsphpunittestsformattingbalanceTagsphp"></a>
<div class="modfile"><h4 style="background-color: #eee; color: inherit; margin: 1em 0; padding: 1.3em; font-size: 115%">Modified: trunk/tests/phpunit/tests/formatting/balanceTags.php</h4>
<pre class="diff"><span>
<span class="info" style="display: block; padding: 0 10px; color: #888">--- trunk/tests/phpunit/tests/formatting/balanceTags.php      2019-09-02 02:26:55 UTC (rev 45928)
+++ trunk/tests/phpunit/tests/formatting/balanceTags.php        2019-09-02 10:24:18 UTC (rev 45929)
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -37,7 +37,159 @@
</span><span class="cx" style="display: block; padding: 0 10px">                );
</span><span class="cx" style="display: block; padding: 0 10px">        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+        function supported_traditional_tag_names() {
+               return array(
+                       array( 'a' ),
+                       array( 'div' ),
+                       array( 'blockquote' ),
+                       // HTML tag names can be CAPITALIZED and are case-insensitive.
+                       array( 'A' ),
+                       array( 'dIv' ),
+                       array( 'BLOCKQUOTE' ),
+               );
+       }
+
+       function supported_custom_element_tag_names() {
+               return array(
+                       array( 'custom-element' ),
+                       array( 'my-custom-element' ),
+                       array( 'weekday-5-item' ),
+                       array( 'a-big-old-tag-name' ),
+                       array( 'with_underscores-and_the_dash' ),
+                       array( 'a-.' ),
+                       array( 'a._-.-_' ),
+               );
+       }
+
+       function invalid_tag_names() {
+               return array(
+                       array( '<0-day>inside', '&lt;0-day>inside' ), // Can't start with a number - handled by the "<3" fix.
+                       array( '<UPPERCASE-TAG>inside', '<UPPERCASE-TAG>inside' ), // Custom elements cannot be uppercase.
+               );
+       }
+
</ins><span class="cx" style="display: block; padding: 0 10px">         /**
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+         * These are valid custom elements but we don't support them yet.
+        *
+        * @see https://w3c.github.io/webcomponents/spec/custom/#valid-custom-element-name
+        */
+       function unsupported_valid_tag_names() {
+               return array(
+                       // We don't allow ending in a dash.
+                       array( 'what-' ),
+                       // Examples from the spec working document.
+                       array( 'math-α' ),
+                       array( 'emotion-😍' ),
+                       // UNICODE ranges
+                       // 0x00b7
+                       array( 'b-·' ),
+                       // Latin characters with accents/modifiers.
+                       // 0x00c0-0x00d6
+                       // 0x00d8-0x00f6
+                       array( 'a-À-Ó-Ý' ),
+                       // 0x00f8-0x037d
+                       array( 'a-ͳ' ),
+                       // No 0x037e, which is a Greek semicolon.
+                       // 0x037f-0x1fff
+                       array( 'a-Ფ' ),
+                       // Zero-width characters, probably never supported.
+                       // 0x200c-0x200d
+                       array( 'a-‌to-my-left-is-a-zero-width-non-joiner-do-not-delete-it' ),
+                       array( 'a-‍to-my-left-is-a-zero-width-joiner-do-not-delete-it' ),
+                       // Ties.
+                       // 0x203f-0x2040
+                       array( 'under-‿-tie' ),
+                       array( 'over-⁀-tie' ),
+                       // 0x2070-0x218f
+                       array( 'a-⁰' ),
+                       array( 'a-⅀' ),
+                       array( 'tag-ↀ-it' ),
+                       // 0x2c00-0x2fef
+                       array( 'a-Ⰰ' ),
+                       array( 'b-ⴓ-c' ),
+                       array( 'd-⽗' ),
+                       // 0x3001-0xd7ff
+                       array( 'a-、' ),
+                       array( 'z-态' ),
+                       array( 'a-送-䠺-ퟱ-퟿' ),
+                       // 0xf900-0xfdcf
+                       array( 'a-豈' ),
+                       array( 'my-切' ),
+                       array( 'aﴀ-tag' ),
+                       array( 'my-﷌' ),
+                       // 0xfdf0-0xfffd
+                       array( 'a-ﷰ' ),
+                       array( 'a-￰-￸-�' ), // Warning; blank characters are in there.
+                       // Extended ranges.
+                       // 0x10000-0xeffff
+                       array( 'a-𐀀' ),
+                       array( 'my-𝀀' ),
+                       array( 'a𞀀-𜿐' ),
+               );
+       }
+
+       /**
+        * These are invalid custom elements but we support them right now in order to keep the parser simpler.
+        *
+        * @see https://w3c.github.io/webcomponents/spec/custom/#valid-custom-element-name
+        */
+       function supported_invalid_tag_names() {
+               return array(
+                       // Reserved names for custom elements.
+                       array( 'annotation-xml' ),
+                       array( 'color-profile' ),
+                       array( 'font-face' ),
+                       array( 'font-face-src' ),
+                       array( 'font-face-uri' ),
+                       array( 'font-face-format' ),
+                       array( 'font-face-name' ),
+                       array( 'missing-glyph' ),
+               );
+       }
+
+       /**
+        * @ticket 47014
+        * @dataProvider supported_traditional_tag_names
+        */
+       function test_detects_traditional_tag_names( $tag ) {
+               $normalized = strtolower( $tag );
+
+               $this->assertEquals( "<$normalized>inside</$normalized>", balanceTags( "<$tag>inside", true ) );
+       }
+
+       /**
+        * @ticket 47014
+        * @dataProvider supported_custom_element_tag_names
+        */
+       function test_detects_supported_custom_element_tag_names( $tag ) {
+               $this->assertEquals( "<$tag>inside</$tag>", balanceTags( "<$tag>inside", true ) );
+       }
+
+       /**
+        * @ticket 47014
+        * @dataProvider invalid_tag_names
+        */
+       function test_ignores_invalid_tag_names( $input, $output ) {
+               $this->assertEquals( $output, balanceTags( $input, true ) );
+       }
+
+       /**
+        * @ticket 47014
+        * @dataProvider unsupported_valid_tag_names
+        */
+       function test_ignores_unsupported_custom_tag_names( $tag ) {
+               $this->assertEquals( "<$tag>inside", balanceTags( "<$tag>inside", true ) );
+       }
+
+       /**
+        * @ticket 47014
+        * @dataProvider supported_invalid_tag_names
+        */
+       function test_detects_supported_invalid_tag_names( $tag ) {
+               $this->assertEquals( "<$tag>inside</$tag>", balanceTags( "<$tag>inside", true ) );
+       }
+
+       /**
</ins><span class="cx" style="display: block; padding: 0 10px">          * If a recognized valid single tag appears unclosed, it should get self-closed
</span><span class="cx" style="display: block; padding: 0 10px">         *
</span><span class="cx" style="display: block; padding: 0 10px">         * @ticket 1597
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -68,6 +220,7 @@
</span><span class="cx" style="display: block; padding: 0 10px">                        '<em />',
</span><span class="cx" style="display: block; padding: 0 10px">                        '<p class="main1"/>',
</span><span class="cx" style="display: block; padding: 0 10px">                        '<p class="main2" />',
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        '<STRONG/>',
</ins><span class="cx" style="display: block; padding: 0 10px">                 );
</span><span class="cx" style="display: block; padding: 0 10px">                $expected = array(
</span><span class="cx" style="display: block; padding: 0 10px">                        '<strong></strong>',
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -74,6 +227,8 @@
</span><span class="cx" style="display: block; padding: 0 10px">                        '<em></em>',
</span><span class="cx" style="display: block; padding: 0 10px">                        '<p class="main1"></p>',
</span><span class="cx" style="display: block; padding: 0 10px">                        '<p class="main2"></p>',
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+                        // Valid tags are transformed to lowercase.
+                       '<strong></strong>',
</ins><span class="cx" style="display: block; padding: 0 10px">                 );
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><span class="cx" style="display: block; padding: 0 10px">                foreach ( $inputs as $key => $input ) {
</span><span class="lines" style="display: block; padding: 0 10px; color: #888">@@ -221,4 +376,68 @@
</span><span class="cx" style="display: block; padding: 0 10px">                }
</span><span class="cx" style="display: block; padding: 0 10px">        }
</span><span class="cx" style="display: block; padding: 0 10px"> 
</span><ins style="background-color: #dfd; text-decoration:none; display:block; padding: 0 10px">+        /**
+        * Get custom element data.
+        *
+        * @return array Data.
+        */
+       public function data_custom_elements() {
+               return array(
+                       // Valid custom element tags.
+                       array(
+                               '<my-custom-element data-attribute="value"/>',
+                               '<my-custom-element data-attribute="value"></my-custom-element>',
+                       ),
+                       array(
+                               '<my-custom-element>Test</my-custom-element>',
+                               '<my-custom-element>Test</my-custom-element>',
+                       ),
+                       array(
+                               '<my-custom-element>Test',
+                               '<my-custom-element>Test</my-custom-element>',
+                       ),
+                       array(
+                               'Test</my-custom-element>',
+                               'Test',
+                       ),
+                       array(
+                               '</my-custom-element>Test',
+                               'Test',
+                       ),
+                       array(
+                               '<my-custom-element/>',
+                               '<my-custom-element></my-custom-element>',
+                       ),
+                       array(
+                               '<my-custom-element />',
+                               '<my-custom-element></my-custom-element>',
+                       ),
+                       // Invalid (or at least temporarily unsupported) custom element tags.
+                       array(
+                               '<MY-CUSTOM-ELEMENT>Test',
+                               '<MY-CUSTOM-ELEMENT>Test',
+                       ),
+                       array(
+                               '<my->Test',
+                               '<my->Test',
+                       ),
+                       array(
+                               '<--->Test',
+                               '<--->Test',
+                       ),
+               );
+       }
+
+       /**
+        * Test custom elements.
+        *
+        * @ticket 47014
+        * @dataProvider data_custom_elements
+        *
+        * @param string $source   Source.
+        * @param string $expected Expected.
+        */
+       public function test_custom_elements( $source, $expected ) {
+               $this->assertEquals( $expected, balanceTags( $source, true ) );
+       }
</ins><span class="cx" style="display: block; padding: 0 10px"> }
</span></span></pre>
</div>
</div>

</body>
</html>