[wp-trac] [WordPress Trac] #60385: HTML API: Text nodes may be incorrectly split

WordPress Trac noreply at wordpress.org
Tue Feb 6 19:21:51 UTC 2024


#60385: HTML API: Text nodes may be incorrectly split
--------------------------------------+-------------------------
 Reporter:  jonsurrell                |       Owner:  jonsurrell
     Type:  defect (bug)              |      Status:  closed
 Priority:  normal                    |   Milestone:  6.5
Component:  HTML API                  |     Version:  trunk
 Severity:  normal                    |  Resolution:  fixed
 Keywords:  has-patch has-unit-tests  |     Focuses:
--------------------------------------+-------------------------
Changes (by dmsnell):

 * status:  reopened => closed
 * resolution:   => fixed


Comment:

 In [changeset:"57542" 57542]:
 {{{
 #!CommitTicketReference repository="" revision="57542"
 HTML API: Join text nodes on invalid-tag-name boundaries.

 A fix was introduced to the Tag Processor to ensure that contiguous text
 in an HTML document emerges as a single text node spanning the full
 sequence. Unfortunately, that patch was slightly too permissive when
 checking whether a "<" started a syntax token. It used the following:

 {{{
 <?php
 if ( 'A' <= $c && 'z' >= $c ) { ... }
 }}}

 This was based on the assumption that the A-Z and a-z letters are
 contiguous in the ASCII range; they aren't: six characters
 ([ \ ] ^ _ `) sit between "Z" and "a". As a result, in some cases
 the parser created a text boundary when it didn't need to.
 Text boundaries can be surprising and can be created when reaching
 invalid syntax, HTML comments, and other hidden elements, so
 semantically this wasn't a major bug, but it was an aesthetic
 challenge.

 This patch checks the upper- and lower-case ranges separately, so
 that only letters that could potentially form tag names match.

 {{{
 <?php
 if ( ( 'A' <= $c && 'Z' >= $c ) || ( 'a' <= $c && 'z' >= $c ) ) { ... }
 }}}

 This solves the problem and ensures that contiguous text appears
 as a single text node when scanning tokens.

 Developed in https://github.com/WordPress/wordpress-develop/pull/6041
 Discussed in https://core.trac.wordpress.org/ticket/60385

 Follow-up to [57489]
 Props dmsnell, jonsurrell
 Fixes #60385
 }}}
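
 The difference between the two checks quoted in the commit message can
 be seen in a small standalone sketch. The function names
 loose_letter_check() and strict_letter_check() are hypothetical and
 exist only for this illustration; they are not part of the HTML API,
 and the real Tag Processor code may be structured differently.

```php
<?php
// Hypothetical helper mirroring the original single-range check.
function loose_letter_check( string $c ): bool {
	return 'A' <= $c && 'z' >= $c;
}

// Hypothetical helper mirroring the corrected two-range check.
function strict_letter_check( string $c ): bool {
	return ( 'A' <= $c && 'Z' >= $c ) || ( 'a' <= $c && 'z' >= $c );
}

// Collect the ASCII characters that sit between "Z" and "a".
$gap = array();
for ( $code = ord( 'Z' ) + 1; $code < ord( 'a' ); $code++ ) {
	$gap[] = chr( $code );
}

// Prints the six characters in the gap: [ \ ] ^ _ `
echo implode( ' ', $gap ), "\n";

// Each gap character passes the loose check but fails the strict one.
foreach ( $gap as $c ) {
	printf(
		"%s loose=%s strict=%s\n",
		$c,
		var_export( loose_letter_check( $c ), true ),
		var_export( strict_letter_check( $c ), true )
	);
}
```

 Under the single-range check, a "<" followed by any of these six
 characters looks like a possible tag opener; under the two-range check
 it does not.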

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/60385#comment:6>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

