[wp-trac] [WordPress Trac] #35022: WP allows Unicode 0x00a0 spaces in editor but shortcode parser can't handle them

WordPress Trac noreply at wordpress.org
Sat Mar 19 04:25:10 UTC 2016


#35022: WP allows Unicode 0x00a0 spaces in editor but shortcode parser can't handle
them
--------------------------+-----------------------------
 Reporter:  steevithak    |       Owner:
     Type:  defect (bug)  |      Status:  assigned
 Priority:  normal        |   Milestone:  Future Release
Component:  Shortcodes    |     Version:  4.4
 Severity:  normal        |  Resolution:
 Keywords:  needs-patch   |     Focuses:
--------------------------+-----------------------------

Comment (by gitlost):

 @boonebgorges the non-breaking space needs to be in its UTF-8 form,
 `\xC2\xA0`; that is why the `preg_replace()` in `shortcode_parse_atts()` is
 failing.
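
 As a rough illustration of the byte forms involved (in Python rather than
 PHP, so it only mirrors the encoding facts, not PCRE itself): U+00A0
 encodes to the two bytes `\xC2\xA0`, and a lone `\xA0` byte is not
 well-formed UTF-8. The helper name and the sample strings are made up for
 this sketch.

```python
# U+00A0 (no-break space) encodes to two bytes in UTF-8.
nbsp_utf8 = "\u00a0".encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Sample inputs invented for this sketch, not taken from the ticket.
well_formed = b'[gallery\xc2\xa0ids="1"]'   # NBSP as the UTF-8 pair
latin1_style = b'[gallery\xa0ids="1"]'      # NBSP as a single Latin-1 byte
```

 Only the first sample is valid UTF-8; the second contains a continuation
 byte with no lead byte in front of it.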

 @steevithak `wptexturize()` in "wp-includes/formatting.php" is substituting
 the quote characters: it uses its own copy of the shortcode tagname regex,
 so it isn't recognizing the shortcode.

 I'll upload versions of the patches with those changes, using a `define`
 to put the terminators in one place. I also think it might be better to
 add `\x{00A0}` explicitly rather than use `\s`, to be conservative. There
 might also be a case for adding the zero width space `\x{200B}`, which is
 mentioned in other tests in "tests/phpunit/tests/shortcode.php" (and isn't
 covered by `\s`).
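
 A minimal sketch of the "explicit terminators" idea, written in Python as
 a stand-in for the PHP regex. The terminator set, the pattern, and the
 function name are assumptions made for illustration, not the strings used
 in core or in the patches.

```python
import re

# Assumed terminator set for a shortcode tag name, with U+00A0 and U+200B
# listed explicitly instead of relying on \s.
TERMINATORS = r"<>&/\[\]\x00-\x20=\u00A0\u200B"
TAGNAME = re.compile(r"\[([^" + TERMINATORS + r"]+)")

# Same set without the two extra code points, for comparison.
OLD_TAGNAME = re.compile(r"\[([^<>&/\[\]\x00-\x20=]+)")


def tag_name(text, pattern=TAGNAME):
    """Return the first shortcode-like tag name in text, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


nbsp_case = '[gallery\u00a0ids="1"]'   # no-break space after the tag
zwsp_case = '[gallery\u200bids="1"]'   # zero width space after the tag
```

 With the explicit set, `tag_name()` stops at either character and yields
 `gallery` for both samples; with `OLD_TAGNAME` the name runs on up to the
 `=`. In Python's Unicode mode `\s` matches U+00A0 but not U+200B, which is
 consistent with the point above about `\s` not covering the zero width
 space.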

 An alternative to changing to UTF-8 mode would be to stay in single-byte
 mode and add `\xC2` to the terminators. Because `\xC2` is the UTF-8 lead
 byte of every character from `U+0080` to `U+00BF`, this would rule out all
 of them ([http://www.fileformat.info/info/charset/UTF-8/list.htm]), but
 they seem to be controls and/or symbols, so they are unlikely to be used
 in shortcode tagnames.
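
 The single-byte alternative can be sketched the same way, again in Python
 and again with an assumed terminator set and made-up names. The pattern
 works on raw bytes and treats `\xC2` as a terminator.

```python
import re

# Assumed byte-mode terminator set with the UTF-8 lead byte \xC2 added.
BYTE_TERMINATORS = rb"<>&/\[\]\x00-\x20=\xc2"
BYTE_TAGNAME = re.compile(rb"\[([^" + BYTE_TERMINATORS + rb"]+)")


def byte_tag_name(raw: bytes):
    """Return the first shortcode-like tag name in raw bytes, or None."""
    match = BYTE_TAGNAME.search(raw)
    return match.group(1) if match else None


# Every code point from U+0080 to U+00BF starts with \xC2 in UTF-8.
all_start_with_c2 = all(
    chr(cp).encode("utf-8")[0] == 0xC2 for cp in range(0x80, 0xC0)
)

nbsp_tag = byte_tag_name('[gallery\u00a0ids="1"]'.encode("utf-8"))
accent_tag = byte_tag_name("[caf\u00e9]".encode("utf-8"))    # U+00E9: \xC3\xA9
pound_tag = byte_tag_name("[price\u00a3]".encode("utf-8"))   # U+00A3: \xC2\xA3
```

 Here the no-break space ends the tag name, a name containing U+00E9 is
 unaffected because its lead byte is `\xC3`, and a name containing U+00A3
 is cut short, which is the trade-off described above.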

--
Ticket URL: <https://core.trac.wordpress.org/ticket/35022#comment:18>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

