[wp-trac] [WordPress Trac] #35022: WP allows Unicode 0x00a0 spaces in editor but shortcode parser can't handle them

WordPress Trac noreply at wordpress.org
Mon Mar 21 22:38:17 UTC 2016


#35022: WP allows Unicode 0x00a0 spaces in editor but shortcode parser can't handle
them
--------------------------+-----------------------------
 Reporter:  steevithak    |       Owner:
     Type:  defect (bug)  |      Status:  assigned
 Priority:  normal        |   Milestone:  Future Release
Component:  Shortcodes    |     Version:  4.4
 Severity:  normal        |  Resolution:
 Keywords:  needs-patch   |     Focuses:
--------------------------+-----------------------------

Comment (by gitlost):

 Hi, I think a big issue here is backward compatibility. If one were speccing
 shortcodes now it would make sense to restrict names drastically, similar
 to PHP variable names or whatever. But until
 [https://core.trac.wordpress.org/changeset/34745 this change] (see also
 the [https://make.wordpress.org/core/2015/09/29/shortcode-roadmap-draft-two/
 Shortcode Road Map Draft Two] section "New Restriction on Shortcode Names" and
 the codex [https://codex.wordpress.org/Shortcode_API#Names Shortcode API
 Names]) there wasn't any formal restriction on what characters could be
 used in shortcode names - things just didn't (or did) work.

 The restrictions introduced were pretty minimal, and it seems more in
 keeping to continue this minimalism, hence the suggestion just to add
 `\x{00A0}` (and perhaps `\x{200B}`) rather than use `\s`. (Note that the
 chars in the list you linked to are actually matched by `\s` in UTF-8 mode,
 along with LF, VT, FF and CR. The quasi-whitespace chars that `\s` doesn't
 match, though one might think it should, are `U+200B`, `U+200C`, `U+200D`,
 `U+2060` and `U+FEFF` - [https://en.wikipedia.org/wiki/Whitespace_character].)
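 To make the `\s` distinction above concrete, here is a rough check using
 Python's `re` module. This is an assumption-laden stand-in: Python's engine
 is not the PCRE engine that PHP uses, so it only illustrates the same
 split between `U+00A0` and the zero-width characters, and the names
 `CANDIDATES`, `matched_by_s` and `RESULTS` are made up for the example.

```python
import re

# Code points mentioned in the comment above (labels are for display only).
CANDIDATES = {
    "U+00A0 NO-BREAK SPACE": "\u00a0",
    "U+200B ZERO WIDTH SPACE": "\u200b",
    "U+200C ZERO WIDTH NON-JOINER": "\u200c",
    "U+200D ZERO WIDTH JOINER": "\u200d",
    "U+2060 WORD JOINER": "\u2060",
    "U+FEFF ZERO WIDTH NO-BREAK SPACE": "\ufeff",
}


def matched_by_s(ch):
    """Return True if the single character ch is matched by \\s."""
    return re.fullmatch(r"\s", ch) is not None


# Map each label to whether Python's Unicode-aware \s matches it.
RESULTS = {name: matched_by_s(ch) for name, ch in CANDIDATES.items()}

for name, hit in RESULTS.items():
    print(f"{name}: {'matched' if hit else 'not matched'} by \\s")
```

 In this engine only `U+00A0` in that list is treated as whitespace, so a
 switch to `\s` would still leave the zero-width characters unhandled.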

 The same goes for switching to `\w`, which in UTF-8 mode could also have
 performance issues, as it needs to look up large Unicode tables.

 Plus the particular use-case here is TinyMCE inserting `U+00A0` into
 content, so adding other stuff could be seen as scope creep. And besides,
 personally I use the Mongolian Vowel Separator in all my shortcode names.
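 For the TinyMCE case specifically, the minimal change can be sketched as
 follows. This is a deliberately simplified toy in Python, not the actual
 WordPress shortcode regex (which is PHP/PCRE and far more involved); the
 names `ASCII_SEP`, `EXTENDED_SEP` and `parse_tag`, and the restricted
 tag-name class, are all assumptions made for illustration.

```python
import re

# Hypothetical separator classes between a tag name and its attributes.
ASCII_SEP = r"[ \t\r\n]"                 # plain ASCII whitespace only
EXTENDED_SEP = r"[ \t\r\n\u00a0\u200b]"  # plus U+00A0 and U+200B, nothing else


def parse_tag(text, sep):
    """Parse a leading [name attrs] tag; return (name, raw_attrs) or None."""
    pattern = re.compile(
        r"\[([A-Za-z0-9_-]+)(?:" + sep + r"+([^\]]*))?\]"
    )
    m = pattern.match(text)
    if m is None:
        return None
    # group(2) is None when the tag has no attribute part.
    return m.group(1), (m.group(2) or "")


# A tag as TinyMCE might leave it, with U+00A0 after the name.
nbsp_tag = '[gallery\u00a0id="1"]'

print(parse_tag(nbsp_tag, ASCII_SEP))     # the ASCII-only class fails here
print(parse_tag(nbsp_tag, EXTENDED_SEP))  # the extended class parses it
```

 The point of listing the two extra code points explicitly, rather than
 reaching for `\s`, is that nothing else about what the pattern accepts
 changes.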

--
Ticket URL: <https://core.trac.wordpress.org/ticket/35022#comment:20>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

