[wp-trac] [WordPress Trac] #35022: WP allows Unicode 0x00a0 spaces in editor but shortcode parser can't handle them
WordPress Trac
noreply at wordpress.org
Sat Mar 19 04:25:10 UTC 2016
#35022: WP allows Unicode 0x00a0 spaces in editor but shortcode parser can't handle
them
--------------------------+-----------------------------
Reporter: steevithak | Owner:
Type: defect (bug) | Status: assigned
Priority: normal | Milestone: Future Release
Component: Shortcodes | Version: 4.4
Severity: normal | Resolution:
Keywords: needs-patch | Focuses:
--------------------------+-----------------------------
Comment (by gitlost):
@boonebgorges the non-breaking space should be in UTF-8, `\xC2\xA0`, which
is why the `preg_replace()` in `shortcode_parse_atts()` is failing.
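To make the byte-level point concrete, here is a minimal sketch in Python rather than PHP, assuming a byte-oriented (non-UTF-8-mode) regex. The function names and the sample shortcode are hypothetical, and Python's `re` is only a stand-in for PCRE here. It shows that U+00A0 is the two-byte sequence `\xC2\xA0` in UTF-8, so a single-byte pattern that only knows `\xA0` leaves a stray `\xC2` behind.

```python
import re

NBSP = "\u00a0"  # non-breaking space, U+00A0


def replace_single_byte(data: bytes) -> bytes:
    """Byte-mode replace that only targets the byte 0xA0 (illustrative, incomplete)."""
    return re.sub(rb"[\xa0]+", b" ", data)


def replace_utf8_sequence(data: bytes) -> bytes:
    """Byte-mode replace that targets the full UTF-8 sequence 0xC2 0xA0."""
    return re.sub(rb"(?:\xc2\xa0)+", b" ", data)


# Hypothetical shortcode text with a non-breaking space after the tag name.
sample = f'[gallery{NBSP}ids="1,2"]'.encode("utf-8")

print(NBSP.encode("utf-8"))           # the two bytes that make up U+00A0
print(replace_single_byte(sample))    # 0xC2 is left dangling before the space
print(replace_utf8_sequence(sample))  # clean ASCII space
```

Whether the pattern is written as the byte pair or as `\x{00A0}` under UTF-8 mode, the point is the same: the pattern and the subject have to agree on the encoding.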
@steevithak `wptexturize()` in "wp-includes/formatting.php" is substituting
the quote characters: it uses its own copy of the shortcode tagname regex,
so it isn't recognizing the shortcode.
I'll upload versions of the patches with those changes, using a `define`
to put the terminators in one place. Also I think it might be better just
to add `\x{00A0}` explicitly rather than use `\s`, to be conservative.
There might also be a case for adding the zero width space `\x{200B}`, which
is mentioned in other tests in "tests/phpunit/tests/shortcode.php" (and
isn't covered by `\s`).
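As a rough illustration of keeping the terminators in one place and listing them explicitly, the following Python sketch may help. The constant `EXTRA_TERMINATORS`, the helper names, and the character class are all hypothetical; the class is a simplified stand-in, not the actual WordPress tagname regex, and Python's Unicode `\s` is not guaranteed to behave exactly like `\s` in PCRE. It shows a tag name being cut at U+00A0 and U+200B, and that `\s` alone does not match U+200B.

```python
import re

# Hypothetical single definition of the extra tagname terminators:
# non-breaking space (U+00A0) and zero width space (U+200B).
EXTRA_TERMINATORS = "\u00a0\u200b"

# Simplified stand-in for a tagname class, with and without the extras.
TAGNAME_PLAIN = re.compile(r"\[([^<>&/\[\]\x00-\x20=]+)")
TAGNAME_EXTRA = re.compile(r"\[([^<>&/\[\]\x00-\x20=" + EXTRA_TERMINATORS + r"]+)")


def tag_plain(text: str) -> str:
    """Tag name found when the extra terminators are NOT in the class."""
    return TAGNAME_PLAIN.match(text).group(1)


def tag_extra(text: str) -> str:
    """Tag name found when U+00A0 and U+200B terminate the name."""
    return TAGNAME_EXTRA.match(text).group(1)


def matched_by_s(ch: str) -> bool:
    """True if Python's Unicode-aware \\s matches the single character."""
    return re.fullmatch(r"\s", ch) is not None


print(repr(tag_plain('[gallery\u00a0ids="1"]')))  # name swallows the NBSP
print(repr(tag_extra('[gallery\u00a0ids="1"]')))  # name stops at the NBSP
print(matched_by_s("\u00a0"), matched_by_s("\u200b"))
```

Listing the two code points explicitly keeps the change narrow, whereas `\s` would pull in every whitespace character the engine recognizes and still miss U+200B.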
An alternative to changing to UTF-8 mode would be to stay in single-byte
mode and add `\xC2` to the terminators. As `\xC2` is the UTF-8 lead byte for
every character from `U+0080` to `U+00BF`, this would rule all of them out
([http://www.fileformat.info/info/charset/UTF-8/list.htm]), but they seem
to be controls and/or symbols, so they are unlikely to be used in shortcode
tagnames.
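The single-byte alternative can be sketched the same way, again in Python with hypothetical names and a simplified stand-in for the tagname class. It checks that every code point from U+0080 to U+00BF encodes with lead byte `\xC2`, that the next block (U+00C0 onward) does not, and that a byte-mode class with `\xC2` as a terminator stops the tag name at a non-breaking space while still accepting a character such as `é`.

```python
import re

# Byte-mode stand-in for the tagname class, with 0xC2 added as a terminator.
TAGNAME_BYTES = re.compile(rb"\[([^<>&/\[\]\x00-\x20=\xc2]+)")


def tag_bytes(data: bytes) -> bytes:
    """Tag name found in UTF-8 encoded bytes, stopping at any 0xC2 lead byte."""
    return TAGNAME_BYTES.match(data).group(1)


def lead_bytes(first: int, last: int) -> set:
    """Set of UTF-8 lead bytes used by code points first..last inclusive."""
    return {chr(cp).encode("utf-8")[0] for cp in range(first, last + 1)}


print(lead_bytes(0x80, 0xBF))  # only 0xC2 (194)
print(lead_bytes(0xC0, 0xFF))  # only 0xC3 (195), so these are unaffected
print(tag_bytes('[gallery\u00a0ids="1"]'.encode("utf-8")))
print(tag_bytes("[caf\u00e9 x]".encode("utf-8")))
```

Note that this approach would not terminate on the zero width space U+200B, which encodes as `\xE2\x80\x8B` and so has a different lead byte.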
--
Ticket URL: <https://core.trac.wordpress.org/ticket/35022#comment:18>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform