[wp-trac] [WordPress Trac] #35022: WP allows Unicode 0x00a0 spaces in editor but shortcode parser can't handle them
WordPress Trac
noreply at wordpress.org
Wed Apr 20 14:49:58 UTC 2016
#35022: WP allows Unicode 0x00a0 spaces in editor but shortcode parser can't handle
them
--------------------------+-----------------------------
Reporter: steevithak | Owner:
Type: defect (bug) | Status: assigned
Priority: normal | Milestone: Future Release
Component: Shortcodes | Version: 4.4
Severity: normal | Resolution:
Keywords: needs-patch | Focuses:
--------------------------+-----------------------------
Comment (by gitlost):
It turns out one can't rely on PCRE being built with UTF-8 support. Also,
the check for `U+00A0` should only happen when the blog charset is UTF-8.
(Apart from that the previous patch was fine.)
So I think the simplest thing is to use PCRE in single-byte mode and to
only use the extended check when the blog charset is UTF-8, which the
above patch does with conditional defines (it now needs two of them, one
for a positive match and one for a negative match). Extra defines could be
added for particular legacy charsets like latin1 if wanted.
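To illustrate the single-byte approach, here is a rough sketch in Python rather than the PHP of the patch itself. The function names and the charset test are made up for the example and are not taken from the patch. It picks a positive and a negative "space" pattern depending on the charset, and matches on raw bytes so that no UTF-8 support is needed in the regex engine.

```python
import re


def space_patterns(charset):
    """Return (positive, negative) byte-level regex fragments for "space".

    Hypothetical helper: stands in for the conditional defines, which
    are not shown in this comment.
    """
    if charset.lower().replace("-", "") == "utf8":
        # U+00A0 is the byte pair C2 A0 in UTF-8. C2 is a lead byte, so in
        # valid UTF-8 this pair can only ever be a whole U+00A0 character.
        return rb"(?:\s|\xc2\xa0)", rb"(?:(?!\xc2\xa0)\S)"
    # Any other charset: plain ASCII whitespace only.
    return rb"\s", rb"\S"


def split_words(data, charset):
    """Split raw bytes into words using the charset-dependent patterns."""
    space, not_space = space_patterns(charset)
    # Each word must start at the beginning of the input or right after a
    # run of spaces, so a match can never begin in the middle of C2 A0.
    pattern = re.compile(rb"(?:\A|" + space + rb"+)(" + not_space + rb"+)")
    return pattern.findall(data)


text = 'gallery ids="1,2"\u00a0size="large"'.encode("utf-8")
print(split_words(text, "UTF-8"))
print(split_words(text, "ISO-8859-1"))
```

With the UTF-8 charset the non-breaking space separates `ids="1,2"` from `size="large"`. With any other charset the bytes C2 A0 are left alone and stay inside the second word.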
(I spent quite a bit of time trying to track down where TinyMCE was adding
the `U+00A0`s, and (as any schoolboy knows) it turns out it doesn't:
it's the browser automatically adding `&nbsp;`s to `ContentEditable`
divs, which TinyMCE then encodes as `U+00A0`s. Chrome alternates them
with ordinary spaces ("space `&nbsp;` space `&nbsp;`" etc) while Firefox
does "`&nbsp;` ... space". IE does something similar to Firefox, but more
microsofty. So there you go...)
--
Ticket URL: <https://core.trac.wordpress.org/ticket/35022#comment:22>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform