[wp-trac] [WordPress Trac] #26842: Contenteditable, multiple spaces, &nbsp;, and U+00A0
WordPress Trac
noreply at wordpress.org
Wed Jan 15 17:03:32 UTC 2014
#26842: Contenteditable, multiple spaces, &nbsp;, and U+00A0
-------------------------+-----------------
 Reporter:  azaozz       |      Owner:
     Type:  enhancement  |     Status:  new
 Priority:  normal       |  Milestone:  3.9
Component:  TinyMCE      |    Version:
 Severity:  normal       |   Keywords:
-------------------------+-----------------
In contenteditable mode, when the user types multiple spaces (ASCII char
32, U+0020), they are preserved. Browsers insert `&nbsp;` as every
other character, so the string becomes ` &nbsp; &nbsp;` etc.
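To illustrate the pattern, here is a minimal Python sketch. It assumes each run of spaces starts with a regular space and then alternates with U+00A0; actual browsers may start with the non-breaking space or handle the ends of a run differently. The helper name `alternate_spaces` is made up for this example.

```python
import re

NBSP = "\u00a0"  # the character that the entity &nbsp; decodes to


def alternate_spaces(text):
    """Rewrite each run of 2+ spaces so every other character is U+00A0.

    Assumption: the run starts with a regular space. Real browsers may
    differ in which character comes first.
    """
    def repl(match):
        n = len(match.group(0))
        return "".join(" " if i % 2 == 0 else NBSP for i in range(n))

    return re.sub(r" {2,}", repl, text)


typed = "a    b"  # four spaces typed by the user
stored = alternate_spaces(typed)
print(stored.replace(NBSP, "&nbsp;"))  # a &nbsp; &nbsp;b
```

No two regular spaces end up adjacent, which is why the run is not collapsed when rendered.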
In WordPress, TinyMCE is set to:
{{{
'entities' => '38,amp,60,lt,62,gt',
'entity_encoding' => 'raw',
}}}
Anything other than the three basic "htmlspecialchars" characters `&`, `<` and
`>` is output as raw UTF-8 when the DOM is serialized. The (multiple)
`&nbsp;` are therefore output as U+00A0, which PHP sees as the two bytes `0xC2
0xA0` ([http://en.wikipedia.org/wiki/Non-breaking_space reference]).
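A simplified model of that serialization, in Python. It is not TinyMCE's serializer; `parse_entities` and `serialize_raw` are hypothetical names, and the sketch only assumes what is described above: characters listed in `entities` become named entities and everything else is left raw.

```python
def parse_entities(setting):
    """Turn 'code,name,code,name,...' into {character: '&name;'}."""
    parts = setting.split(",")
    codes, names = parts[0::2], parts[1::2]
    return {chr(int(code)): "&" + name + ";" for code, name in zip(codes, names)}


def serialize_raw(text, entities):
    """Encode only the listed characters; leave everything else as-is."""
    return "".join(entities.get(ch, ch) for ch in text)


entities = parse_entities("38,amp,60,lt,62,gt")
out = serialize_raw("a <b> & \u00a0c", entities)

print(ascii(out))                  # 'a &lt;b&gt; &amp; \xa0c'
print("\u00a0".encode("utf-8"))    # b'\xc2\xa0'
```

The non-breaking space passes through untouched, and its UTF-8 form is the byte pair `0xC2 0xA0`.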
A problem with `0xC2 0xA0` is that in PHP the regex `\s` matches the single
byte `0xA0` in certain cases. It then fails to match the whole "white space"
character, splits the UTF-8 sequence, and sometimes leaves an `Â` (the
orphaned `0xC2` byte) behind. One example is wptexturize(), see #22692.
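The byte-level breakage can be reproduced outside PHP. The Python sketch below assumes a byte-oriented matcher whose whitespace class includes `0xA0`, as a stand-in for what `\s` does in PHP in the cases mentioned above; it is not a reproduction of wptexturize().

```python
import re

text = "1\u00a02"                    # digit, U+00A0, digit
raw = text.encode("utf-8")           # b'1\xc2\xa02'

# Assumption: a byte-wise whitespace class that also contains 0xA0,
# standing in for PHP's \s "in certain cases".
broken = re.sub(rb"[\s\xa0]", b" ", raw)
print(broken)                        # b'1\xc2 2'  (0xC2 is now orphaned)

# Read as Latin-1, the orphaned 0xC2 byte shows up as U+00C2.
leftover = broken.decode("latin-1")
assert leftover == "1\u00c2 2"

# Matching on decoded characters keeps U+00A0 whole.
fixed = re.sub(r"\s", " ", text)
print(fixed.encode("utf-8"))         # b'1 2'
```

Matching on decoded characters instead of bytes avoids the split. In PHP the `u` pattern modifier makes PCRE treat pattern and subject as UTF-8, though whether that alone is the right fix here is not established by this ticket.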
Another problem is that the user is not aware there are multiple `&nbsp;`
when looking in the Text editor or the HTML source, as U+00A0 characters are
"invisible".
--
Ticket URL: <https://core.trac.wordpress.org/ticket/26842>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform