[wp-trac] [WordPress Trac] #35951: remove_accents() doesn't escape Unicode NFD characters
WordPress Trac
noreply at wordpress.org
Thu Feb 25 16:18:55 UTC 2016
#35951: remove_accents() doesn't escape Unicode NFD characters
---------------------------+-----------------------------
Reporter: onnimonni | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Charset | Version: 4.4.2
Severity: normal | Keywords:
Focuses: accessibility |
---------------------------+-----------------------------
The OS X filesystem HFS+ stores filenames in Unicode '''NFD''' (decomposed)
form instead of '''NFC''' (precomposed). This causes all sorts of problems
when uploaded files with accented names are moved between environments, or
when a site is developed on an OS X machine and then pushed to production.
I'm trying to solve this problem by sanitizing all uploaded filenames with
the `remove_accents()` function. But on my test machine `remove_accents()`
doesn't do anything for NFD characters.
It should use something like `Normalizer::normalize()` to avoid this.
Sadly, Normalizer (part of the PHP intl extension) isn't available on all
systems.
I've attached a zip file which contains NFD characters. If you open it on a
Linux machine you can see a small difference between these characters and
"normal" UTF-8 accented characters like: '''öäå'''.
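
The difference is easier to see at the byte level. As a minimal sketch (the variable names are illustrative and "ö" is just one example character), the two forms of the same visible letter can be compared in PHP:

```php
<?php
// "ö" in NFC: the single precomposed code point U+00F6 (UTF-8 bytes C3 B6).
$nfc = "\xC3\xB6";
// "ö" in NFD: "o" followed by U+0308 COMBINING DIAERESIS (UTF-8 bytes CC 88).
$nfd = "o\xCC\x88";

echo bin2hex( $nfc ), "\n"; // c3b6
echo bin2hex( $nfd ), "\n"; // 6fcc88

// Both render as the same glyph, but a byte-wise comparison sees two different strings.
var_dump( $nfc === $nfd ); // bool(false)
```

A replacement table keyed on precomposed characters would not match the decomposed sequence, which would be consistent with the behavior described in this ticket.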
Try copying the contents and running them through
`remove_accents('content')`, and you will see that nothing is changed.
If you have Normalizer available, you can verify that `remove_accents()`
works when the characters are first normalized, for example:
`remove_accents(Normalizer::normalize('content'))`
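
Because Normalizer is not available everywhere, any such call would need a guard. The following is only a sketch of one possible approach, not a proposed core patch: `example_to_nfc()` is a hypothetical helper name, and it assumes that returning the input unchanged is an acceptable fallback when the intl extension is missing.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): convert a string to NFC
 * when the intl extension's Normalizer class is available.
 */
function example_to_nfc( $string ) {
    if ( ! class_exists( 'Normalizer' ) ) {
        // intl is not loaded: fall back to the unmodified input.
        return $string;
    }

    $normalized = Normalizer::normalize( $string, Normalizer::FORM_C );

    // normalize() does not return a string on failure (e.g. invalid UTF-8),
    // so keep the original input in that case.
    return is_string( $normalized ) ? $normalized : $string;
}

// "ö" in NFD ("o" + U+0308) and in NFC (U+00F6), written as raw UTF-8 bytes.
$decomposed  = "o\xCC\x88";
$precomposed = "\xC3\xB6";

// True when intl is loaded; without intl the input comes back unchanged.
var_dump( example_to_nfc( $decomposed ) === $precomposed );
```

Inside WordPress the result would then be passed on, as in `remove_accents( example_to_nfc( $filename ) )`, where `$filename` stands for the uploaded file's name.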
I realize this doesn't concern native English-speaking countries, but it's
a really big annoyance for the rest of us.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/35951>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform