[wp-trac] [WordPress Trac] #44793: remove_accents() doesnt escape all versions of "i"
WordPress Trac
noreply at wordpress.org
Thu Mar 21 21:28:01 UTC 2019
#44793: remove_accents() doesnt escape all versions of "i"
-------------------------------------+------------------------------
 Reporter:  bagosm                   |       Owner:  SergeyBiryukov
     Type:  defect (bug)             |      Status:  reviewing
 Priority:  normal                   |   Milestone:  5.3
Component:  Formatting               |     Version:
 Severity:  normal                   |  Resolution:
 Keywords:  has-patch dev-feedback   |     Focuses:
            needs-testing            |
-------------------------------------+------------------------------
Comment (by xkon):
Hey there!
I just noticed this ticket as it was moved to Future Release (a good call
imho, as it might need a lot more discussion as well!). I can handle the
Greek letters, but I need some clarification first, please, because I'm
not sure where exactly `remove_accents()` is needed and what it's used
for. I saw that in core it's used within `sanitize_title()` and
`sanitize_user()`, but it might be used in various other places or in
broader scopes that I'm not aware of at the moment.
Please bear with me while I explain my thinking and the "issues" I see
before creating a patch. Once there's a decision on what's actually
needed, I'll be more than happy to provide a patch for the Greek letters.
The function itself is called `remove_accents()`, which literally means
removing accents. The description in our Handbook, though, says `Converts
all accent characters to ASCII characters.`, and that is something totally
different: in many languages, removing an accent and converting to ASCII
(Latin) are two very different operations, and the choice changes
everything.
For example:
**Removing the accent (literally, so both are Greek letters):** ί = ι
**Converting to ASCII (so altering the locale):** ί = i
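
To make the difference concrete, here is a minimal sketch in Python (not WordPress code, and not how `remove_accents()` is implemented). It assumes accent removal can be approximated by Unicode decomposition plus dropping combining marks. The names `strip_accents`, `to_latin` and the tiny `GREEK_TO_LATIN` table are hypothetical and exist only for illustration.

```python
import unicodedata

# Hypothetical, deliberately tiny Greek-to-Latin table (illustration only).
GREEK_TO_LATIN = {
    "\u03b9": "i",  # GREEK SMALL LETTER IOTA
    "\u03b1": "a",  # GREEK SMALL LETTER ALPHA
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
}


def strip_accents(text: str) -> str:
    """Drop combining marks but keep each base letter in its own script."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def to_latin(text: str) -> str:
    """Strip accents, then map known base letters to Latin ones."""
    return "".join(GREEK_TO_LATIN.get(ch, ch) for ch in strip_accents(text))


accented_iota = "\u03af"  # GREEK SMALL LETTER IOTA WITH TONOS
print(strip_accents(accented_iota))  # still a Greek letter (U+03B9)
print(to_latin(accented_iota))       # a Latin letter (U+0069)
```

The first call stays inside the Greek script, while the second changes script entirely, which is the distinction the two examples above are drawing.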
Questions:
**Where would `remove_accents()` actually be used, and for what purposes?**
For example, if it's used to create "slugs" or titles and we only map the
accented characters, people will end up with a mixed slug or title
containing both Greek and Latin letters. So in this case a "full"
conversion from Greek to Latin would most likely be needed.
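
The mixed-slug concern can be sketched as follows, again in Python and purely as an illustration. The `ACCENTED_ONLY` table, the helper names and the sample word "σπίτι" are assumptions made for this example. They are not taken from core or from any proposed patch.

```python
import unicodedata

# Hypothetical partial map: ONLY accented Greek vowels get an ASCII replacement.
ACCENTED_ONLY = {
    "\u03ac": "a",  # alpha with tonos
    "\u03ad": "e",  # epsilon with tonos
    "\u03ae": "i",  # eta with tonos
    "\u03af": "i",  # iota with tonos
    "\u03cc": "o",  # omicron with tonos
    "\u03cd": "y",  # upsilon with tonos
    "\u03ce": "o",  # omega with tonos
}


def partial_convert(text: str) -> str:
    """Replace only the accented letters, leaving every other letter as-is."""
    return "".join(ACCENTED_ONLY.get(ch, ch) for ch in text)


def scripts_used(text: str) -> set:
    """Collect the script of each letter, taken from its Unicode name prefix."""
    return {unicodedata.name(ch).split(" ")[0] for ch in text if ch.isalpha()}


word = "\u03c3\u03c0\u03af\u03c4\u03b9"  # the Greek word "σπίτι"
slug = partial_convert(word)
print(slug)                # one Latin "i" in the middle of Greek letters
print(scripts_used(slug))  # contains both GREEK and LATIN
```

A slug like this mixes two scripts in a single word, which is why a complete Greek to Latin mapping (or no conversion at all) seems preferable to a partial one.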
**What is actually needed here?**
If it's simply changing an accented Greek letter to a non-accented one,
that would be OK (I believe). If accents are removed for another reason,
and the function is used to try to convert everything into Latin, maybe
we need to split things up into `remove_accents()` and
`convert_to_latin()`, for example, by introducing something new (if
something equivalent that I'm not aware of doesn't already exist).
For me this isn't as simple as it looks, because in various languages
removing accents (if used to alter actual readable text) ends up changing
the meaning of the word as well.
Thanks!
--
Ticket URL: <https://core.trac.wordpress.org/ticket/44793#comment:19>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform