[wp-trac] [WordPress Trac] #26094: sanitize_file_name() breaks some UTF-8 strings
WordPress Trac
noreply at wordpress.org
Sun Nov 17 21:25:14 UTC 2013
#26094: sanitize_file_name() breaks some UTF-8 strings
--------------------------+-----------------------------
Reporter: p_enrique | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Formatting | Version:
Severity: normal | Keywords:
--------------------------+-----------------------------
I've been testing sanitize_file_name( 'X.jpg' ) where X is a Unicode
character that is a number or a letter (matching regex `/[\p{L}\p{N}]/u`).
Alarmingly, there are many rather common characters that will result in a
malformed, broken string being returned:
{{{
(U+00E0) : à Latin small letter a with grave
(U+0160) : Š Latin capital letter s with caron
(U+03A0) : Π Greek capital letter pi
(U+0420) : Р Cyrillic capital letter er
}}}
The problem seems to be caused by a `preg_replace` call whose pattern
lacks the Unicode modifier (`/u`), so the string is processed byte by byte
rather than character by character.
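
One thing the four characters above have in common is that their UTF-8
encoding ends in the byte 0xA0. The following is a minimal sketch of how a
byte-oriented replacement could corrupt them. It is written in Python only
to illustrate the mechanism and is not the WordPress code. The helper names
(`sanitize_bytes`, `sanitize_text`, `is_valid_utf8`) are hypothetical, and
the idea that the byte-mode pattern treats 0xA0 as whitespace is an
assumption, not something confirmed in this ticket.

```python
import re

# Characters listed in the ticket: U+00E0, U+0160, U+03A0, U+0420.
SAMPLES = ["\u00e0", "\u0160", "\u03a0", "\u0420"]

# ASSUMPTION: emulates a byte-oriented whitespace class that also matches
# 0xA0 (the no-break space in Latin-1). This stands in for a pattern run
# without a Unicode modifier; it is not taken from WordPress.
BYTE_WHITESPACE = re.compile(rb"[\s\xa0]+")


def sanitize_bytes(name: str) -> bytes:
    """Replace 'whitespace' byte by byte, ignoring character boundaries."""
    return BYTE_WHITESPACE.sub(b"-", name.encode("utf-8"))


def sanitize_text(name: str) -> str:
    """Replace whitespace character by character (Unicode-aware)."""
    return re.sub(r"\s+", "-", name)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    for ch in SAMPLES:
        name = ch + ".jpg"
        broken = sanitize_bytes(name)
        print(
            f"U+{ord(ch):04X}",
            ch.encode("utf-8").hex(),      # trailing byte is a0 in each case
            broken,                        # lead byte left without its tail
            is_valid_utf8(broken),         # byte-wise result no longer decodes
            sanitize_text(name) == name,   # character-wise result is unchanged
        )
```

Under that assumption, the trailing 0xA0 byte is replaced on its own, which
leaves a dangling lead byte and would explain the malformed strings.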
--
Ticket URL: <http://core.trac.wordpress.org/ticket/26094>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software