[wp-trac] [WordPress Trac] #55807: sanitize_file_name not sanitizing decomposed unicode when file is uploaded using Chrome and Firefox
WordPress Trac
noreply at wordpress.org
Thu Jun 16 22:39:38 UTC 2022
----------------------------+------------------------------
 Reporter:  christerolsson  |       Owner:  (none)
     Type:  defect (bug)    |      Status:  closed
 Priority:  normal          |   Milestone:  Awaiting Review
Component:  Charset         |     Version:  5.9.3
 Severity:  normal          |  Resolution:  duplicate
 Keywords:                  |     Focuses:
----------------------------+------------------------------
Changes (by ironprogrammer):
* status: new => closed
* resolution: => duplicate
* component: Upload => Charset
Comment:
Thanks for your report, @christerolsson!
After some digging into the underlying `remove_accents()` function, this
appears to be a duplicate of #24661.
=== Rationale ===
In this report, the uploaded filename contained the following characters
in decomposed form. Each is a base letter followed by a combining mark (a
"combining character sequence"), which takes 3 bytes in UTF-8:
{{{
äöå
# or (hex)
\x61\xcc\x88\x6f\xcc\x88\x61\xcc\x8a
}}}
Safari (v15.5) normalizes the uploaded filename to precomposed 2-byte
sequences. Because `remove_accents()` contains an array of 2-byte
characters for translation, the substitutions then work as intended.
{{{
äöå
# or (hex)
\xc3\xa4\xc3\xb6\xc3\xa5
=> aoa
}}}
However, Chrome and Firefox do not normalize the filename, so the string
passed to the function retains the original 3-byte sequences, which the
function does not currently match.
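As a rough illustration of the mismatch (a minimal Python sketch, not
WordPress code), the snippet below decodes the bytes from this report,
normalizes them to NFC with the standard `unicodedata` module, and runs
both forms through a tiny lookup table. `PRECOMPOSED_MAP` and
`strip_accents()` are hypothetical stand-ins for the 2-byte table in
`remove_accents()`, not its actual contents.

```python
import unicodedata

# Decomposed form from the report: base letter + combining mark (3 bytes each).
decomposed = b"\x61\xcc\x88\x6f\xcc\x88\x61\xcc\x8a".decode("utf-8")

# NFC composes each base letter + combining mark into one precomposed code point.
composed = unicodedata.normalize("NFC", decomposed)

# Hypothetical stand-in for a translation table keyed on precomposed characters.
PRECOMPOSED_MAP = {
    "\u00e4": "a",  # a with diaeresis
    "\u00f6": "o",  # o with diaeresis
    "\u00e5": "a",  # a with ring above
}


def strip_accents(text):
    """Replace each character found in the table; leave everything else as-is."""
    return "".join(PRECOMPOSED_MAP.get(ch, ch) for ch in text)


print(composed.encode("utf-8"))              # precomposed bytes, 2 per character
print(strip_accents(composed))               # table matches the precomposed form
print(strip_accents(decomposed) == decomposed)  # True: decomposed input is untouched
```

Normalizing to NFC before the lookup is only one possible approach, shown
here to make the byte-level difference concrete. Whether and how core
should handle decomposed input is the subject of #24661.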
Please follow #24661 for continuation of this effort.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/55807#comment:4>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform