[wp-trac] [WordPress Trac] #55807: sanitize_file_name not sanitizing decomposed unicode when file is uploaded using Chrome and Firefox

WordPress Trac noreply at wordpress.org
Thu Jun 16 22:39:38 UTC 2022


#55807: sanitize_file_name not sanitizing decomposed unicode when file is uploaded
using Chrome and Firefox
----------------------------+------------------------------
 Reporter:  christerolsson  |       Owner:  (none)
     Type:  defect (bug)    |      Status:  closed
 Priority:  normal          |   Milestone:  Awaiting Review
Component:  Charset         |     Version:  5.9.3
 Severity:  normal          |  Resolution:  duplicate
 Keywords:                  |     Focuses:
----------------------------+------------------------------
Changes (by ironprogrammer):

 * status:  new => closed
 * resolution:   => duplicate
 * component:  Upload => Charset


Comment:

 Thanks for your report, @christerolsson!

 After some digging into the underlying `remove_accents()` function, this
 appears to be a duplicate of #24661.

 === Rationale ===
 In this report, the uploaded filename contained the following decomposed
 characters, each a base letter followed by a combining mark (a "combining
 character sequence", 3 bytes in UTF-8):
 {{{
 äöå
 # or (hex)
 \x61\xcc\x88\x6f\xcc\x88\x61\xcc\x8a
 }}}
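
 As a minimal sketch (Python is used here purely for illustration; the
 variable names are hypothetical and not part of WordPress), the decomposed
 and precomposed forms of the same visible string can be compared byte for
 byte:

```python
import unicodedata

# Decomposed form: base letter + combining mark, as in the reported filename.
decomposed = "a\u0308o\u0308a\u030a"

# Precomposed form: one code point per accented letter (Unicode NFC).
precomposed = unicodedata.normalize("NFC", decomposed)

# Both render as the same three letters, but the UTF-8 bytes differ.
print(decomposed.encode("utf-8").hex(" "))   # 61 cc 88 6f cc 88 61 cc 8a
print(precomposed.encode("utf-8").hex(" "))  # c3 a4 c3 b6 c3 a5
print(decomposed == precomposed)             # False
```

 The two strings look identical on screen, yet a plain string comparison
 treats them as different.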

 Safari (v15.5) normalizes the uploaded filename to precomposed 2-byte
 characters. Because `remove_accents()` translates from an array of 2-byte
 characters, the substitutions then work as intended.

 {{{
 äöå
 # or
 \xc3\xa4\xc3\xb6\xc3\xa5

 => aoa
 }}}

 However, Chrome and Firefox do not normalize the filename, so the string
 passed to the function retains the original 3-byte sequences, which the
 function does not currently match.
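
 The mismatch can be sketched as follows, again in Python for illustration
 only. `TABLE` and `strip_accents` are hypothetical stand-ins, not the actual
 `remove_accents()` implementation, and normalizing to NFC first is shown as
 one possible approach, not necessarily the one #24661 will adopt:

```python
import unicodedata

# Hypothetical three-entry table keyed on precomposed characters;
# the real translation array in WordPress is far larger.
TABLE = {"\u00e4": "a", "\u00f6": "o", "\u00e5": "a"}

def strip_accents(name: str, normalize: bool = False) -> str:
    """Replace characters found in TABLE, optionally normalizing to NFC first."""
    if normalize:
        name = unicodedata.normalize("NFC", name)
    return "".join(TABLE.get(ch, ch) for ch in name)

decomposed = "a\u0308o\u0308a\u030a"  # what Chrome and Firefox pass through

# Without normalization nothing matches, so the combining marks survive.
print(strip_accents(decomposed) == decomposed)    # True
# With normalization the precomposed keys match.
print(strip_accents(decomposed, normalize=True))  # aoa
```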

 Please follow #24661 for continuation of this effort.

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/55807#comment:4>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

