[wp-trac] [WordPress Trac] #38231: Allow download_url to respect content-disposition header

WordPress Trac noreply at wordpress.org
Tue Oct 5 21:52:23 UTC 2021


#38231: Allow download_url to respect content-disposition header
--------------------------------------+------------------------------
 Reporter:  cklosows                  |       Owner:  johnjamesjacoby
     Type:  enhancement               |      Status:  assigned
 Priority:  normal                    |   Milestone:  5.9
Component:  HTTP API                  |     Version:  4.7
 Severity:  normal                    |  Resolution:
 Keywords:  has-patch has-unit-tests  |     Focuses:
--------------------------------------+------------------------------

Comment (by johnjamesjacoby):

 After testing the patch and doing some research, I believe this will need
 a bit more work.

 (I've started that work, and will post an updated patch in the coming
 days.)

 Citations:
 * MDN: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition
 * RFC6266: https://datatracker.ietf.org/doc/html/rfc6266

 In short:

 * The `$header` parameter of `wp_remote_retrieve_header()` needs to be
 lowercase – I'm opening a new core ticket imminently to give this more
 thought
 * The regex needs tweaking to account for relative paths and the alternate
 `filename*` directive:
 > `filename`
 > Is followed by a string containing the original name of the file
 transmitted. The filename is always optional and must not be used blindly
 by the application: path information should be stripped, and conversion to
 the server file system rules should be done. This parameter provides
 mostly indicative information. When used in combination with
 `Content-Disposition: attachment`, it is used as the default filename for
 an eventual "Save As" dialog presented to the user.
 >
 > `filename*`
 > The parameters "filename" and "filename*" differ only in that
 "filename*" uses the encoding defined in RFC 5987. When both filename and
 filename* are present in a single header field value, filename* is
 preferred over filename when both are understood.
 >
 > **Warning: The string following filename should always be put into
 quotes; but, for compatibility reasons, many browsers try to parse
 unquoted names that contain spaces.**
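
 To make the two points above concrete, here is a rough sketch of the
 parsing rules, written in Python purely for illustration. It is not the
 patch and not WordPress code; the function name and the regex are
 assumptions of mine. It prefers `filename*` over `filename`, tolerates
 unquoted values that contain spaces, and keeps only the last path segment.

```python
import re
from urllib.parse import unquote

# Matches `; name=value` pairs, where value is a quoted-string or a bare run
# of characters up to the next semicolon (so unquoted names with spaces work).
_PARAM_RE = re.compile(r';\s*([A-Za-z0-9*_-]+)\s*=\s*("(?:[^"\\]|\\.)*"|[^;]*)')


def parse_disposition_filename(header_value):
    """Return the advisory filename from a Content-Disposition value, or None.

    Hypothetical helper for illustration only.
    """
    params = {}
    for name, raw in _PARAM_RE.findall(header_value):
        raw = raw.strip()
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            # Remove the surrounding quotes and undo backslash escapes.
            raw = re.sub(r'\\(.)', r'\1', raw[1:-1])
        # Parameter names are case-insensitive; the first occurrence wins.
        params.setdefault(name.lower(), raw)

    filename = None
    extended = params.get('filename*')
    if extended is not None:
        # RFC 5987 form: charset'language'percent-encoded-value
        parts = extended.split("'", 2)
        if len(parts) == 3 and parts[0].lower() in ('utf-8', 'iso-8859-1'):
            filename = unquote(parts[2], encoding=parts[0], errors='replace')
    if filename is None:
        # Fall back to the plain parameter when filename* is absent or unusable.
        filename = params.get('filename')
    if not filename:
        return None

    # Strip path information: both "/" and "\" count as separators.
    return re.split(r'[\\/]', filename)[-1] or None
```

 Any real implementation would still need to pass the result through the
 usual filename sanitization before writing to disk.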

 Possible implications from the RFC, to confirm with additional unit tests:

 {{{
    It is essential that recipients treat the specified filename as
    advisory only, and thus be very careful in extracting the desired
    information.  In particular:

    o  Recipients MUST NOT be able to write into any location other than
       one to which they are specifically entitled.  To illustrate the
       problem, consider the consequences of being able to overwrite
       well-known system locations (such as "/etc/passwd").  One strategy
       to achieve this is to never trust folder name information in the
       filename parameter, for instance by stripping all but the last
       path segment and only considering the actual filename (where 'path
       segments' are the components of the field value delimited by the
       path separator characters "\" and "/").

    o  Many platforms do not use Internet Media Types ([RFC2046]) to hold
       type information in the file system, but rely on filename
       extensions instead.  Trusting the server-provided file extension
       could introduce a privilege escalation when the saved file is
       later opened (consider ".exe").  Thus, recipients that make use of
       file extensions to determine the media type MUST ensure that a
       file extension is used that is safe, optimally matching the media
       type of the received payload.

    o  Recipients SHOULD strip or replace character sequences that are
       known to cause confusion both in user interfaces and in filenames,
       such as control characters and leading and trailing whitespace.

    o  Other aspects recipients need to be aware of are names that have a
       special meaning in the file system or in shell commands, such as
       "." and "..", "~", "|", and also device names.  Recipients SHOULD
       ignore or substitute names like these.

       Note: Many user agents do not properly handle the escape character
       "\" when using the quoted-string form.  Furthermore, some user
       agents erroneously try to perform unescaping of "percent" escapes
       (see Appendix C.2), and thus might misinterpret filenames
       containing the percent character followed by two hex digits.
 }}}
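
 As a starting point for those unit tests, the RFC guidance might be
 sketched roughly as below. Again this is illustrative Python, not the
 patch: the function name, the fallback value, and the (non-exhaustive)
 device-name list are all assumptions. The file-extension check from the
 second bullet is deliberately left out, since it depends on the media type
 of the payload.

```python
import re

# Reserved device names on Windows; an assumed, non-exhaustive list.
_DEVICE_NAMES = (
    {'con', 'prn', 'aux', 'nul'}
    | {f'com{i}' for i in range(1, 10)}
    | {f'lpt{i}' for i in range(1, 10)}
)
# Names the RFC calls out as having special meaning, plus the empty string.
_SPECIAL_NAMES = {'', '.', '..', '~', '|'}


def sanitize_advisory_filename(name, fallback='download'):
    """Apply the RFC 6266 precautions to a server-provided filename.

    Hypothetical helper for illustration only.
    """
    # Never trust folder information: keep only the last path segment.
    name = re.split(r'[\\/]', name)[-1]
    # Drop control characters (C0 range and DEL), then surrounding whitespace.
    name = re.sub(r'[\x00-\x1f\x7f]', '', name).strip()
    # Substitute names with special meaning in file systems or shells.
    stem = name.split('.', 1)[0].lower()
    if name in _SPECIAL_NAMES or stem in _DEVICE_NAMES:
        return fallback
    return name
```

 Each branch maps onto one of the bullets quoted above, so the test cases
 could be organized the same way.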

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/38231#comment:10>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

