[wp-trac] [WordPress Trac] #38231: Allow download_url to respect content-disposition header
WordPress Trac
noreply at wordpress.org
Tue Oct 5 21:52:23 UTC 2021
#38231: Allow download_url to respect content-disposition header
--------------------------------------+------------------------------
Reporter: cklosows | Owner: johnjamesjacoby
Type: enhancement | Status: assigned
Priority: normal | Milestone: 5.9
Component: HTTP API | Version: 4.7
Severity: normal | Resolution:
Keywords: has-patch has-unit-tests | Focuses:
--------------------------------------+------------------------------
Comment (by johnjamesjacoby):
After testing the patch and doing some research, I believe this will need
a bit more work.
(I've started that work, and will post an updated patch in the coming
days.)
Citations:
* MDN: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition
* RFC6266: https://datatracker.ietf.org/doc/html/rfc6266
In short:
* The `$header` parameter of `wp_remote_retrieve_header()` needs to be
lowercase. I'm opening a new core ticket shortly to give this more
thought.
* The regex needs tweaking to account for relative paths and the alternate
`filename*` directive:
> `filename`
> Is followed by a string containing the original name of the file
> transmitted. The filename is always optional and must not be used blindly
> by the application: path information should be stripped, and conversion to
> the server file system rules should be done. This parameter provides
> mostly indicative information. When used in combination with
> Content-Disposition: attachment, it is used as the default filename for an
> eventual "Save As" dialog presented to the user.
>
> `filename*`
> The parameters "filename" and "filename*" differ only in that
> "filename*" uses the encoding defined in RFC 5987. When both filename and
> filename* are present in a single header field value, filename* is
> preferred over filename when both are understood.
>
> **Warning: The string following filename should always be put into
> quotes; but, for compatibility reasons, many browsers try to parse
> unquoted names that contain spaces.**
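To make the two points above concrete, here is a minimal, language-neutral sketch in Python. It is not the patch on this ticket and not WordPress code. The helper names `get_header()` and `filename_from_content_disposition()` are hypothetical, the regexes are an assumption about what a tolerant parser could look like, and only the UTF-8 and ISO-8859-1 charsets are handled. It looks the header up case-insensitively, prefers `filename*` over `filename`, and accepts both quoted and unquoted plain values.

```python
import re
from urllib.parse import unquote

# filename*=charset'language'percent-encoded-value (RFC 5987 extended form)
_EXT_RE = re.compile(
    r"(?:^|;)\s*filename\*\s*=\s*([\w.\-]+)'[^']*'([^;\s]+)", re.IGNORECASE
)
# filename="quoted value" or filename=unquoted-token
_PLAIN_RE = re.compile(
    r'(?:^|;)\s*filename\s*=\s*(?:"((?:\\.|[^"\\])*)"|([^;\s]+))', re.IGNORECASE
)


def get_header(headers, name):
    """Hypothetical helper: look up a header regardless of key casing."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


def filename_from_content_disposition(value):
    """Return the advisory filename from a header value, or None."""
    ext = _EXT_RE.search(value)
    if ext:
        charset, encoded = ext.group(1), ext.group(2)
        # Assumption: only these two charsets are decoded in this sketch.
        if charset.lower() in ("utf-8", "iso-8859-1"):
            try:
                return unquote(encoded, encoding=charset, errors="strict")
            except UnicodeDecodeError:
                pass  # malformed extended value: fall back to plain filename
    plain = _PLAIN_RE.search(value)
    if plain:
        if plain.group(1) is not None:
            # Undo backslash escapes inside the quoted-string form.
            return re.sub(r"\\(.)", r"\1", plain.group(1))
        return plain.group(2)
    return None
```

The returned name is still untrusted input: path segments and special names have to be dealt with separately before anything is written to disk.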
Possible implications from the RFC, to confirm with additional unit tests:
{{{
It is essential that recipients treat the specified filename as
advisory only, and thus be very careful in extracting the desired
information. In particular:
o Recipients MUST NOT be able to write into any location other than
one to which they are specifically entitled. To illustrate the
problem, consider the consequences of being able to overwrite
well-known system locations (such as "/etc/passwd"). One strategy
to achieve this is to never trust folder name information in the
filename parameter, for instance by stripping all but the last
path segment and only considering the actual filename (where 'path
segments' are the components of the field value delimited by the
path separator characters "\" and "/").
o Many platforms do not use Internet Media Types ([RFC2046]) to hold
type information in the file system, but rely on filename
extensions instead. Trusting the server-provided file extension
could introduce a privilege escalation when the saved file is
later opened (consider ".exe"). Thus, recipients that make use of
file extensions to determine the media type MUST ensure that a
file extension is used that is safe, optimally matching the media
type of the received payload.
o Recipients SHOULD strip or replace character sequences that are
known to cause confusion both in user interfaces and in filenames,
such as control characters and leading and trailing whitespace.
o Other aspects recipients need to be aware of are names that have a
special meaning in the file system or in shell commands, such as
"." and "..", "~", "|", and also device names. Recipients SHOULD
ignore or substitute names like these.
Note: Many user agents do not properly handle the escape character
"\" when using the quoted-string form. Furthermore, some user
agents erroneously try to perform unescaping of "percent" escapes
(see Appendix C.2), and thus might misinterpret filenames
containing the percent character followed by two hex digits.
}}}
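As a rough illustration of what the additional unit tests might exercise, the sketch below applies three of the RFC's recommendations in Python: keep only the last path segment, strip control characters and surrounding whitespace, and reject special names. The function name `sanitize_advisory_filename()`, the `fallback` default, and the list of Windows device names are all assumptions made for this example. It deliberately does not cover the file-extension point, which requires knowing the media type of the payload.

```python
import re

# Names with special meaning, taken from the RFC text quoted above.
_RESERVED = {".", "..", "~", "|"}
# Assumption: "device names" is read here as the classic Windows set.
_DEVICE_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def sanitize_advisory_filename(name, fallback="download"):
    """Reduce an untrusted filename to a bare, safer name (sketch only)."""
    # Never trust folder information: keep only the last path segment,
    # treating both "\" and "/" as separators.
    name = re.split(r"[\\/]", name)[-1]
    # Remove control characters, then leading and trailing whitespace.
    name = re.sub(r"[\x00-\x1f\x7f]", "", name).strip()
    # Substitute names that are empty or special to the file system or shell.
    stem = name.split(".", 1)[0].upper()
    if not name or name in _RESERVED or stem in _DEVICE_NAMES:
        return fallback
    return name
```

For example, `sanitize_advisory_filename("../../etc/passwd")` yields `passwd`, and a value of `..` on its own is replaced by the fallback name.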
--
Ticket URL: <https://core.trac.wordpress.org/ticket/38231#comment:10>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform