[wp-trac] [WordPress Trac] #62995: Uploading Mac screenshots results in broken images, due to question marks inserted in filenames
WordPress Trac
noreply at wordpress.org
Thu Jun 26 16:14:33 UTC 2025
#62995: Uploading Mac screenshots results in broken images, due to question marks
inserted in filenames
-------------------------------+------------------------------
Reporter: room34 | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Media | Version:
Severity: normal | Resolution:
Keywords: reporter-feedback | Focuses: administration
-------------------------------+------------------------------
Comment (by dmsnell):
> For obvious reasons, there should never be a question mark in filenames
in the Media Library.
As an aside, I would argue that it’s not obvious that question marks
should be disallowed, or any character other than the ASCII “/”, which
is used across various filesystems as the path separator.
HTML has always supported URLs with arbitrary characters in them, so I
think there are two issues at play.
**The media library is not properly forming encoded URLs.**
Even if we resolve the name transformation in `sanitize_file_name()` we
would want to ensure that we build the proper URLs when rendering HTML to
the browser. The proper way to create a URL for a filename with a question
mark in it is to percent-escape it.
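{{{#!javascript
// The space before "PM" is U+202F NARROW NO-BREAK SPACE, written here as an escape.
const u = new URL( 'https://wordpress.org' );
u.pathname = '/wp-content/uploads/Screenshot 2025-02-19 at 2.17.33\u202FPM.png';
// The pathname setter percent-encodes the spaces and the non-ASCII code point.
u.href === 'https://wordpress.org/wp-content/uploads/Screenshot%202025-02-19%20at%202.17.33%E2%80%AFPM.png'; // true
}}}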
Here the browser follows the WHATWG URL spec for a
[https://url.spec.whatwg.org/#url-path-segment-string URL-path-segment
string] by "[https://url.spec.whatwg.org/#percent-encoded-byte
convert(ing) to percent-encoded bytes]" the "Code points greater than
U+007F."
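To make the question-mark case concrete, here is a minimal sketch of building an upload URL from a raw filename. `buildUploadUrl()` is a hypothetical helper for illustration, not a WordPress function, and the `example.com` base is a placeholder. It assumes the filename should always remain a single path segment, so it escapes the name with `encodeURIComponent()` before assigning the path.

```javascript
// Hypothetical helper, not a WordPress API: join a base URL and a raw filename.
function buildUploadUrl( base, filename ) {
	const url = new URL( base );
	// Make sure the directory part ends with a slash before appending the name.
	const dir = url.pathname.endsWith( '/' ) ? url.pathname : url.pathname + '/';
	// encodeURIComponent() escapes "?", "#", "/", spaces, and non-ASCII code
	// points, so the filename cannot leak into the query or fragment.
	url.pathname = dir + encodeURIComponent( filename );
	return url.href;
}

const href = buildUploadUrl(
	'https://example.com/wp-content/uploads/',
	'Is this broken?.png'
);
console.log( href );
// https://example.com/wp-content/uploads/Is%20this%20broken%3F.png
```

Without the escaping step, everything after the question mark would be read as a query string and the browser would request a different file.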
There is some good news here. There have been recent developments to add
spec-compliant URL handling to PHP itself, and many of WordPress’ needs
were incorporated into that work. For now we will start with a URL
parser, but hopefully coming releases will add functionality for
building URLs (until this new work, PHP has never had tools in its
standard library to properly handle URLs).
This is its own bug and we could fix it today.
**Filenames could be more conservatively renamed to avoid problems in
buggy code elsewhere.**
That being said, like most things I’ve encountered in WordPress where text
or HTML encoding comes into play, there are likely a number of places that
do not properly handle these valid filenames or URLs.
One thing I like about the wording in some of the linked tickets is the
intentionality of the transformations. Something like
`shareable_filename()` could communicate the intent to produce a filename
which is more resilient to systems which apply their own non-standard
rules. Each transformation inside that function can be linked to a
particular known system or rule of thumb…
- Facebook doesn’t allow…
- AppEngine rejects control points…
- Many HTML systems will trip up on a question mark or hash sign
(pound/octothorpe)
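As a rough sketch of that idea, each transformation can carry its reason alongside it. Everything here is an assumption for illustration: the `SHAREABLE_RULES` table, the `shareableFilename()` name, the choice of `-` as the replacement, and the reading of "control points" as control characters. The elided rules in the list above are deliberately left out.

```javascript
// Hypothetical rule table: each entry pairs a transformation with its reason.
const SHAREABLE_RULES = [
	{
		reason: 'Many HTML systems trip up on "?" and "#".',
		apply: ( name ) => name.replace( /[?#]/g, '-' ),
	},
	{
		reason: 'Some hosts reject control characters (assumed reading).',
		apply: ( name ) => name.replace( /[\u0000-\u001F\u007F]/g, '' ),
	},
	{
		reason: '"/" is the path separator on many filesystems.',
		apply: ( name ) => name.replace( /\//g, '-' ),
	},
];

// Apply every rule in order; anything no rule mentions is left untouched.
function shareableFilename( name ) {
	return SHAREABLE_RULES.reduce(
		( current, rule ) => rule.apply( current ),
		name
	);
}

console.log( shareableFilename( 'what?#.png' ) ); // what--.png
```

Keeping the reason next to each rule means a rule can be dropped when the system it works around is fixed, and valid characters such as U+202F pass through unless a rule names them.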
Maybe I’ll doodle on some of this, but I think it would be helpful for us
to consider both parts of why this is broken, as one is uncontroversially
broken and the fix is clear (produce spec-compliant URLs to avoid sending
browsers the wrong URL) while the other is more subjective and includes
WordPress extending known specifications with its own rules.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/62995#comment:10>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform