[wp-trac] [WordPress Trac] #31645: Press This: Reject relative URLs when scraping source html
WordPress Trac
noreply at wordpress.org
Tue Mar 17 19:35:13 UTC 2015
#31645: Press This: Reject relative URLs when scraping source html
--------------------------+--------------------
Reporter: kraftbj | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: 4.2
Component: Press This | Version: trunk
Severity: normal | Resolution:
Keywords: has-patch | Focuses:
--------------------------+--------------------
Comment (by azaozz):
Replying to [comment:10 stephdau]:
> [attachment:31645.6.patch] will not work as is because `$url` goes
> through `esc_url_raw()` before being tested, which prepends `http://` to
> whatever value is passed to it... So `123.html` becomes
> `http://123.html/`.
Right, we have to run that test before `esc_url_raw()` as it will prepend
`http://` to some relative URLs.
Looking at 31645.7.patch:
- `'/^[\/]{1}[^\/]+/'` is exactly the same as `'%^/[^/]+%'` and `$url{0}
=== '/' && $url{1} !== '/'` with the last one being much faster than any
PCRE function.
- This `'/^[\/]{2}[^\/]+/'` matches protocol-relative URLs, then we
prepend the current protocol to them. Not sure this is desirable. URLs
starting with `//` are the best choice for links, embeds, images, etc. (as
long as the server supports both http and https). They will never trigger
"Mixed/Insecure content" warnings. If the page is telling us to use these,
we should :)
- At the end we just return the whole URL passed by the user? So if there
is an image src `../../assets/images/test.gif` we will replace the src
with the page's URL. We should be rejecting non-root relative URLs.
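
To make the three cases above concrete, here is a minimal sketch of how the raw value could be classified before it ever reaches `esc_url_raw()`. The function name `pt_classify_url()` and its return strings are hypothetical and do not come from any patch on this ticket. It uses plain character checks for the slash cases, with square-bracket offsets because the `$url{0}` syntax was removed in PHP 8.0.

```php
<?php
/**
 * Hypothetical helper (assumption, not from any patch on this ticket).
 * Classifies a raw, not yet escaped URL by its leading characters.
 */
function pt_classify_url( $url ) {
	$url = trim( (string) $url );

	if ( '' === $url ) {
		return 'empty';
	}

	// Anything starting with a scheme, e.g. http://example.com/
	if ( preg_match( '%^[a-z][a-z0-9+.-]*:%i', $url ) ) {
		return 'absolute';
	}

	// Two leading slashes: protocol-relative, can be kept as is.
	if ( 0 === strpos( $url, '//' ) ) {
		return 'protocol-relative';
	}

	// A single leading slash: root-relative.
	if ( '/' === $url[0] ) {
		return 'root-relative';
	}

	// Everything else (123.html, ../../assets/images/test.gif, ...).
	return 'relative';
}
```

With this, `123.html` and `../../assets/images/test.gif` both come back as `relative`, so the caller can reject them or try to resolve them instead of returning the page's URL.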
We can try to extrapolate the absolute URL from the page's URL and a
relative image src, or we can discard relative image sources. Either way,
we shouldn't return a wrong src.
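
A rough sketch of what that extrapolation could look like, assuming we only care about image-like paths. `pt_resolve_src()` is a hypothetical name, and this is a simplified take on relative reference resolution rather than a full RFC 3986 implementation: it returns an empty string (discard) whenever it cannot build a sensible absolute URL.

```php
<?php
/**
 * Hypothetical helper (assumption, not from any patch on this ticket).
 * Resolves an image src against the URL of the page it was found on.
 * Returns '' when the src should be discarded.
 */
function pt_resolve_src( $src, $page_url ) {
	$src  = trim( (string) $src );
	$base = parse_url( $page_url );

	// Without a usable page URL there is nothing to resolve against.
	if ( '' === $src || empty( $base['scheme'] ) || empty( $base['host'] ) ) {
		return '';
	}

	// Absolute and protocol-relative URLs are returned unchanged.
	if ( preg_match( '%^[a-z][a-z0-9+.-]*:%i', $src ) || 0 === strpos( $src, '//' ) ) {
		return $src;
	}

	// Simplification: query-only or fragment-only references are discarded.
	if ( '?' === $src[0] || '#' === $src[0] ) {
		return '';
	}

	$origin = $base['scheme'] . '://' . $base['host'];
	if ( isset( $base['port'] ) ) {
		$origin .= ':' . $base['port'];
	}

	if ( '/' === $src[0] ) {
		// Root-relative: only the origin comes from the page URL.
		$path = $src;
	} else {
		// Relative: start from the directory part of the page's path.
		$base_path = isset( $base['path'] ) ? $base['path'] : '/';
		$path      = substr( $base_path, 0, strrpos( $base_path, '/' ) + 1 ) . $src;
	}

	// Keep any query string or fragment out of the dot-segment handling.
	$cut    = strcspn( $path, '?#' );
	$suffix = (string) substr( $path, $cut );
	$path   = substr( $path, 0, $cut );

	// Resolve "." and ".." segments. Empty segments are collapsed and a
	// trailing slash is dropped, which is acceptable for image file paths.
	$segments = array();
	foreach ( explode( '/', $path ) as $segment ) {
		if ( '..' === $segment ) {
			array_pop( $segments );
		} elseif ( '' !== $segment && '.' !== $segment ) {
			$segments[] = $segment;
		}
	}

	return $origin . '/' . implode( '/', $segments ) . $suffix;
}
```

For example, `../../assets/images/test.gif` found on `http://example.com/a/b/page.html` would resolve to `http://example.com/assets/images/test.gif`. Whether this is worth the extra code compared to simply discarding relative sources is still open.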
--
Ticket URL: <https://core.trac.wordpress.org/ticket/31645#comment:13>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform