[wp-trac] [WordPress Trac] #14347: URLs are not handled properly
WordPress Trac
wp-trac at lists.automattic.com
Sun Jul 18 19:00:45 UTC 2010
#14347: URLs are not handled properly
--------------------------+-------------------------------------------------
Reporter: hakre | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: General | Version:
Severity: normal | Keywords:
--------------------------+-------------------------------------------------
While digging into #14201, #14292 and similar tickets, I noticed that
WordPress does not filter the URL input properly. This can lead to 404
responses for content that is actually available under an equivalent
URL, as specified by HTTP (RFC 2616).
Example run against current trunk to illustrate the issue:
{{{
# curl -I http://webroot.loc/wordpress/tag/%e4%b8%80%e6%a0%b7
HTTP/1.1 200 OK
Date: Sun, 18 Jul 2010 18:53:02 GMT
Server: Apache
X-Pingback: http://webroot.loc/wordpress/xmlrpc.php
Content-Type: text/html; charset=UTF-8
}}}
Doing the ''same'' request with an alternative spelling of the URL leads
to a 404. Note that the "a" of tag has been written as the triplet %41
(strictly, %41 decodes to an uppercase "A"; a lowercase "a" would be %61):
{{{
# curl -I http://webroot.loc/wordpress/t%41g/%e4%b8%80%e6%a0%b7
HTTP/1.1 404 Not Found
Date: Sun, 18 Jul 2010 18:54:32 GMT
Server: Apache
Cache-Control: no-cache, must-revalidate, max-age=0
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Pragma: no-cache
X-Pingback: http://webroot.loc/wordpress/xmlrpc.php
Last-Modified: Sun, 18 Jul 2010 18:54:33 GMT
Content-Type: text/html; charset=UTF-8
}}}
RFC 2616 clearly addresses this in its section on URI comparison (3.2.3):
> Characters other than those in the "reserved" and "unsafe" sets (see
> RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.
These so-called character triplets are written in uppercase by the PHP
urlencode() and rawurlencode() functions, but mostly in lowercase inside
WordPress (e.g. in slug generation). The hex digits can be written in
either case, or even mixed case. The RFCs introduce them in uppercase
first, but both variants are valid, even {{{%dD}}} is.
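To illustrate the case-insensitivity of the hex digits, here is a small
sketch. It uses Python's urllib.parse rather than PHP, purely as an
assumption for illustration; the variable names are made up for this
example and nothing here is WordPress code.

```python
from urllib.parse import quote, unquote_to_bytes

# The tag triplets from the curl example, in lowercase and in uppercase.
lower = unquote_to_bytes("%e4%b8%80%e6%a0%b7")
upper = unquote_to_bytes("%E4%B8%80%E6%A0%B7")

# A mixed-case triplet decodes to a single octet just as well.
mixed = unquote_to_bytes("%dD")

# Both spellings decode to identical octets.
same_bytes = lower == upper

# Decode the octets as UTF-8, then encode again: like PHP's urlencode(),
# Python's quote() emits uppercase hex digits.
tag = lower.decode("utf-8")
reencoded = quote(tag)
```

A comparison done on the decoded octets is therefore independent of how
the sender happened to write the hex digits.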
The web application should handle both URLs the same way.
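One possible way to get there is to normalize the request path before
matching it. The following is a minimal sketch in Python, not a patch:
normalize_path is a hypothetical helper, and the set of characters that
may safely be decoded is assumed to be the "unreserved" set of RFC 3986
(letters, digits, "-", ".", "_", "~"). The example input uses %61, the
triplet for a lowercase "a".

```python
import re

# Assumption: characters that are safe to decode, taken from the
# "unreserved" set of RFC 3986.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

# A percent sign followed by exactly two hex digits, in any case.
TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_path(path):
    """Hypothetical helper: bring all triplets into one canonical form."""

    def fix(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            # Needlessly encoded character, e.g. "%61" -> "a".
            return char
        # Keep every other triplet encoded, but with uppercase hex digits.
        return "%" + match.group(1).upper()

    return TRIPLET.sub(fix, path)


plain = normalize_path("/wordpress/tag/%e4%b8%80%e6%a0%b7")
encoded = normalize_path("/wordpress/t%61g/%e4%b8%80%e6%a0%b7")
```

With this, both spellings of the path end up as the same string, so a
single lookup can serve both requests. Reserved characters such as %2F
stay encoded, because decoding them would change the meaning of the URL.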
--
Ticket URL: <http://core.trac.wordpress.org/ticket/14347>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software