[wp-trac] [WordPress Trac] #52865: Strip 'enclosed' trailing spaces in URLs

WordPress Trac noreply at wordpress.org
Fri Mar 19 10:42:06 UTC 2021


#52865: Strip 'enclosed' trailing spaces in URLs
----------------------------+------------------------------
 Reporter:  jonoaldersonwp  |       Owner:  (none)
     Type:  enhancement     |      Status:  new
 Priority:  low             |   Milestone:  Awaiting Review
Component:  Canonical       |     Version:
 Severity:  normal          |  Resolution:
 Keywords:  seo             |     Focuses:  performance
----------------------------+------------------------------
Changes (by SergeyBiryukov):

 * keywords:  seo performance => seo
 * focuses:   => performance


Old description:

> https://core.trac.wordpress.org/ticket/20383 made improvements that strip
> trailing punctuation from URLs. E.g., https://ma.tt/2012/03/productivity-
> per-square-inch%20 redirects correctly to the canonical URL.
>
> However, URLs like https://ma.tt/2012/03/productivity-per-square-inch%20/
> (which 'enclose' the trailing space with a trailing slash) are ''not''
> redirected. Such URLs typically return a 404 error.
>
> This kind of 'broken link' pattern is ''extremely'' common on the web;
> particularly as a trailing slash is often appended to a malformed URL
> ''before'' WP runs (e.g., via a server/htaccess/nginx configuration).
>
> We should refine the canonical redirect logic (in `redirect_canonical`)
> to also consider and redirect these types of requests.
>
> **Considerations**
> - The "''Remove trailing spaces and end punctuation from the path''"
> section of `redirect_canonical` doesn't consider the presence of trailing
> slashes in the URL. This could/should be adapted to catch those.
>
> - There might be cases where a user 'legitimately' has a permalink
> structure (or slug) that ends in `%20` or `%20/`. That might(?) make a
> fix more complicated than just sniffing for whether the permalink
> structure ends with `/`.
>
> - WP appears to be inconsistent about where `%20` (and/or `%20/`) can be
> added to slugs or structures. It's stripped in some places, but not in
> others.
>
> - ''Should'' a permalink or slug be 'allowed' to contain, or end in, a
> space character? If this is being stripped in some parts of WP, maybe
> that's a good argument to prevent it elsewhere/everywhere. In which case,
> fixing this becomes a lot simpler.

New description:

 #20383 made improvements that strip trailing punctuation from URLs. E.g.,
 https://ma.tt/2012/03/productivity-per-square-inch%20 redirects correctly
 to the canonical URL.

 However, URLs like https://ma.tt/2012/03/productivity-per-square-inch%20/
 (which 'enclose' the trailing space with a trailing slash) are ''not''
 redirected. Such URLs typically return a 404 error.
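
 To illustrate the gap, here is a minimal Python sketch (not the actual PHP
 in `redirect_canonical`). It assumes the existing rule behaves roughly like
 a regex anchored to the very end of the path; the pattern and the helper
 name `strip_trailing_space` are illustrative assumptions.

```python
import re

# Assumed approximation of the current behaviour: spaces (literal or
# percent-encoded) are stripped only when they are the last thing in the path.
TRAILING_SPACE = re.compile(r"(%20| )+$")


def strip_trailing_space(path: str) -> str:
    """Remove trailing spaces from a URL path (hypothetical helper)."""
    return TRAILING_SPACE.sub("", path)


# The bare trailing space is removed.
print(strip_trailing_space("/2012/03/productivity-per-square-inch%20"))
# -> /2012/03/productivity-per-square-inch

# The 'enclosed' space is untouched, because the path ends in "/", not "%20".
print(strip_trailing_space("/2012/03/productivity-per-square-inch%20/"))
# -> /2012/03/productivity-per-square-inch%20/
```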

 This kind of 'broken link' pattern is ''extremely'' common on the web;
 particularly as a trailing slash is often appended to a malformed URL
 ''before'' WP runs (e.g., via a server/htaccess/nginx configuration).

 We should refine the canonical redirect logic (in `redirect_canonical`) to
 also consider and redirect these types of requests.

 **Considerations**
 - The "''Remove trailing spaces and end punctuation from the path''"
 section of `redirect_canonical` doesn't consider the presence of trailing
 slashes in the URL. This could/should be adapted to catch those.

 - There might be cases where a user 'legitimately' has a permalink
 structure (or slug) that ends in `%20` or `%20/`. That might(?) make a fix
 more complicated than just sniffing for whether the permalink structure
 ends with `/`.

 - WP appears to be inconsistent about where `%20` (and/or `%20/`) can be
 added to slugs or structures. It's stripped in some places, but not in
 others.

 - ''Should'' a permalink or slug be 'allowed' to contain, or end in, a
 space character? If this is being stripped in some parts of WP, maybe
 that's a good argument to prevent it elsewhere/everywhere. In which case,
 fixing this becomes a lot simpler.
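
 One possible shape for the adapted check, sketched in Python rather than as
 a patch to `redirect_canonical`. The pattern, the function name
 `canonical_path`, and the `known_paths` guard are all hypothetical; the
 guard stands in for whatever lookup would protect a permalink that
 'legitimately' ends in `%20` or `%20/`.

```python
import re

# Trailing spaces (literal or percent-encoded), optionally 'enclosed' by a
# single trailing slash. The slash is captured so it can be preserved.
ENCLOSED_SPACE = re.compile(r"(?:%20| )+(/?)$")


def canonical_path(path: str, known_paths: frozenset = frozenset()) -> str:
    """Strip trailing spaces from a path, keeping any trailing slash.

    `known_paths` is a hypothetical set of paths that really exist and must
    not be rewritten, even if they end in a space.
    """
    if path in known_paths:
        return path
    return ENCLOSED_SPACE.sub(r"\1", path)


print(canonical_path("/2012/03/productivity-per-square-inch%20/"))
# -> /2012/03/productivity-per-square-inch/
print(canonical_path("/2012/03/productivity-per-square-inch%20"))
# -> /2012/03/productivity-per-square-inch
```

 If slugs were never allowed to end in a space, the `known_paths` guard could
 be dropped and the rule would reduce to the single substitution.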

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/52865#comment:1>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

