[wp-trac] [WordPress Trac] #52865: Strip 'enclosed' trailing spaces in URLs

WordPress Trac noreply at wordpress.org
Fri Mar 19 10:32:40 UTC 2021


#52865: Strip 'enclosed' trailing spaces in URLs
----------------------------+-----------------------------
 Reporter:  jonoaldersonwp  |      Owner:  (none)
     Type:  enhancement     |     Status:  new
 Priority:  low             |  Milestone:  Awaiting Review
Component:  Canonical       |    Version:
 Severity:  normal          |   Keywords:  seo performance
  Focuses:                  |
----------------------------+-----------------------------
 https://core.trac.wordpress.org/ticket/20383 made improvements that strip
 trailing punctuation from URLs. For example,
 https://ma.tt/2012/03/productivity-per-square-inch%20 redirects correctly
 to the canonical URL.

 However, URLs like https://ma.tt/2012/03/productivity-per-square-inch%20/
 (which 'enclose' the trailing space with a trailing slash) are ''not''
 redirected. Such URLs typically return a 404 error.

 This kind of 'broken link' pattern is ''extremely'' common on the web,
 particularly because a trailing slash is often appended to a malformed URL
 ''before'' WP runs (e.g., via a server/htaccess/nginx configuration).

 We should refine the canonical redirect logic (in `redirect_canonical`) to
 also consider and redirect these types of requests.
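
 To illustrate the gap, here is a minimal sketch in Python rather than
 core's PHP. Both patterns and both function names are assumptions made up
 for this example, not the actual code in `redirect_canonical`. The first
 pattern only matches spaces at the very end of the path; the second also
 tolerates a single trailing slash and keeps it.

```python
import re

# Hypothetical pattern: a run of encoded or literal spaces at the very end of the path.
TRAILING_SPACES = re.compile(r"(?:%20| )+$")

# Hypothetical extension: the same run, optionally followed by one trailing slash.
ENCLOSED_TRAILING_SPACES = re.compile(r"(?:%20| )+(/?)$")


def strip_trailing_spaces(path):
    """Strip trailing spaces only when nothing follows them."""
    return TRAILING_SPACES.sub("", path)


def strip_enclosed_trailing_spaces(path):
    """Strip trailing spaces, preserving a trailing slash if one is present."""
    return ENCLOSED_TRAILING_SPACES.sub(r"\1", path)
```

 With these, `strip_trailing_spaces` leaves
 `/2012/03/productivity-per-square-inch%20/` untouched, while
 `strip_enclosed_trailing_spaces` reduces it to
 `/2012/03/productivity-per-square-inch/`. Spaces in the middle of a path
 are not affected by either pattern.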

 **Considerations**
 - The "''Remove trailing spaces and end punctuation from the path''"
 section of `redirect_canonical` doesn't consider the presence of trailing
 slashes in the URL. This could/should be adapted to catch those.

 - There might be cases where a user 'legitimately' has a permalink
 structure (or slug) that ends in `%20` or `%20/`. That might(?) make a fix
 more complicated than just sniffing for whether the permalink structure
 ends with `/`.

 - WP appears to be inconsistent about where `%20` (and/or `%20/`) can be
 added to slugs or structures. It's stripped in some places, but not in
 others.

 - ''Should'' a permalink or slug be 'allowed' to contain, or end in, a
 space character? If this is being stripped in some parts of WP, maybe
 that's a good argument to prevent it elsewhere/everywhere, in which case
 fixing this becomes a lot simpler.
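
 One possible way to respect 'legitimate' slugs is to redirect only when
 the requested path does not already resolve to content. The sketch below
 (Python, not core code) assumes a hypothetical `known_paths` set standing
 in for whatever lookup WP would really perform; the pattern and function
 name are likewise made up for illustration.

```python
import re

# Hypothetical pattern: trailing encoded/literal spaces, optionally followed by one slash.
ENCLOSED_TRAILING_SPACES = re.compile(r"(?:%20| )+(/?)$")


def canonical_target(path, known_paths):
    """Return the path to redirect to, or None if no redirect should happen.

    known_paths is a hypothetical stand-in for a lookup of existing content.
    """
    if path in known_paths:
        # The request already resolves to real content, so leave it alone.
        return None
    cleaned = ENCLOSED_TRAILING_SPACES.sub(r"\1", path)
    if cleaned == path:
        # Nothing to strip, so there is nothing to redirect.
        return None
    return cleaned
```

 Under this approach a slug that genuinely ends in `%20/` would still be
 served, and only requests that would otherwise 404 get cleaned up. If
 trailing spaces were disallowed everywhere, the `known_paths` check could
 be dropped.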

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/52865>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

