[wp-trac] [WordPress Trac] #36384: Percent sign breaks the slugs (sanitize_title > remove_accents bug)

WordPress Trac noreply at wordpress.org
Thu Mar 31 14:35:49 UTC 2016


#36384: Percent sign breaks the slugs (sanitize_title > remove_accents bug)
----------------------------+-----------------------------
 Reporter:  barisunver      |      Owner:
     Type:  defect (bug)    |     Status:  new
 Priority:  normal          |  Milestone:  Awaiting Review
Component:  Editor          |    Version:  4.4.2
 Severity:  normal          |   Keywords:
  Focuses:  administration  |
----------------------------+-----------------------------
 Hi everyone. I just noticed that the `remove_accents()` function (which is
 used by `sanitize_title()`) sees a percent sign followed by digits and
 thinks "this looks URL-encoded, let me decode it".

 In the Turkish language (and apparently in Persian as well, according to
 Wikipedia) the percent sign ''precedes'' the numbers instead of following
 them. Combine this information with `remove_accents()`'s bug and this
 title:

 {{{
 My body is %50 muscle, %20 fat and %100 sexy
 }}}

 produces this slug:

 {{{
 my-body-is-P-muscle- -fat-and-0-sexy
 }}}

 Here's what's scary, though: the "`-and-0-sexy`" part contains a hidden
 control character (the byte decoded from `%10`), which breaks the post URL
 altogether:

 [[Image(http://i.imgur.com/3LfCq4c.png)]]

 (You can get the same results with any online URL decoder, by the way.)
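
 To reproduce the decoding outside WordPress, here is a minimal Python
 sketch. It is not the WordPress code; it only assumes that the slug
 pipeline treats the title the way a generic URL decoder does, here
 Python's standard `urllib.parse.unquote`.

```python
from urllib.parse import unquote

title = "My body is %50 muscle, %20 fat and %100 sexy"

# A generic URL decoder reads "%" plus two hex digits as one encoded byte:
#   %50 -> "P", %20 -> " ", %10 -> the control character U+0010
# The trailing "0" of "%100" is left over as a literal character.
decoded = unquote(title)

# repr() makes the otherwise invisible control character visible.
print(repr(decoded))
```

 The `P`, the extra space and the stray `0` match the pieces of the broken
 slug above.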

 As far as I could find, this issue has come up once before in Trac
 (#32462), but it was thought to be IIS-related and was never resolved. Now
 that we know it's `remove_accents()`'s fault, do you think we can fix it?

 I'm no expert on PHP, but I believe the `remove_accents()` function could
 just remove the percent characters (or replace them with `__('percent')`,
 but removing would make more sense) before dealing with all the other
 characters.
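
 As a rough illustration of the idea (again in Python, not the actual
 WordPress code), the sketch below compares a slug built with and without
 stripping the percent signs first. Both `naive_slug` and
 `slug_without_percent` are hypothetical helpers made up for this example;
 `naive_slug` only assumes that the pipeline URL-decodes the title before
 building the slug.

```python
import re
from urllib.parse import unquote


def naive_slug(title):
    """Hypothetical stand-in for a slug pipeline that URL-decodes its input."""
    decoded = unquote(title)
    # Lowercase, then collapse every run of non-alphanumerics into one dash.
    return re.sub(r"[^a-z0-9]+", "-", decoded.lower()).strip("-")


def slug_without_percent(title):
    """Proposed order: drop the percent signs before anything else runs."""
    return naive_slug(title.replace("%", ""))


title = "My body is %50 muscle, %20 fat and %100 sexy"

broken = naive_slug(title)            # the numbers are lost to decoding
fixed = slug_without_percent(title)   # the numbers survive

print(broken)
print(fixed)
```

 With the percent signs removed up front, there is nothing left for the
 decoding step to misread, so the numbers stay in the slug.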

 PS: Although I'm not sure it was the same problem, an issue with percent
 signs in slugs seems to have been fixed over 10 years ago (#569) and has
 since resurfaced. They solved it by removing the percent characters before
 dealing with all the other characters. (Old people know best.)

 Cheers,
 Barış Ünver

--
Ticket URL: <https://core.trac.wordpress.org/ticket/36384>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

