[wp-trac] [WordPress Trac] #55189: Automatic removal of "Zero-width non-joiner" in URL

WordPress Trac noreply at wordpress.org
Sat Mar 5 00:36:20 UTC 2022


#55189: Automatic removal of "Zero-width non-joiner" in URL
--------------------------+---------------------
 Reporter:  man4toman     |       Owner:  (none)
     Type:  defect (bug)  |      Status:  new
 Priority:  normal        |   Milestone:  5.9.2
Component:  Permalinks    |     Version:  5.9
 Severity:  critical      |  Resolution:
 Keywords:  2nd-opinion   |     Focuses:
--------------------------+---------------------

Comment (by ironprogrammer):

 #50924 focused on the unsightly encoding of ZWNJ in the sample paths (e.g.
 `/پیامک%e2%80%8cها/`). The update provided by #47912 addresses that, but
 does so in a heavy-handed way: it strips every ZWNJ out of the slug,
 regardless of context.
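
 To make the two behaviors concrete, here is a minimal Python sketch (not
 WordPress code, which is PHP). The helper names `encode_zwnj_only` and
 `strip_zwnj` are hypothetical and exist only for illustration; the sample
 slug is the one from the path above.

```python
from urllib.parse import quote

ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER, UTF-8 bytes E2 80 8C


def encode_zwnj_only(slug: str) -> str:
    """Percent-encode only the ZWNJ characters (hypothetical helper).

    This mimics the "unsightly" display form: letters stay readable,
    while each ZWNJ becomes a %e2%80%8c escape.
    """
    return slug.replace(ZWNJ, quote(ZWNJ).lower())


def strip_zwnj(slug: str) -> str:
    """Remove every ZWNJ regardless of context (hypothetical helper)."""
    return slug.replace(ZWNJ, "")


slug = "پیامک" + ZWNJ + "ها"

print(encode_zwnj_only(slug))  # ZWNJ shown as a percent-escape
print(strip_zwnj(slug))        # ZWNJ silently dropped
print(len(slug) - len(strip_zwnj(slug)))  # number of characters lost
```

 Stripping yields a tidier-looking path, but the result is a different
 string from the one the author typed, which is the concern raised here.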

 As @SergeyBiryukov noted, #4328 has the potential to address the
 SEO/redirect concern reported, though the age of that ticket isn't
 encouraging 😔. A simpler short-term fix could help impacted site owners.

 Also, apart from the [important] redirect concern, is there a concern
 about how this change to slugs may impact the "meaning" of paths in URLs?
 How important is it that the intended word(s) be maintained in the slug?
 [https://www.searchenginejournal.com/technical-seo/url-structure/ It's
 suggested that keywords in URLs these days don't impact SERPs], but that
 doesn't mean the words in paths should be inaccurate.

 ----

 For example, to expand on @man4toman's report, removing ZWNJ from slugs
 (or replacing it with `-`) can change the meaning of the words used*:

   "Word"
   کلمه‌ای

 Removal of ZWNJ:

   "Cabbages"
   کلمهای
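
 Because ZWNJ is invisible, the difference between the two words above is
 easy to miss. A small Python sketch (illustrative only; the function name
 `differs_only_by_zwnj` is hypothetical) lists the code points and checks
 whether two strings differ solely by ZWNJ:

```python
import unicodedata

ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER


def differs_only_by_zwnj(a: str, b: str) -> bool:
    """True when a and b are different strings that become equal
    once every ZWNJ is removed (hypothetical helper)."""
    return a != b and a.replace(ZWNJ, "") == b.replace(ZWNJ, "")


with_zwnj = "کلمه" + ZWNJ + "ای"
without_zwnj = with_zwnj.replace(ZWNJ, "")

# List each code point so the invisible character becomes visible.
for ch in with_zwnj:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

print(differs_only_by_zwnj(with_zwnj, without_zwnj))
```

 A check along these lines could flag slugs whose text no longer matches
 the title once ZWNJ has been dropped.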

 In a post:

 [[Image(https://cldup.com/0OmYevLiq0.thumb.jpg, 50%)]]
 ''Post title is changed to a different word in slug.''

 This can also impact categories and tags, for example:

 [[Image(https://cldup.com/5kjEmgVY0Q.png, 75%)]]
 ''Potentially misleading category slug.''

 ''(*My apologies to Persian speakers for any translation inaccuracies
 above 😅. I used DuckDuckGo's translator for the meanings to illustrate
 this point.)''

 ----

 Related: According to Iran's NIC,
 [https://www.nic.ir/idn/#Rules_and_Restrictions IRNIC], the use of ZWNJ is
 acceptable in Persian domain names.

 I couldn't find domain rules related to ZWNJ in Arabic, but I understand
 its use there is less common than in Persian.

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/55189#comment:5>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

