[wp-trac] [WordPress Trac] #18945: bad url character encoding in Arabic post names and categories
WordPress Trac
wp-trac at lists.automattic.com
Wed Nov 16 08:49:30 UTC 2011
#18945: bad url character encoding in Arabic post names and categories
-------------------------------+------------------------------
Reporter: walid3 | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: General | Version: 3.3
Severity: normal | Resolution:
Keywords: reporter-feedback |
-------------------------------+------------------------------
Comment (by SergeyBiryukov):
Replying to [comment:3 walid3]:
> well, the link have this strange signs almost everywhere
Percent-encoding non-ASCII characters as UTF-8 octets is part of RFC 3986:
> Non-ASCII characters must first be encoded according to UTF-8 [STD63],
> and then each octet of the corresponding UTF-8 sequence must be
> percent-encoded to be represented as URI characters.
http://tools.ietf.org/html/rfc3986#page-21
http://en.wikipedia.org/wiki/Percent-encoding#Current_standard
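As a rough illustration of the two steps quoted above, here is a minimal Python sketch. The slug is a hypothetical example, not taken from the reporter's site, and this is not how WordPress itself builds slugs; it only shows where the "strange signs" come from.

```python
from urllib.parse import quote

# Hypothetical Arabic slug, used only for illustration.
slug = "مرحبا"

# Step 1: encode the text as UTF-8 octets.
octets = slug.encode("utf-8")

# Step 2: percent-encode each octet as %XX.
manual = "".join(f"%{b:02X}" for b in octets)

# The standard library performs the same two steps for non-ASCII characters.
encoded = quote(slug, safe="")

print(manual)   # %D9%85%D8%B1%D8%AD%D8%A8%D8%A7
print(encoded)  # %D9%85%D8%B1%D8%AD%D8%A8%D8%A7
```

Each Arabic letter here takes two UTF-8 octets, so five letters become ten `%XX` escapes.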
The same applies to Cyrillic characters, for example, so I don't think
there is anything we can change here.
That said, most browsers decode the URLs to display them in a human-
readable form:
Firefox 8.0, Chrome 15, Opera 11.52, Safari 5.1 show unencoded URLs.
IE 8, IE 9 show encoded URLs.
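The display-side decoding can be sketched in Python as follows. The URL is hypothetical (`example.com` and the slug are placeholders), and real browsers apply additional rules before showing a decoded address, so treat this only as an approximation of the idea.

```python
from urllib.parse import unquote

# Hypothetical permalink as it is actually sent over the wire.
encoded_url = "http://example.com/%D9%85%D8%B1%D8%AD%D8%A8%D8%A7/"

# Decode the percent-escapes as UTF-8 for display only;
# the link itself still uses the encoded form.
display_url = unquote(encoded_url, encoding="utf-8")

print(display_url)
```

The encoded and decoded strings refer to the same resource; only the presentation differs.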
See #16496 for making `$sample_permalink_html` human-readable.
I've also checked comment feeds for posts with UTF-8 slugs, and they seem
to work correctly.
--
Ticket URL: <http://core.trac.wordpress.org/ticket/18945#comment:4>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software