[wp-trac] [WordPress Trac] #44386: Problem with utf8mb4_unicode_ci collation for arabic content

WordPress Trac noreply at wordpress.org
Sun Feb 1 13:08:41 UTC 2026


#44386: Problem with utf8mb4_unicode_ci collation for arabic content
-------------------------+------------------------------
 Reporter:  array064     |       Owner:  (none)
     Type:  enhancement  |      Status:  new
 Priority:  normal       |   Milestone:  Awaiting Review
Component:  Database     |     Version:  4.9.6
 Severity:  major        |  Resolution:
 Keywords:  2nd-opinion  |     Focuses:
-------------------------+------------------------------
Changes (by sajib1223):

 * keywords:  needs-patch => 2nd-opinion


Comment:

 I reproduced the reported issue with Arabic content under the
 `utf8mb4_unicode_ci` collation. The original reporter's assessment appears
 correct: this seems to be MySQL collation behavior rather than a
 WordPress core bug.

 The issue occurs at the MySQL level, when Arabic text is compared or
 sorted with `utf8mb4_unicode_ci`. Switching to `utf8mb4_general_ci` works
 around it, though that may not be the ideal solution, as
 `utf8mb4_unicode_ci` is generally recommended for non-Latin scripts
 according to MySQL documentation.
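
 For anyone weighing in, here is a rough Python sketch of the kind of
 comparison I believe is involved. It is an approximation, not MySQL's
 actual weight tables: it assumes the reported behavior comes from
 `utf8mb4_unicode_ci` ignoring combining marks (such as Arabic harakat)
 when comparing strings. The helper name `primary_key` and the sample
 words are made up for illustration.

```python
import unicodedata


def primary_key(text: str) -> str:
    """Hypothetical stand-in for an accent- and case-insensitive sort key.

    Decomposes the text (NFD), drops every combining mark, and case-folds
    the rest. This only mimics the general idea of a collation that ignores
    diacritics; it does not reproduce MySQL's collation weights.
    """
    decomposed = unicodedata.normalize("NFD", text)
    base_chars = (ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(base_chars).casefold()


# Sample data (illustrative only): the same three letters, without and
# with fatha marks (U+064E) after each letter.
plain = "\u0643\u062A\u0628"
vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"

# A byte-wise or code-point comparison sees two different strings...
print(plain == vocalized)
# ...while the diacritic-insensitive keys are identical.
print(primary_key(plain) == primary_key(vocalized))
```

 If that assumption holds, two values that differ only in diacritics would
 be treated as equal in `WHERE` clauses and unique indexes under such a
 collation, which is a property of the collation and not of WordPress.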

 Before proceeding with any patch, we need guidance from component
 maintainers on:
 1. Whether this is something WordPress core should address
 2. Whether WordPress can or should work around MySQL collation behavior
 3. Whether this should be documented as a known limitation

 Removing `needs-patch` and adding `2nd-opinion` to get input from database
 component maintainers on the appropriate path forward.

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/44386#comment:4>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

