[wp-trac] [WordPress Trac] #44386: Problem with utf8mb4_unicode_ci collation for arabic content
WordPress Trac
noreply at wordpress.org
Sun Jun 17 06:52:44 UTC 2018
#44386: Problem with utf8mb4_unicode_ci collation for arabic content
-------------------------+-----------------------------
Reporter: array064 | Owner: (none)
Type: enhancement | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Database | Version: 4.9.6
Severity: major | Keywords: needs-testing
Focuses: |
-------------------------+-----------------------------
As far as I can tell, since version 4.6 WordPress uses utf8mb4_unicode_ci
as the default collation. This is set in the determine_charset function in
the /wp-includes/wp-db.php file (CMIIW).
In my experience, utf8mb4_unicode_ci has problems with content that uses
Arabic letters.
Example:
I created a tag with the name:
{{{
#!span style="font-size: 28pt"
ٱللَّهِ
}}}
And I created another tag with the name:
{{{
#!span style="font-size: 28pt"
ٱللَّهُ
}}}
Then, when I search for tags in wp-admin with the keyword:
{{{
#!span style="font-size: 28pt"
ٱللَّهُ
}}}
the search results that appear are:
{{{
#!span style="font-size: 28pt"
ٱللَّهِ
}}}
and
{{{
#!span style="font-size: 28pt"
ٱللَّهُ
}}}
tags, whereas the results should contain only the tag:
{{{
#!span style="font-size: 28pt"
ٱللَّهُ
}}}
according to the search keyword.
This becomes a problem when a post needs to use the tag
{{{
#!span style="font-size: 28pt"
ٱللَّهُ
}}}
, but cannot because of the existing tag
{{{
#!span style="font-size: 28pt"
ٱللَّهِ
}}}
My guess is that this is not a WordPress bug but a MySQL bug.
For reference, this MySQL bug report may be related:
[https://bugs.mysql.com/bug.php?id=76218]
(CMIIW).
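
A minimal sketch of why the two tag names may be treated as equal: they
appear to differ only in the final vowel mark (kasra vs. damma), and a
collation that ignores combining marks would see the same base letters in
both. The code points below are an assumption about how the tags were
typed, and strip_marks / equal_ignoring_marks are hypothetical helpers that
only approximate a mark-insensitive comparison; this is not MySQL's actual
collation algorithm.

```python
import unicodedata

# Assumed code points for the two tag names in this ticket; the exact
# sequence stored in the database may differ (e.g. shadda before fatha).
TAG_KASRA = "\u0671\u0644\u0644\u064E\u0651\u0647\u0650"  # ends with kasra (U+0650)
TAG_DAMMA = "\u0671\u0644\u0644\u064E\u0651\u0647\u064F"  # ends with damma (U+064F)


def strip_marks(text: str) -> str:
    """Drop nonspacing combining marks (category Mn), keeping base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def equal_ignoring_marks(a: str, b: str) -> bool:
    """Rough stand-in for a mark-insensitive comparison (an approximation only)."""
    return strip_marks(a) == strip_marks(b)


if __name__ == "__main__":
    print(TAG_KASRA == TAG_DAMMA)                      # exact comparison
    print(equal_ignoring_marks(TAG_KASRA, TAG_DAMMA))  # mark-insensitive comparison
```

The exact comparison prints False and the mark-insensitive one prints True,
which matches the search behaviour described above: both tags are returned
for a keyword that only one of them matches exactly.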
--
Ticket URL: <https://core.trac.wordpress.org/ticket/44386>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform