[wp-trac] [WordPress Trac] #48285: wp-config-sample.php should default to `utf8mb4` instead of `utf8` character set
WordPress Trac
noreply at wordpress.org
Tue Sep 27 22:57:02 UTC 2022
#48285: wp-config-sample.php should default to `utf8mb4` instead of `utf8`
character set
-------------------------+------------------------------
Reporter: bchecketts | Owner: (none)
Type: enhancement | Status: assigned
Priority: normal | Milestone: Awaiting Review
Component: Database | Version: 5.3
Severity: minor | Resolution:
Keywords: has-patch | Focuses:
-------------------------+------------------------------
Comment (by SergeyBiryukov):
Replying to [comment:3 JavierCasares]:
> Right now, with the latest MySQL 8.0 and MariaDB 10.6 versions, there is
> no "utf8" because they changed it to "utf8mb3".
Thanks! This should now be addressed in #53623.
> If we want to support all the international language charset, WordPress
> should support by default "utf8mb4" (supported by all WordPress-SQL
> supported databases).
As noted in comment:1 and comment:2, WordPress does automatically upgrade
to `utf8mb4` when possible.
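
For illustration only, the automatic upgrade can be pictured roughly as follows. This is a minimal Python sketch, not the actual WordPress code: `parse_version` and `effective_charset` are hypothetical names, and the 5.5.3 cutoff is an assumption about when `utf8mb4` first shipped (this thread only says MySQL 5.5).

```python
# Hypothetical sketch of "upgrade utf8 to utf8mb4 when possible".
# Assumption: utf8mb4 is available from MySQL 5.5.3 onward.
UTF8MB4_MIN_VERSION = (5, 5, 3)


def parse_version(version):
    """Turn a string like '5.5.62-log' into a tuple like (5, 5, 62)."""
    numeric = version.split("-", 1)[0]  # drop suffixes such as '-log'
    return tuple(int(part) for part in numeric.split(".")[:3])


def effective_charset(configured, server_version):
    """Return the charset to use for the connection.

    Only a configured 'utf8' is upgraded; any other value is left alone.
    """
    if configured == "utf8" and parse_version(server_version) >= UTF8MB4_MIN_VERSION:
        return "utf8mb4"
    return configured
```

Under these assumptions, a site configured with `utf8` stays on `utf8` on MySQL 5.0 and moves to `utf8mb4` on any server new enough to support it, which is why the sample config does not have to change for that to happen.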
Replying to [comment:8 JavierCasares]:
> Based on this, yes, new versions of WordPress may have utf8mb4 by
> default.
I might be missing something, but as noted in comment:7, WordPress still
has MySQL 5.0 as a minimum requirement at this time, which did not include
`utf8mb4`. So it looks like until the minimum version is bumped to MySQL
5.5, it is neither safe nor required to change the default charset in
`wp-config-sample.php`.
On a related note, reading the MariaDB ticket
[https://jira.mariadb.org/browse/MDEV-8334 MDEV-8334 Rename utf8 to
utf8mb3]:
> In long terms we want the name `utf8` mean the full featured UTF-8.
> We'll do a few preparatory steps:
>
> 1. Change the main name of the 3-byte character set from `utf8` to
> `utf8mb3` and make `utf8` alias for `utf8mb3`. This will change all `SHOW`
> and `INFORMATION_SCHEMA` output to display `utf8mb3` instead of `utf8`, as
> well as change `mysqldump` to dump `utf8mb3` instead of just `utf8`.
> 2. Add a new server option, say `--utf8-is-utf8mb3`, which will be
> `true` by default, but the DBA will be able to change it to false and thus
> make `utf8` mean `utf8mb4`.
> 3. A few releases later we'll change `--utf8-is-utf8mb3` to be `false`
> by default.
>
> Or
>
> 2. Do not add any new server options and
> 3. Add a new `old_mode` value for reverting `utf8` to `utf8mb3` when the
> default will mean `utf8mb4`.
The latter appears to be
[https://mariadb.com/kb/en/mariadb-1061-release-notes/#character-sets implemented in MariaDB 10.6.1].
Also reading the MySQL note on
[https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8mb3.html The utf8mb3 Character Set (3-Byte UTF-8 Unicode Encoding)]:
> Historically, MySQL has used `utf8` as an alias for `utf8mb3`; beginning
> with MySQL 8.0.28, `utf8mb3` is used exclusively in the output of `SHOW`
> statements and in Information Schema tables when this character set is
> meant.
>
> At some point in the future `utf8` is expected to become a reference to
> `utf8mb4`. To avoid ambiguity about the meaning of `utf8`, consider
> specifying `utf8mb4` explicitly for character set references instead of
> `utf8`.
>
> You should also be aware that the `utf8mb3` character set is deprecated
> and you should expect it to be removed in a future MySQL release. Please
> use `utf8mb4` instead.
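
One practical consequence of the rename is that a server may now report `utf8mb3` where a config still says `utf8`, so a plain string comparison of charset names can produce a false mismatch. A rough Python sketch of an alias-aware comparison follows; `CHARSET_ALIASES`, `canonical_charset`, and `same_charset` are hypothetical names, and the alias table encodes only today's meaning of `utf8`, so it would need updating if `utf8` is later repointed at `utf8mb4`.

```python
# Hypothetical helper; not part of WordPress, MySQL, or MariaDB.
# Assumption: `utf8` currently means `utf8mb3`, per the notes quoted above.
CHARSET_ALIASES = {"utf8": "utf8mb3"}


def canonical_charset(name):
    """Lower-case a charset name and resolve known aliases."""
    name = name.strip().lower()
    return CHARSET_ALIASES.get(name, name)


def same_charset(configured, reported):
    """Compare a configured charset with one reported by the server."""
    return canonical_charset(configured) == canonical_charset(reported)
```

With this, `same_charset("utf8", "utf8mb3")` is true while `same_charset("utf8", "utf8mb4")` is false, which matches the current alias but not the planned future one.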
If the long-term goal of both projects is to make `utf8` an alias for
`utf8mb4` as mentioned above, the default charset in
`wp-config-sample.php` may not technically need any changes at all, though
it still might be a good idea to explicitly change it to `utf8mb4` when the
minimum version is bumped to MySQL 5.5.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/48285#comment:10>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform