[wp-trac] [WordPress Trac] #21212: MySQL tables should use utf8mb4 character set

WordPress Trac noreply at wordpress.org
Mon Jan 12 10:22:04 UTC 2015


#21212: MySQL tables should use utf8mb4 character set
----------------------------+-----------------------
 Reporter:  pento           |       Owner:
     Type:  task (blessed)  |      Status:  reopened
 Priority:  normal          |   Milestone:  4.2
Component:  Database        |     Version:  3.4.1
 Severity:  normal          |  Resolution:
 Keywords:                  |     Focuses:
----------------------------+-----------------------

Comment (by pento):

 In [attachment:21212.2.diff], wpdb now converts emoji into HTML entities,
 but only for `post_content`, and only when `post_content` is `utf8`.

 This expands emoji support to 99%+ of WordPress sites.
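
 As a rough illustration of the conversion described above (not the code in
 the attached patch), the sketch below replaces every character outside the
 Basic Multilingual Plane with a hexadecimal HTML entity. Those are the
 characters that need four bytes in UTF-8 and so cannot be stored in a MySQL
 `utf8` column. The function name and the choice of hexadecimal entities are
 assumptions made for this example; the patch may match emoji differently.

```python
def encode_supplementary_as_entities(text: str) -> str:
    """Replace characters above U+FFFF with HTML numeric character references.

    Hypothetical helper for illustration only; it is not taken from wpdb.
    """
    out = []
    for ch in text:
        code_point = ord(ch)
        if code_point > 0xFFFF:
            # Four bytes in UTF-8: MySQL's three-byte `utf8` cannot hold it,
            # so store an ASCII-only entity instead.
            out.append(f"&#x{code_point:x};")
        else:
            # Characters up to U+FFFF fit in `utf8` and pass through unchanged.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(encode_supplementary_as_entities("Hello \U0001F600"))
```

 Text that contains only three-byte-or-shorter characters comes back
 unchanged, so running this over existing `utf8` content should be a no-op.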

 The good news is that I don't think we need to worry about converting
 back to proper characters if the site later updates to `utf8mb4`. The
 HTML entities will continue to render correctly, and will automatically
 be updated to the proper character if the post is ever edited.
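
 A small sketch of why stored entities keep working: decoding a numeric
 entity yields exactly the character it stands for. Here Python's
 `html.unescape` is used as a stand-in for the entity decoding a browser
 performs; the function name and sample string are made up for this
 example, and the editor's behaviour on save is not modelled.

```python
import html


def render_entities(stored: str) -> str:
    """Decode HTML entities the way a browser would when displaying content.

    Hypothetical helper; html.unescape stands in for the browser here.
    """
    return html.unescape(stored)


if __name__ == "__main__":
    # Content as it might sit in a `utf8` column after conversion.
    stored = "Launch day &#x1f680;"
    rendered = render_entities(stored)
    # The decoded character is a four-byte UTF-8 sequence that a
    # `utf8mb4` column could store directly.
    print(rendered, len(rendered[-1].encode("utf-8")))
```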

 Still to do:

 * Think about what other columns we want to allow emoji in, and whitelist
 them.
 * Make sure that we're only running the `ALTER TABLE` queries on `utf8`
 tables. `utf8` is the only character set we can guarantee is an unaltered
 subset of `utf8mb4`.
 * Test that the `ALTER TABLE` works correctly for all indexes, with a wide
 range of data.
 * Probably other things.
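
 The `utf8`-only restriction in the list above can be sketched as follows.
 This is a minimal example under stated assumptions, not the upgrade
 routine from the patch: the function names, the `wp_legacy` table, and the
 default collation are hypothetical, and the input is assumed to be a
 mapping of table name to current character set. The statement uses
 MySQL's `ALTER TABLE ... CONVERT TO CHARACTER SET` form. The second helper
 covers the arithmetic behind the index concern, assuming the 767-byte
 index key prefix limit of older InnoDB row formats.

```python
def build_utf8mb4_upgrades(table_charsets, collation="utf8mb4_unicode_ci"):
    """Return ALTER TABLE statements for tables currently using `utf8`.

    Hypothetical helper: tables in any other character set are skipped,
    because only `utf8` is known to be an unaltered subset of `utf8mb4`.
    """
    statements = []
    for table, charset in sorted(table_charsets.items()):
        if charset.lower() != "utf8":
            continue  # leave latin1, utf8mb4, etc. untouched
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET utf8mb4 "
            f"COLLATE {collation}"
        )
    return statements


def max_indexable_chars(limit_bytes=767, bytes_per_char=4):
    """Number of whole characters that fit in an index key prefix."""
    return limit_bytes // bytes_per_char


if __name__ == "__main__":
    for sql in build_utf8mb4_upgrades({"wp_posts": "utf8", "wp_legacy": "latin1"}):
        print(sql)
    # Three bytes per character versus four bytes per character.
    print(max_indexable_chars(bytes_per_char=3), max_indexable_chars())
```

 Under that assumed limit, an index over a full 255-character column fits
 at three bytes per character but not at four, which is one reason the
 index testing is worth doing with a wide range of data.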

 > So, should we not alter numeric/datetime columns?

 No need to worry. The `ALTER TABLE` only affects text-based columns; it
 won't touch numeric/datetime columns.

--
Ticket URL: <https://core.trac.wordpress.org/ticket/21212#comment:61>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

